CIS 120 OCaml Style Guide

One of the goals of this course is to teach you how to program elegantly. Well-written code is easy to understand, replicate, and debug.

Just as there are a variety of spoken and written languages, every programming language has a unique syntax with grammatical rules and a desired style. Some elements of style - like giving variables descriptive names - are the same for every language.

Listed below are CIS 120’s style guidelines for OCaml. Some are general programming style guidelines and some are specific to OCaml. Your homework assignments will often include both autograded and manually graded style points. Writing good code takes practice, so take the time to think about how to improve your code before you submit each homework assignment.


File Submission Requirements:

Commenting:

Naming and Declarations:

Indentation:

Pattern Matching:

Verbosity:

Spaces around Operators:


Code

80 character limit

No line of code should have more than 80 characters or columns.
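
As a rough sketch of how a long line might be wrapped to stay within the limit (the function name and output format below are hypothetical, chosen only for illustration), one option is to break before each operator:

```ocaml
(* Hypothetical example: written on one line, this expression would run
   past 80 columns, so it is split before each ^ operator. *)
let describe_rectangle (width : int) (height : int) : string =
  "width = " ^ string_of_int width
  ^ ", height = " ^ string_of_int height
  ^ ", area = " ^ string_of_int (width * height)
```

Binding parts of the expression to names with let ... in is another way to shorten a line.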

Use line breaks

Use line breaks to separate different types of declarations – such as global structures, types, exceptions, and values. Use line breaks between all function declarations. Avoid line breaks within a let block. There should never be an empty line within an expression.

For example:

    type weekend_day =
        | Friday
        | Saturday
        | Sunday

    (* line break here to separate the type definition from the values *)

    (* no line breaks needed between two variable declarations *)
    let may_22_2020 : weekend_day = Friday
    let may_23_2020 : weekend_day = Saturday

    (* line break here to separate the function from the rest of the code *)

    let tomorrow (today : weekend_day) : weekend_day =
        begin match today with
            | Friday -> Saturday
            | Saturday -> Sunday
            | Sunday -> failwith "rip"
        end

Delete commented-out code

Delete all commented-out code from your final submission. Commented-out code can be a useful reference when you are working on a project, but your final deliverable should only include runnable code.

Comments

Comment above referenced code

Write your comments above the code you want to comment on.

For example:

    (* Sums a list of integers. *)
    let sum = List.fold_left (+) 0

Comment when appropriate

Assume the reader is a student at your skill level. Write clear, concise comments only where they would benefit the reader.

Avoid Over-commenting

Very long comments are rarely useful. Long comments should only appear at the top of a file, where you should explain the overall design of the code and reference any sources that have more information about the algorithms or data structures. All other comments in the file should be as short as possible; after all, brevity is the soul of wit. Most often the best place for a comment is just before a function declaration. You should rarely need to comment within a function, because good variable names should be enough.

Proper Multi-line Commenting

When comments are printed on paper, the reader lacks the advantage of color highlighting performed by an editor such as Emacs. This makes it important for you to distinguish comments from code. When a comment extends beyond one line, each continuation line should begin with a *, as in the following:

    (* This is one of those rare but long comments
     * that need to span multiple lines because
     * the code is unusually complex and requires
     * extra explanation. *)
    let complicated_function () = ...

Naming and Declarations

Write descriptive variable names

Variable names should be descriptive letters, syllables, words, or combinations of the three.

For example, any of the following would be descriptive variable names:

    (* Representing Monday as the first day of the week. *)
    let friday = 5
    let fri = 5
    let f = 5
    let sa = 6
    let su = 7

Below is an example of bad variable naming:

    let x = y :: z

Here z is a list and y is a single element, but the names give no hint of that. Reading y :: z, it looks as if two values of the same type are being combined, which is misleading.

Use variable names consistently

Use consistent names for similar functions, for their arguments, and for the variables bound in their pattern-match cases.

For example, the following code snippet ignores this guideline and uses inconsistent naming for two functions, their arguments, and their pattern-match cases:

    let rec add_one (x : int list) =
        begin match x with
            | [] -> []
            | hd :: tl -> (hd + 1) :: add_one tl
        end

    let rec add_two_to_list (l : int list) =
        begin match l with
            | [] -> []
            | x :: y -> (x + 2) :: add_two_to_list y
        end
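
For comparison, here is a sketch of the same two functions with a single naming scheme. It assumes both are meant to update every element of the list, and the specific names are just one reasonable choice:

```ocaml
(* Both functions share a naming scheme: add_<n>_to_list for the
   function, l for the list argument, and hd / tl in the cons case. *)
let rec add_one_to_list (l : int list) : int list =
  begin match l with
    | [] -> []
    | hd :: tl -> (hd + 1) :: add_one_to_list tl
  end

let rec add_two_to_list (l : int list) : int list =
  begin match l with
    | [] -> []
    | hd :: tl -> (hd + 2) :: add_two_to_list tl
  end
```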

Naming conventions

Use the following naming conventions for each corresponding type of data:

Variables, functions, and user-defined types – all lower case with underscores for multi-word names (snake case):

    let eight : int = 8
    let add_one (x : int) : int = x + 1
    type boolean =
        | True
        | False

Constructors – initial upper case with embedded caps for multi-word names:

    type boolean_tuple =
        | TrueFalse
        | TrueTrue
        | FalseFalse
        | FalseTrue

Modules and module types – initial upper case with embedded caps for multi-word names:

    PriorityQueue

Type annotations

Include type annotations when they will improve readability, especially when you are writing complex or potentially ambiguous functions and values. Type annotations also help the compiler understand the desired behavior of the program and produce more useful warning messages.

For example:

    (* No type annotations. *)
    let add_one x = x + 1

    (* Type annotations. *)
    let add_one (x : int) : int = x + 1

Write descriptive test names

Write meaningful test names that clearly and concisely explain what the test is testing.
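
The sketch below contrasts a vague test name with descriptive ones. The run_test helper here is a hypothetical stand-in written for this example, not necessarily the testing function your assignments provide:

```ocaml
(* Hypothetical stand-in for a course testing helper: runs a named test
   and prints the name if the test fails. *)
let run_test (name : string) (test : unit -> bool) : unit =
  if not (test ()) then print_endline ("Test failed: " ^ name)

let rec sum (l : int list) : int =
  begin match l with
    | [] -> 0
    | hd :: tl -> hd + sum tl
  end

(* Vague: the name says nothing about the case being checked. *)
let () = run_test "test1" (fun () -> sum [] = 0)

(* Descriptive: the name states the function, the input, and the result. *)
let () = run_test "sum of empty list is 0" (fun () -> sum [] = 0)
let () = run_test "sum of [1; 2; 3] is 6" (fun () -> sum [1; 2; 3] = 6)
```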

Indenting

Use Consistent Spacing

Choose a spacing convention and use it consistently throughout your program. For example, either use a+b or a + b, but do not randomly switch between the two styles.

Indent two spaces at a time – no tabs

When you need to indent a newline, do so by two spaces more than the preceding line of code. Do not use tab characters.

If you use the Emacs package from the OCaml website, it avoids inserting tab characters (except when you paste text from the clipboard or kill ring). When in ml-mode, Emacs uses the TAB key to control indentation instead of inserting a tab character.

Indent Pattern-Match Expressions

Indent the contents of a pattern-match block. This helps the reader clearly identify the start and end of the pattern-matching block.

    let get_head (l : int list) : int =
      begin match l with
        | [] -> failwith "empty list"
        | hd :: _ -> hd
      end

Indent If-Else Expressions

Use one of the following indentation patterns for if statements:

    if exp1 then exp2 else exp3

    if exp1 then exp2
    else exp3

    if exp1 then
      exp2
    else exp3

Use the following pattern for if-else-if statements:

    if exp1 then exp2
    else if exp3 then exp4
    else if exp5 then exp6
         else exp8

Use the following pattern for nested if statements:

    if exp1 then
      if exp2 then exp3 else exp4
    else exp5

Indent comments level to their code

Indent comments to the level of the line of code that follows the comment.
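
A small illustration, using a hypothetical function made up for this example: the inner comment is indented to the same depth as the let expression it describes.

```ocaml
(* Counts the strictly positive integers in a list. *)
let rec count_positives (l : int list) : int =
  begin match l with
    | [] -> 0
    | hd :: tl ->
      (* Indented to match the let expression that follows. *)
      let rest = count_positives tl in
      if hd > 0 then 1 + rest else rest
  end
```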

Don't Over-parenthesize

Only use parentheses when you need them to change the behavior of your code or when they are necessary for readability.
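
The following sketch shows the difference. The function names are hypothetical and exist only for this comparison:

```ocaml
(* Over-parenthesized: most of these parentheses change nothing. *)
let average_verbose (x : int) (y : int) : int = (((x) + (y)) / (2))

(* Keeps only the parentheses needed to add before dividing. *)
let average (x : int) (y : int) : int = (x + y) / 2

(* Needed: succ succ n would try to apply succ to succ itself. *)
let twice_succ (n : int) : int = succ (succ n)
```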

Pattern Matching

Complete Pattern-Matching

Always match on every possible match case, even if you’re confident a case won’t occur.

For example, even if a user would never call the following head function on an empty list, you should still write the match case for an empty list:

    let head (l : int list) : int =
        begin match l with
            | [] -> failwith "empty list"
            | hd :: _ -> hd
        end

Pattern Match in the Function Arguments When Possible

Tuples, records and datatypes can be deconstructed using pattern matching. If you simply deconstruct the function argument before you do anything useful, it is better to pattern match in the function argument. Consider these examples:

    (* Bad *)
    let f arg1 arg2 =
      let x = fst arg1 in
      let y = snd arg1 in
      let z = fst arg2 in
      ...

    (* Good *)
    let f (x,y) (z,_) = ...

    (* Bad *)
    let f arg1 =
      let x = arg1.foo in
      let y = arg1.bar in
      let baz = arg1.baz in
      ...

    (* Good *)
    let f {foo = x; bar = y; baz} = ...
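
Here is a small runnable sketch of the same idea. The point type and both functions are hypothetical, introduced only for illustration. Note that OCaml record patterns separate fields with semicolons.

```ocaml
(* Hypothetical record type used only for this example. *)
type point = { x : int; y : int }

(* Deconstructs both record arguments directly in the parameter list. *)
let add_points { x = x1; y = y1 } { x = x2; y = y2 } : point =
  { x = x1 + x2; y = y1 + y2 }

(* Deconstructs a tuple argument in the parameter list. *)
let swap ((a, b) : int * string) : string * int = (b, a)
```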

Pattern-Match Efficiently

Do not write four cases when two will do.

For example, the following can be simplified by deleting all but the first and last cases.

    type 'a tree =
        | Empty
        | Node of 'a tree * 'a * 'a tree

    let rec sum (t : int tree) : int =
        begin match t with
            | Empty -> 0
            | Node (Empty, n, Empty) -> n
            | Node (Empty, n, rt) -> n + sum rt
            | Node (lt, n, Empty) -> sum lt + n
            | Node (lt, n, rt) -> sum lt + n + sum rt
        end
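
A sketch of the simplified version, assuming the tree holds integers and sum is meant to be recursive:

```ocaml
type 'a tree =
  | Empty
  | Node of 'a tree * 'a * 'a tree

(* The Node case already covers empty subtrees, because sum Empty = 0. *)
let rec sum (t : int tree) : int =
  begin match t with
    | Empty -> 0
    | Node (lt, n, rt) -> sum lt + n + sum rt
  end
```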

Verbosity

Don't Rewrite Existing Code

In our homework assignments, we will ask you to write functions whose functionality you may have already partially implemented. Whenever you write a useful function, you are welcome to use it as a helper function for other functions!
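
As an illustration (the functions below are hypothetical, not taken from any assignment), a later function can call earlier ones instead of repeating their recursion:

```ocaml
(* Two functions assumed to have been written earlier. *)
let rec length (l : int list) : int =
  begin match l with
    | [] -> 0
    | _ :: tl -> 1 + length tl
  end

let rec sum (l : int list) : int =
  begin match l with
    | [] -> 0
    | hd :: tl -> hd + sum tl
  end

(* Reuses sum and length rather than traversing the list by hand. *)
let average (l : int list) : int =
  begin match l with
    | [] -> failwith "average: empty list"
    | _ -> sum l / length l
  end
```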

Simplify Boolean Expressions in If-statements

If an if-statement evaluates to a boolean, you should use boolean expressions instead of the if-statement.

For example, y and z will always evaluate to the same value, but z is more concise and uses better style:

    let x = true
    let y = if x then false else true
    let z = not x

Misusing Pattern-Match Expressions

The match expression is misused in two common situations:

First, match should never be used in place of an if expression (that’s why if exists). Note the following:

    begin match e with
    | true -> x
    | false -> y
    end

    if e then x else y

The latter expression is much better, because it’s cleaner to read and easier to understand.

Another situation where if expressions are preferred over match expressions is as follows:

    begin match e with
    | c -> x   (* c is a constant value *)
    | _ -> y
    end

    if e = c then x else y

The latter expression is definitely better.

The second misuse is using match when pattern matching in a let declaration is enough. Consider the following:

    let x = match expr with (y,z) -> y

    let x,_ = expr

The latter is considered better, again because it is easier to read.

Other Common Misuses

Here are some other common mistakes to watch out for:

    Bad                                 Good
    l::[]                               [l]
    length + 0                          length
    length * 1                          length
    big exp * same big exp              let x = big exp in x * x
    if x then f a b c1 else f a b c2    f a b (if x then c1 else c2)
    if a then true else false           a

Don't Rewrap Functions

When passing a function around as an argument to another function, don’t rewrap the function if it already does what you want it to. Here’s an example:

    List.map (fun x -> sqrt x) [1.0; 4.0; 9.0; 16.0]

    List.map sqrt [1.0; 4.0; 9.0; 16.0]

The latter is better. Rewrapping also often happens with infix binary operators. To pass a binary operator as a function without rewrapping it, enclose the operator in parentheses.

Consider this example:

    List.fold_left (fun x y -> x + y) 0

    List.fold_left (+) 0

The latter is considered better style.

Avoid Repeatedly Computing the Same Values

The best way to avoid computing one value multiple times is to create a let expression and bind the computed value to a variable name.

This has the benefit of saving some CPU time and making your program look cleaner, and allows you to document the purpose of the value with a variable name – which means less commenting.
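
A minimal sketch of this guideline, where fib is a hypothetical stand-in for any expensive computation:

```ocaml
(* Hypothetical helper standing in for an expensive computation. *)
let rec fib (n : int) : int =
  if n < 2 then n else fib (n - 1) + fib (n - 2)

(* Computes fib n twice. *)
let fib_squared_slow (n : int) : int = fib n * fib n

(* Computes fib n once and names the result. *)
let fib_squared (n : int) : int =
  let fib_n = fib n in
  fib_n * fib_n
```

Both versions return the same result, but the second calls fib only once, and the name fib_n documents what the value is.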

Break Up Complex Expressions

Complex expressions should be broken up by assigning sub-expressions using let ... in.
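
For example, a distance computation (a hypothetical function chosen for illustration) can be written as one dense expression or split into named parts:

```ocaml
(* Hard to read: everything in one expression. *)
let distance_dense (x1 : float) (y1 : float) (x2 : float) (y2 : float)
  : float =
  sqrt ((x2 -. x1) *. (x2 -. x1) +. (y2 -. y1) *. (y2 -. y1))

(* Easier to read: each sub-expression gets a name. *)
let distance (x1 : float) (y1 : float) (x2 : float) (y2 : float) : float =
  let dx = x2 -. x1 in
  let dy = y2 -. y1 in
  sqrt (dx *. dx +. dy *. dy)
```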

Spacing

Consistent spacing around the cons operator

Choose one of the following two spacing conventions for the cons operator. Apply your chosen convention consistently in pattern-match cases and in building new lists.

    (* 1:  Spaces on either side of the cons operator. *)
    let place_four_in_second_place_1 (l : int list) : int list =
        begin match l with
            | [] -> []
            | hd :: tl -> hd :: 4 :: tl
        end

    (* 2:  No spaces on either side of the cons operator. *)
    let place_four_in_second_place_2 (l : int list) : int list =
        begin match l with
            | [] -> []
            | hd::tl -> hd::4::tl
        end

Consistent spacing around list elements

Choose one of the following two spacing conventions for list elements. Apply your chosen convention consistently across your code.

    (* 1:  Spaces on either side of list semicolons. *)
    let my_list_1 : int list = [5 ; 6 ; 7]

    (* 2:  Spaces to the right of list semicolons. *)
    let my_list_2 : int list = [5; 6; 7]

Spaces around boolean, arithmetic, and string operators

All boolean, arithmetic, and string operators should have spaces on either side of them.

    (* Spaces on either side of boolean operators. *)
    let my_boolean : bool = (true && false) || true

    (* Spaces on either side of arithmetic operators. *)
    let my_int : int = (3 + 1) - (2 * 7) / 16

    (* Spaces on either side of string operators. *)
    let my_string : string = "Hello," ^ " World!"

Acknowledgement: Much of this style guide is adapted from CS 312 at Cornell University.

Although the above may seem daunting, many of the guidelines are common sense and all of them make your code more readable. In the software engineering industry, some companies go so far as to dictate exactly where spaces can go. Rejoice that you do not have to learn Hungarian notation!