Records

Records types are collections of values similar to tuples, with the difference that the elements of records are named and called fields. Records are similar to struct types in C, although records are immutable.

A record type can be declared as follows:

type account = {
    first : string;
    last : string;
    birth : (int*int*int);
    id : int
  }

A record then can be created as follows (the order of fields does not matter):

let bob = {first = "Bob"; last = "Bobson"; birth = (1990, 11, 25); id = 117}

The fields of an existing record can be accessed with dot syntax:

bob.first   -->  "Bob"
bob.last    -->  "Bobson"
bob.birth   -->  (1990, 11, 25)
bob.id      -->  117

Functional update

There is a notation for creating a new record from an existing record:

let alice = {bob with first = "Alice"; id = 118}

We replaced first and id fields of bob with new values, obtaining a new record for Alice Bobson:

# let alice = {bob with first = "Alice"; id = 118} ;;
val alice : account =
  {first = "Alice"; last = "Bobson"; birth = (1990, 11, 25); id = 118}

Since, records are immutable, the old record for bob remained intact, we just copied some of its data to create a new record alice.

Pattern matching records

let get_name account =
  account.first ^ " " ^ account.last

The same functionality can be implemented if we pattern-match the record argument:

let get_name {first = fn; last = ln; birth = _ ; id = _ } =
  fn ^ " " ^ ln

or even

let get_name {first = fn; last = ln; _ } =
  fn ^ " " ^ ln

Field punning

If variables have the same name as the fields, there is a shortcut notation called "field punning" for record creating:

let make_bob () =
  let first = "Bob" in
  let last = "Bobson" in
  let birth = (1990, 11, 25) in
  let id = 117 in
  {first; last; birth; id}

Without punning, one would have to write {first = first; last = last; birth = birth; id = id} in the last line.

Similarly, filed punning also works in pattern matching:

let get_name {first; last; _ } =
  first ^ " " ^ last

Variables first and last were automatically bound to the corresponding fields of the pattern-matched record.

Field name collision

If two record types share some of the field names:

type point2d = {x : float; y : float}
type point3d = {x : float; y : float; z : float}

the type inference algorithm can be thrown off by this name collision, and you may get an unexpected result or an error, when using such types with colliding field names.

To resolve it, it is recommended to define records inside separate modules.

Modules

Most programming languages support modular programming, when code is divided into separate logical units called modules. A module consists of its interface and implementation.

In C++, classes often serve the purpose of modules, with the class declaration usually kept in a header file (interface), separately from its implementation in the corresponding cpp file.

To build large programs out of modules effectively, we need to be able to write modules that we can convince ourselves are correct in isolation from the rest of the program. Rather than have to think about every other part of the program when developing a code module, we need to be able to use local reasoning: that is, reasoning about just the module and the contract it needs to satisfy with respect to the rest of the program. (The textbook pirovides a lot of useful information on modular programming in sections 5.1 and 5.2.)

Modules in OCaml:

  • Each file defines a separate module (for example, file.ml defines the module File).

  • Modules can be defined inside other modules (with the syntax module Name = struct …​ end).

  • Modules act as name spaces preventing possible name collisions.

  • Module signatures (sig …​ and) allow to define the module’s public interface, thus specifying what parts of the module can be used and called by the outside code.

  • In C++, one can use classes and namespaces for code organization. In OCaml, you should use modules in the same situations. Exception can be made for very small classes, which can be represented by simple tuples or variant types alone.

Example 1: A functional stack data type with simple user interface

example1.ml
(* stack *)
let empty = []

let push v st = v :: st

let pop st = match st with
  | hd :: tl -> Some (hd, tl)
  | [] -> None

let print st =
  Printf.printf "Stack: [ ";
  List.iter (Printf.printf "%i ") st;
  Printf.printf "]\n"

(* main *)
let () =
  let rec loop stack =
    print stack;
    print_string ": ";

    let line = read_line () in
    if line = "pop" then
      match pop stack with
      | Some (a, stack2) ->
          Printf.printf " -----> popped %i\n" a;
          loop stack2
      | None ->
          loop stack
    else
      stack |> push (int_of_string line) |> loop
  in
  loop empty
ocamlfind ocamlopt -o example1 example1.ml
Stack: [ ]
: 10
Stack: [ 10 ]
: 20
Stack: [ 20 10 ]
: 30
Stack: [ 30 20 10 ]
: pop
 -----> popped 30
Stack: [ 20 10 ]
: pop
 -----> popped 20
Stack: [ 10 ]
: 40
Stack: [ 40 10 ]

Example 2: Define module Stack

We can group the stack functionality in a module Stack. Outside the module, its functions and types can be accessed with Stack. prefix:

example2.ml
module Stack = struct
  let empty = []

  let push v st = v :: st

  let pop st = match st with
    | hd :: tl -> Some (hd, tl)
    | [] -> None

  let print st =
    Printf.printf "Stack: [ ";
    List.iter (Printf.printf "%i ") st;
    Printf.printf "]\n"
end

let () =
  let rec loop stack =
    Stack.print stack;
    print_string ": ";

    let line = read_line () in
    if line = "pop" then
      match Stack.pop stack with
      | Some (a, stack2) ->
          Printf.printf " -----> popped %i\n" a;
          loop stack2
      | None ->
          loop stack
    else
      stack |> Stack.push (int_of_string line) |> loop
  in
  loop Stack.empty

Example 3: Divide the program into two files

The Stack module can be separated into a new file stack.ml. The file name automatically determines the name of the module, Stack, this is how you can access it from other parts of your program.

stack.ml
let empty = []

let push v st = v :: st

let pop st = match st with
  | hd :: tl -> Some (hd, tl)
  | [] -> None

let print st =
  Printf.printf "Stack: [ ";
  List.iter (Printf.printf "%i ") st;
  Printf.printf "]\n"
main.ml
let () =
  let rec loop stack =
    Stack.print stack;
    print_string ": ";

    let line = read_line () in
    if line = "pop" then
      match Stack.pop stack with
      | Some (a, stack2) ->
          Printf.printf " -----> popped %i\n" a;
          loop stack2
      | None ->
          loop stack
    else
      stack |> Stack.push (int_of_string line) |> loop
  in
  loop Stack.empty
ocamlfind ocamlopt -o example3 stack.ml main.ml

Modules and scope

module Person = struct
  type person = {name: string; supervisor: person option}
  (* the name of the person and the information who is their supervisor *)

  let name p = p.name

  (* We say that there two types of persons:
     employees who have supervisors, and bosses who don't have supervisors *)
  let make_employee n s = {name = n; supervisor = Some s}
  let make_boss n = {name = n; supervisor = None}

  let rec find_boss p =
    match p.supervisor with
    | Some p2 -> find_boss p2
    | None -> p
end

Functions and types are accessible using Person. prefix:

let () =
  let bob = Person.make_boss "Bob" in
  let dave = Person.make_employee "Dave" bob in
  let mark = Person.make_employee "Mark" dave in
  mark |> Person.find_boss |> Person.name |> print_endline
Bob
type group =
    One of Person.person
  | Two of (Person.person * Person.person)
  | Many of (Person.person list)

Accessing fields of a record defined inside a module

If a record type is defined in a module M, its fields are accessible by prefixing them with M.:

  • creating new records: {M.field1 = a ; filed2 = b; …​},

  • accessing fields of an existing record: x.M.field. Example:

let () =
  let alice = {Person.name = "Alice"; supervisor = None} in
  let carol = {Person.name = "Carol"; supervisor = Some alice} in
  print_endline alice.Person.name;
  print_endline ((Person.find_boss carol).Person.name)

Notice that when creating new records, we don’t have to specify the module name for each field, it is enough to do so only for the first field. In other words, the following code is correct, but redundant:

{Person.name = "Alice"; Person.supervisor = None}

Having said that record fields can be accessed with the x.M.field syntax, you often get cleaner programming interface if you keep the fields hidden from the end user of your module, and provide functional interface to access and modify the fields you need.

Compare:

let () =
  let alice = {Person.name = "Alice"; supervisor = None} in
  let carol = {Person.name = "Carol"; supervisor = Some alice} in
  print_endline alice.Person.name;
  print_endline ((Person.find_boss carol).Person.name)
let () =
  let alice = Person.make_boss "Alice" in
  let carol = Person.make_employee alice in
  print_endline (alice |> Person.name);
  print_endline (carol |> Person.find_boss |> Person.name)

The second solution is preferred.

It is possible to use a module signature to specify the module interface, hiding private information about the module internal structure and its implementation from the user.

Then, if the type Person.person changes at any point in the future (by gaining new fields or becoming a variant type, for example), a user of the module would not notice that change, as long as the module interface remained the same.

Opening modules

let () =
  let bob = Person.make_boss "Bob" in
  let dave = Person.make_employee "Dave" bob in
  let mark = Person.make_employee "Mark" dave in
  mark |> Person.find_boss |> Person.name |> print_endline
open Person

let () =
  let bob = make_boss "Bob" in
  let dave = make_employee "Dave" bob in
  let mark = make_employee "Mark" dave in
  mark |> find_boss |> name |> print_endline

This approach of simply opening the whole module is discouraged. It pollutes the namespace, and makes unfamiliar programs hard to understand: if multiple modules are open, it becomes non-obvious where each function is coming from.

2) Open the module locally:

let () =
  let open Person in
  let bob = make_boss "Bob" in
  let dave = make_employee "Dave" bob in
  let mark = make_employee "Mark" dave in
  mark |> find_boss |> name |> print_endline

3) Giving the module a short alias name:

module P = Person

let () =
  let bob = P.make_boss "Bob" in
  let dave = P.make_employee "Dave" bob in
  let mark = P.make_employee "Mark" dave in
  mark |> P.find_boss |> P.name |> print_endline

4) M.(e) syntax:

There is a syntax sugar M.(e) for local module opening that helps to delimit the expression where the module is opened:

let () =
  Person.( let bob = make_boss "Bob" in
    let dave = make_employee "Dave" bob in
    let mark = make_employee "Mark" dave in
    mark |> find_boss |> name )
  |> print_endline

This is the preferred way for opening modules. See another example with module List:

let sq x = x * x in
let large x = x > 5 in
List.( [1;2;3;4] |> map sq |> filter large |> fold_left (+) 0 ) ;;
- : int = 25

Avoid global module opening.
M.(e) syntax is recommended.

Signatures

See textbook, section 5.5.

Functional Data Structures

See textbook, sections 5.6 - 5.11.