Common built-in types

Type

Description

Examples

int

Integer, 63-bit wide on most modern systems, from -262 to 262-1

0, 1, -1, 2, -2 …​

bool

Boolean

true or false

float

Float, 64-bit double precision floating-point number (IEEE 754 standard)

1.23, 5.32e-10, 7., …​

char

ASCII character

'a', 'b', …​ (not an integer type like in C and C++)

string

String, an immutable sequence of characters

"cat", "dog", "", …​

unit

Unit, similar to void in C. Unit is returned by functions doing side-effects, such as printing functions

(), only one value possible

For reference on these types and operations on them, see the Standard Library module Pervasives:

Explicit type conversion

There is no implicit type conversion in OCaml. Common type conversion functions can be found in the Pervasives module:

float_of_int

int → float

Convert an integer to floating-point.

int_of_float

float → int

Truncate the given floating-point number to an integer. The result is unspecified if the argument is nan or falls outside the range of representable integers.

int_of_char

char → int

Return the ASCII code of the argument.

char_of_int

int → char

Return the character with the given ASCII code. Raise Invalid_argument "char_of_int" if the argument is outside the range 0—​255.

string_of_bool

bool → string

Return the string representation of a boolean.

bool_of_string

string → bool

Convert the given string to a boolean. Raise Invalid_argument "bool_of_string" if the string is not "true" or "false".

string_of_int

int → string

Return the string representation of an integer, in decimal.

int_of_string

string → int

Convert the given string to an integer. Raise Failure "int_of_string" if the given string is not a valid representation of an integer, or if the integer represented exceeds the range of integers representable in type int.

string_of_float

float → string

Return the string representation of a floating-point number.

float_of_string

string → float

Convert the given string to a float. The _ (underscore) character can appear anywhere in the string and is ignored. Raise Failure "float_of_string" if the given string is not a valid representation of a float.

Printing

To print anything, you can use either the basic printing functions from the module Pervasives, or a more advanced Printf module, which we will discuss later.

Basic printing functions for each type

print_char

char → unit

Print a character on standard output.

print_string

string → unit

Print a string on standard output.

print_int

int → unit

Print an integer, in decimal, on standard output.

print_float

float → unit

Print a floating-point number, in decimal, on standard output.

print_endline

string → unit

Print a string, followed by a newline character, on standard output and flush standard output.

print_newline

unit → unit

Print a newline character on standard output, and flush standard output. This can be used to simulate line buffering of standard output.

# print_string "Hello world!" ;;
Hello world!- : unit = ()

# print_int (10 + 20) ;;
30- : unit = ()

To concatenate two strings, you can use the string concatenation operator ^:

"hello " ^ "world" --> "hello world"

"number " ^ (string_of_int 10) --> "number 10"

Unit type

Notice that all printing functions return type unit. The only value of this type is denoted by (). It represents nothing returned as a valid value, contrasting with languages like C, where void does not have a corresponding value.

Just like the empty set ∅ in mathematics is a valid set, the nothing returned situation is still something, and is represented by () of type unit.

When compared to a function call that does not return anything because it gets stuck in an infinite loop, because it crashes, or is simply not defined for the passed arguments, returning unit is a quite well-defined outcome.

Function that do not require arguments should receive () as an input value:

# let print_five () = print_string "5\n" ;;
val print_five : unit -> unit = <fun>

# print_five () ;;
5
- : unit = ()

The function print_newline is another example of such a unit → unit function. Because it just prints a new line character, it does not require any real arguments passed, and it does not need to return anything, which fits its type unit → unit.

Module Printf

Function printf of the module Printf works the same as printf if C:

# Printf.printf "%s world! %i\n" "Hello" 123 ;;
Hello world! 123
- : unit = ()

Function sprintf returns the produced text as a string value instead of printing it out:

# let s = Printf.sprintf "%s %i" "Hi" 1000 ;;
val s : string = "Hi 1000"

Exercise

Printing 3D coordinates. Given three floating-point numbers x, y, z, print them out as (x, y, z)

# print_3d_coords 2.4 1.33 5.0 ;;
(2.4, 1.33, 5.)- : unit = ()

🐤 click to open

Solution 1 using conversions:

let print_3d_coords x y z =
  let xs = string_of_float x in
  let ys = string_of_float y in
  let zs = string_of_float z in
  let msg = "(" ^ xs ^ ", " ^ ys ^ ", " ^ zs ^ ")" in
  print_string msg
# print_3d_coords 12.3 43.3 121.4e10 ;;
(12.3, 43.3, 1.214e+12)- : unit = ()

Solution 2 using Printf:

let print_3d_coords x y z =
    Printf.printf "(%g, %g, %g)" x y z
# print_3d_coords 12.3 43.3 121.4e10 ;;
(12.3, 43.3, 1.214e+12)- : unit = ()

How to execute a sequence of printing functions?

Actually, we already know one mechanism to do that: Let-in binding. It wouldn’t look very pretty though:

let print_greeting name =
  let res1 = print_string "Hello " in
  let res2 = print_string name in
  print_string "!\n"
# print_greeting "Alice" ;;
Hello Alice!
- : unit = ()

However, it is recommended though to use the semicolon operator ; instead.

Semicolon operator

Multiple expressions, whose results can be discarded, can be joined by a ; operator:

let print_greeting name =
  print_string "Hello ";
  print_string name;
  print_string "!\n"
# print_greeting "Bob" ;;
Hello Bob!
- : unit = ()

Notice the lack of ; after the final print_string "!\n", because ; is not a statement terminator like in C-style languages. e1 ; e2 joins two expressions, not terminating an expression.

When a compound expression e1; e2 is evaluated:

  • first e1 gets evaluated and its result is discarded,

  • after that e2 gets evaluated, and its value is returned

The OCaml semicolon operator ; is more like the comma operator , in C and C++.

Also, since ; is a right-associative operator, when you use multiple semicolons, the implicit order is:

a ; b ; c ; d   =   a ; (b ; (c ; d))

Which means, first compute a, discard it, then compute the rest. In the end, the result of d is returned.

A caveat on semicolons inside if expressions

When you want to use a semicolon inside a then or else clause of an if expression, you must put your expression in parentheses ( …​ ) or begin …​ end brackets:

let smaller a b =
  if a < b then (
    print_string "first is ";
    print_string "smaller"
  )
  else (
    print_string "second is ";
    print_string "smaller"
  )

Most OCaml style guides tend to suggest begin end in favor of parentheses in such a situation:

let smaller a b =
  if a < b then begin
    print_string "first is ";
    print_string "smaller"
  end else begin
    print_string "second is ";
    print_string "smaller"
  end

In all other cases, semicolons mean that the expression you compute continue and no parentheses are needed.

The observed behavior happens because if has higher precedence than ;. The are use cases though when this convention makes sense. Please see the example below:

let print_properties a =
  if a > 0 then Printf.printf "Positive\n";
  if a < 0 then Printf.printf "Negative\n";
  if a = 0 then Printf.printf "Zero\n";
  if a mod 2 = 0 then
      Printf.printf "Even\n"
    else
      Printf.printf "Odd\n";
  if a >= 0 then Printf.printf "Natural\n"
# print_properties 100 ;;
Positive
Even
Natural
- : unit = ()

# print_properties (-7) ;;
Negative
Odd
- : unit = ()

Exercises

Print a list of strings

print_list ["river"; "bridge"; "town"; "square"] --> ()
(* prints:

river
bridge
town
square

*)

🐤 click to open

let rec print_list ls =
  match ls with
  | hd :: tl ->
      print_endline hd;
      print_list tl
  | [] -> ()

Print a list of integers separated by commas (but no trailing or leading commas)

print_list_commas [15; 18; 23; 4; 11] --> ()
(* prints:  15, 18, 23, 4, 11  *)

🐤 click to open

let rec print_list_commas ls =
  match ls with
  | a :: b :: tl ->
      print_int a;
      print_string ", ";
      print_list_commas (b::tl)
  | a :: [] ->
      print_int a
  | [] -> ()

We are ready to write a full OCaml program and compile it

hello.ml
(* all necessary declarations go first *)
let hello name =
  Printf.printf "Hello %s!\n" name

let () =  (* "sort of" main function *)
  hello "world";
  hello "everyone"

Put all the necessary declarations first. Then, make the last declaration do I/O, it normally returns unit. You can treat it as as your main function.

$ ocamlopt -o hello hello.ml
$ ./hello
Hello world!
Hello everyone!

Tuples

How to make a function return more than one value?

One possibility is to make it return a list:

let max_and_min a b c =
  let big = max (max a b) c in
  let small = min (min a b) c in
  [big; small]
# max_and_min 3 7 1 ;;
- : int list = [7; 1]

However, if we try to use the list returned by the function, we will get a warning:

let their_sum =
  match max_and_min 3 7 1 with
  | a :: b :: [] -> a+b ;;

Warning 8: this pattern-matching is not exhaustive.
Here is an example of a case that is not matched:
(_::_::_::_|_::[]|[])

val their_sum : int = 8

The result is computed (equals 8), but the compiler complained that the pattern matching is not exhaustive: lists of length 0, 1, 3, or more, cannot be matched by that pattern matching expression. The type of the function max_and_min : 'a → 'a → 'a → 'a list cannot guarantee that it always returns a list with two elements.

An even bigger problem with returning a list is that all list elements must of the same type. You cannot mix several types in one list:

# let ls = [123; "cat"] ;;
Error: This expression has type string but an expression was expected of type
         int

Tuple types

When a fixed number of values of possibly different types must be grouped together, use tuples.

(1, "some text")
(10, 20.53, "Alice", [12; 343], "Bob")

Their types are:

(1, "some text") :
      int * string

(10, 20.53, "Alice", [12; 343], "Bob") :
      int * float * string * int list * string

Tuple values correspond to ordered pairs in mathematics and the tuple types correspond to Cartesian products. For example, an OCaml value (15, 1.765) : int * float would correspond to (15, 1.765) ∈ Integers × Floats.

Examples

let max_and_min a b c =
  let big = max (max a b) c in
  let small = min (min a b) c in
  (big, small)
# max_and_min 3 7 1 ;;
- : int * int = (7, 1)

# let pair = max_and_min 3 7 1 ;;
val pair : int * int = (7, 1)

The individual elements can be extracted from a tuple using another form of pattern-matching:

# let (a, b) = pair ;;
val a : int = 7
val b : int = 1

You can use a wildcard _ if you need only some of the values from a tuple:

# let (big, _) = max_and_min 3 7 1 ;;
val big : int = 7

# let (_, x, y, _, _) = ("Alice", "Bob", "Carol", 1995, 2055) ;;
val x : string = "Bob"
val y : string = "Carol"

Lists should be used to represent collections of elements of the same type, which can contain any number of elements (0+).

Tuples, on the other hand, should be used to represent collections of fixed size, which can contain values of any type. Also, you cannot iterate over elements of a tuple the way it is possible for lists.

Actually, all forms of name binding use pattern matching

The three functions below are identical, but show different ways how individual components can be extracted from a tuple:

let sum1 triple =
  match triple with
  | (a, b, c) -> a + b + c

let sum2 triple =
  let (a, b, c) = triple in
  a + b + c

let sum3 (a, b, c) =
  a + b + c

Variables a, b, and c are matched to the three elements of the tuple that is passed into the function.

The truth is that all forms of name binding are using pattern matching. Whenever a variable is bound to a value, it is matched through pattern matching, even in a trivial example like

let x = 15   (* here x is a trivial pattern *)
let _ = 15   (* we used a wildcard, so 15 was left
                not assigned to anything *)

Exercise

Perimeter and Area. Given the width and the height of a rectangle, find its perimeter and area:

perimeter_and_area 2 3 --> (10, 6)

🐤 click to open

let perimeter_and_area w h =
  let per = (w + h) * 2 in
  let area = w * h in
  (per, area)

Matching multiple values?

Same length. The function determines whether or not the two lists have the same length:

same_length [1;2;3] [5;6;7] --> true
same_length [1;2;3] [5;6;7;8] --> false

While one could use the function length, which we defined in the previous lecture (length ls1 = length ls2), such an implementation would be inefficient if one list is much shorter than the other, the running time will be O(n+m), where n and m are the lengths of the lists. How to achieve O(min(n, m)) running time?

  • if both lists are empty then they have equal length → true

  • if only one list is empty then they have different length → false

  • if both are non-empty → discard their heads and compare tails

10 :: 20 :: 30 :: 40 :: 50 :: []
33 :: 34 :: 35 :: []
-->
20 :: 30 :: 40 :: 50 :: []
34 :: 35 :: []
-->
30 :: 40 :: 50 :: []
35 :: []
-->
40 :: 50 :: []
[]
--> false

The question is: How to match the two lists? Can we use nested pattern matching?

let rec same_length ls1 ls2 =
  match ls1 with
  | hd1 :: tl1 ->
      begin match ls2 with
      | hd2 :: tl2 -> same_length tl1 tl2
      | [] -> false
      end
  | [] ->
      begin match ls2 with
      | _ :: _ -> false
      | [] -> true
      end

This works, however using nested pattern matching is generally discouraged, because there is some duplication of code and because we can do better.

Why don’t we match both lists at the same time? Or more specifically, let us match the tuple (ls1, ls2):

let rec same_length ls1 ls2 =
  match (ls1, ls2) with
  | (h1::t1, h2::t2) -> same_length t1 t2
  | ([], []) -> true
  | _ -> false

Were it is unambiguous, parentheses for tuples can be omitted, resulting in cleaner code:

let rec same_length ls1 ls2 =
  match ls1, ls2 with
  | h1::t1, h2::t2 -> same_length t1 t2
  | [], [] -> true
  | _ -> false

Another example with omitting parentheses:

# let x, y, _ = "cat", 15, "dog"  ;;
val x : string = "cat"
val y : int = 15

Tuples is a good way to return multiple values from a function or to assign several variables.

Exercises

"Similar" lists.

Def. Let us call two integers similar if they differ at most by one, |x − y| ≤ 1.
Def. Let us call two lists of integers similar if their corresponding elements are similar.

Define a function that checks the similarity of two lists:

similar [1; 20; 300] [1; 21; 299] --> true
similar [1; 20; 300] [3; 20; 298] --> false
similar [1; 20; 300] [1; 21] --> false

(You can use function abs: int → int for computing absolute value.)

🐤 click to open

let rec similar ls1 ls2 =
  match (ls1, ls2) with
  | h1::t1, h2::t2 ->
      (abs (h1-h2) <= 1) && similar t1 t2
  | [], [] -> true
  | _ -> false

Largest in each pair. Given a list of integer pairs, find the largest number in each pair:

max_in_each_pair [(1, 2); (3, 4); (100, 99); (5, -6)] --> [2; 4; 100; 5]

🐤 click to open

let rec max_in_each_pair ls =
  match ls with
  | (a, b) :: tl -> (max a b) :: max_in_each_pair tl
  | [] -> []

Print two lists of strings side by side. If one of the lists is shorter than the other, print --- instead of the missing elements:

# print_side_by_side ["C++"; "OCaml"; "Python"] ["static"; "static"; "dynamic"] ;;
C++ static
OCaml static
Python dynamic
- : unit = ()

# print_side_by_side ["cat"; "dog"; "fish"] ["purrs"; "barks"] ;;
cat purrs
dog barks
fish ---
- : unit = ()

# print_side_by_side [] ["purrs"; "barks"; "chirps"] ;;
--- purrs
--- barks
--- chirps
- : unit = ()

🐤 click to open

let rec print_side_by_side ls1 ls2 =
  match ls1, ls2 with
  | h1::t1, h2::t2 ->
      Printf.printf "%s %s\n" h1 h2;
      print_side_by_side t1 t2

  | h1::t1, [] ->
      Printf.printf "%s ---\n" h1;
      print_side_by_side t1 []

  | [], h2::t2 ->
      Printf.printf "--- %s\n" h2;
      print_side_by_side [] t2

  | [], [] -> ()

Using accumulators

How to write a function that reverses a list?

reverse [1; 20; 300; 50] --> [50; 300; 20; 1]

One inefficient way to solve it is to use the function append that was defined in the previous lecture for concatenating two lists, append [1; 2; 3] [5; 6] --> [1; 2; 3; 5; 6]:

let rec reverse ls =
  match ls with
  | hd::tl -> append (reverse tl) [hd]
  | [] -> []

However, the running time will be quadratic in the number of elements of the list:

    reverse [1; 2; 3; 4]
--> append (reverse [2; 3; 4]) [1]
--> append (append (reverse [3; 4]) [2]) [1]
--> append (append (append (reverse [4]) [3]) [2]) [1]
--> append (append (append (append (reverse []) [4]) [3]) [2]) [1]
--> append (append (append (append [] [4]) [3]) [2]) [1]
--> append (append (append [4] [3]) [2]) [1]
--> append (append [4; 3] [2]) [1]
--> append [4; 3; 2] [1]
--> [4; 3; 2; 1]

Each append ls1 ls2 operation, when evaluated, runs in O(k) time when k is the length of the list ls1. So, if the total number of elements in the input list for the reverse function is n, then the function will require 0 + 1 + 2 + …​ + (n-1) operations, which is equal to (n-1)n/2.

In functional programming, the number function calls is a good measure of the program complexity, since all loops are implemented as recursive function calls.

A better solution: Reversing a list using an accumulator argument

Consider the following implementation:

let rec reverse acc ls =
  match ls with
  | hd::tl -> reverse (hd::acc) tl
  | [] -> acc

Evaluation (runs in linear time now):

    reverse [] [1; 2; 3; 4]
--> reverse [1] [2; 3; 4]
--> reverse [2; 1] [3; 4]
--> reverse [3; 2; 1] [4]
--> reverse [4; 3; 2; 1] []
--> [4; 3; 2; 1]

However, notice that now to reverse a list, we needed to pass one more argument when calling the function:

reverse [] ls

This argument is initially [], but as we go through the list, the accumulator accumulates the list elements in reverse order, and in the end it is returned.

However, that extra [] is clearly an implementation detail, not a real part of the programming interface. Can we get rid of that []? There is an easy solution though to fix the problem: the function reverse can be wrapped into another function that calls it with [] for us:

let rev ls =
  let rec reverse acc ls =
    match ls with
    | hd::tl -> reverse (hd::acc) tl
    | [] -> acc
  in
  reverse [] ls
rev [1;2;3;4] --> [4;3;2;1]

Recursive Fibonacci numbers using accumulators

let fib n =
  match n with
  | 0 -> 0
  | 1 -> 1
  | _ ->
      let rec iter (a, b) i =
        if i = n then
          b
        else
          iter (b, a+b) (i+1)
      in
      iter (0, 1) 1

Exercises

Sum up all even and all odd elements.

sum_even_odd [1; 20; 3; 40; 100] --> (160, 4)

(Explanation: 20 + 40 + 100 = 160 and 1 + 3 = 4)

🐤 click to open

(* a helper function *)
let rec sum (evens, odds) ls =
  match ls with
  | hd::tl ->
      if hd mod 2 = 0 then
        sum (hd + evens, odds) tl
      else
        sum (evens, hd + odds) tl
  | [] -> (evens, odds)

let sum_even_odd ls =
  sum (0, 0) ls

An even better solution would define the helper function locally using let-in:

let sum_even_odd ls =
  let rec sum (evens, odds) ls =
    match ls with
    | hd::tl ->
        if hd mod 2 = 0 then
          sum (hd + evens, odds) tl
        else
          sum (evens, hd + odds) tl
    | [] -> (evens, odds)
  in
  sum (0, 0) ls

Using accumulator variables like evens and odds, which start empty, and accumulate its value over function execution is a very common technique in functional programming.

Mini-project: Sorting command-line arguments

Implement a program sort_argv that takes the command-line arguments (argv in C and C++) and prints them out sorted in the usual lexicographic order separated by spaces.

If run like this:

./sort_argv red blue black pink yellow green

Expected output:

black blue green pink red yellow

Hint 1: Comparison operatos < and <= are already defined to compare strings in the lexicographic order.

Hint 2: The command-line arguments can be obtained from the Sys.argv array, which you can transform into a list of strings using the function Array.to_list:

let argv_list = Array.to_list Sys.argv