Option type. Tail call optimization. Map, filter, iter

When a function cannot return an answer

Example 1: Maximum element

How to implement a function that finds the maximum element in a list?

One possible solution:

let rec list_max ls =
  match ls with
  | [a] -> a
  | hd :: tl -> max hd (list_max tl)

let () =
  let m = list_max [10; 12; 45; 34] in
  Printf.printf "Largest element is %i\n" m

This function will crash with an exception Match_failure if the list is empty. (And, of course, the compiler will give a warning at compile time that the pattern matching is not exhaustive.)

Example 2: Index of an element in a list

Find the index of an element v in a list ls. Indices start at 0.

let index v ls =
  let rec find i ls =
    match ls with
    | hd :: tl -> if hd = v then i else find (i+1) tl
    | [] -> (-1)  (* can we do better than returning -1? *)
  in
  find 0 ls

let () =
  let v = 12 in
  let i = index v [10; 12; 45; 34] in
  if i = -1 then
    Printf.printf "Not found\n"
  else
    Printf.printf "Found at the index %i\n" i

You can only rely on the programmer’s diligence to check for the error code -1 each time the function is used. Very likely, the function will be misused once in a while, possibly leading to the program crashing.

NULL References: the billion dollar mistake.

Here is what Tony Hoare has to say about NULL reference (or pointer), which he implemented in Algol W programming language back in 1965:

(Watch his talk here: Null References: The Billion Dollar Mistake, it is very good.)

I call it my billion-dollar mistake. It was the invention of the null reference in 1965. At that time, I was designing the first comprehensive type system for references in an object oriented language (ALGOL W). My goal was to ensure that all use of references should be absolutely safe, with checking performed automatically by the compiler. But I couldn’t resist the temptation to put in a null reference, simply because it was so easy to implement. This has led to innumerable errors, vulnerabilities, and system crashes, which have probably caused a billion dollars of pain and damage in the last forty years.

Option type

The option type, which is defined as follows (we will talk about type declarations in the next lecture):

type 'a option = None | Some of 'a

can assume two possible values:

None, which means: there is no answer,
Some v, which means: here is the answer v, where v can be any OCaml value.

let index v ls =
  let rec find i ls =
    match ls with
    | hd :: tl -> if hd = v then Some i else find (i+1) tl
    | [] -> None
  in
  find 0 ls

let () =
  let v = 12 in
  match index v [10; 12; 45; 34] with
  | Some i -> Printf.printf "Found at the index %i\n" i
  | None -> Printf.printf "Not found\n"

If the solution is not found, you return None, otherwise, if the solution is x, you return Some x.

Since the function returns int option instead of simple int, its return values are not numbers anymore, they take a form of Some _ or None and can be distinguished using pattern matching.

let rec list_max = function
  | [] -> None
  | hd::tl ->
      begin match list_max tl with
        | None -> Some hd
        | Some m -> Some (max hd m)
      end

let () =
  match list_max [10; 12; 45; 34] with
  | Some m -> Printf.printf "Largest element is %i\n" m
  | None -> Printf.printf "The list is empty\n"

Since the return value is Some _ or None, we have to match it in the recursive call, so need to use nested pattern matching.

Another variant of list_max, where we use a helper function with an accumulator variable best, which tracks what is the largest element so far:

let list_max ls =
  match ls with
  | [] -> None
  | hd :: tl ->
      let rec find_max best ls =
        match ls with
        | [] -> best
        | hd :: tl -> find_max (max hd best) tl
      in
      Some (find_max hd tl)

Exercises

List of square roots.

square_roots [4.0; -9.0; 5.0; -8.0]

--> [Some 2.; None; Some 2.23606797749979; None]

(You can use the function sqrt : float → float for computing actual square roots.)

🐤 click to open

let rec square_roots ls =
  match ls with
  | hd :: tl ->
      let x =
        if hd >= 0.0 then Some (sqrt hd) else None
      in
      x :: square_roots tl
  | [] -> []

Max of two lists. Assume the function list_max is provided, write another function that takes two lists ls1 and ls2 and finds the largest number among elements of both lists.

list_max2 [12; 35; 7] [18; 22] --> Some 35
list_max2 [] [22] --> Some 22
list_max2 [] [] --> None

🐤 click to open

let list_max2 ls1 ls2 =
  match list_max ls1, list_max ls2 with
  | Some x, Some y -> Some (max x y)
  | Some x, None -> Some x
  | None, Some y -> Some y
  | None, None -> None

A varaint

let list_max2 ls1 ls2 =
  match list_max ls1, list_max ls2 with
  | Some x, Some y -> Some (max x y)
  | optx, None -> optx  (* captures None,None and Some_,None *)
  | None, opty -> opty  (* captures None,Some_ *)

Maximum positive increase between two consecutive elements in a list. Return None if there is no increase in the list, either because the list is too short or because all consecutive changes are not positive.

max_incr [12; 35; 77; -72] --> Some 42  (* 35 -> 77 *)
max_incr [200; 100; 15] --> None        (* no increase *)
max_incr [123; 123; 123] --> None       (* no increase *)
max_incr [10] --> None                  (* no increase *)

🐤 click to open

using a helper filtering function

let rec filter_pos_diff ls =
  match ls with
  | a :: b :: tl ->
      let diff = b - a in
      if diff > 0 then diff :: filter_pos_diff (b :: tl) else []
  | _ -> []

let max_incr ls = list_max (filter_pos_diff ls)

a more direct solution

let rec max_incr ls =
  match ls with
  | a :: b :: tl ->
      let diff = b - a in
      let tl_answer = max_incr (b::tl) in
      if diff > 0 then
        match tl_answer with
        | Some mdiff -> Some (max diff mdiff)
        | None -> Some diff
      else
        tl_answer
  | _ -> None

Tail call optimization

Consider two functions:

let rec sum ls =
  match ls with
  | hd :: tl -> hd + sum tl
  | [] -> 0

let rec sum2 acc ls =
  match ls with
  | hd :: tl -> sum2 (acc + hd) tl
  | [] -> acc

# sum [10; 20; 30; 40] ;;
- : int = 100

# sum2 0 [10; 20; 30; 40] ;;
- : int = 100

A function is tail recursive if it calls itself recursively but does not perform any computation after the recursive call returns, and immediately returns to its caller the value of its recursive call. The recursive calls in such functions are called tail calls.

Which of the above two function is tail recursive, sum or sum2?

🐤 click to open

sum2 is tail recursive.

Functional languages like OCaml optimize tail calls so that the execution does not create a new call stack frame for each call. Instead, each successive call is reusing the same call stack frame: the caller’s stack-frame is popped before the call—the callee’s stack-frame just replaces the caller’s. (Modern C and C++ compilers also implement this optimization, in gcc it is turned on with the -O3 flag.)

No optimization (function `sum`)

Tail call optimization (function `sum2`)

With tail call optimization, the used space is O(1) instead of O(n). That’s a hugely useful feature, sometimes making a recursive program as efficient as a program written using a while loop in an imperative language.

When you have a choice between using a tail-recursive vs. non-tail-recursive function, you are likely better off using the tail-recursive function on really long lists to achieve space efficiency. For that reason, the List module documents which functions are tail recursive and which are not.

Tail recursive programs can also implement infinite loops, where non-tail-recursive programs would fail:

let rec iterate i =
  Printf.printf "%i " i;
  iterate (i+1);
  ()

# iterate 0 ;;
0 1 2 3 4 ... (skip) ... 262045 262046 262047 262048 262049 262050 262051
Stack overflow during evaluation (looping recursion?).

let rec iterate_tail i =
  Printf.printf "%i " i;
  iterate_tail (i+1)

# iterate_tail 0 ;;
0 1 2 3 4 ... (This is an infinite loop, it will run forever)

Tail recursive functions are conceptually similar to loops. They iterate through a sequence of steps. However, not all computation is iterative like that. Computationally hard problems require several recursive calls (like the naive Fibonacci number algorithm does). If the function must make more than one recursive call, it cannot be made tail recursive. (Which also implies exponential running time, such as O(2ⁿ).)

An example of such an inherently non-loop-like problem is the Partition problem: Given a list, you have to decide if it is possible to divide the list into two parts whose sums are equal

partition_exist [10; 22; 45; 18; 5] -> true
partition_exist [10; 22; 45; 18] -> false

partition_find [10; 22; 45; 18; 5] -> Some ([10;22;18], [45;5])
partition_find [10; 22; 45; 18] -> None

Can you design a recursive solution?

Generally, try writing tail recursive code rather than non-tail-recursive whenever possible. Especially if processing large amounts of data.

Exercises

A problem with length and list_of_range.

let rec list_of_range l r =
  if l <= r then
    l :: list_of_range (l+1) r
  else
    []

let rec length ls =
  match ls with
  | hd :: tl -> 1 + length tl
  | [] -> 0

The following code crashes:

# length (list_of_range 1 1000000) ;;
Stack overflow during evaluation (looping recursion?).

How to fix that?

🐤 click to open

Define both functions in the tail recursive manner

let list_of_range l r =
  let rec next acc mid =
    if l <= mid then
      next (mid :: acc) (mid-1)
    else
      acc
  in
  next [] r

let length ls =
  let rec next acc ls =
    match ls with
    | hd :: tl -> next (acc+1) tl
    | [] -> acc
  in
  next 0 ls

# length (list_of_range 1 1000000) ;;
- : int = 1000000

Factorial: non-tail-recursive and tail-recursive.

Implement the factorial function in the non-tail-recursive and tail-recursive way.

fact 5 --> 120
fact_tail 5 --> 120

🐤 click to open

non-tail-recursive

let rec fact n =
  if n <= 1 then 1
  else n * fact (n-1)

tail-recursive

let fact_tail n =
  let rec product acc n =
    if n <= 1 then
      acc
    else
      product (acc * n) (n - 1)
  in
  product 1 n

Higher-order functions: map, filter, iter

Map

All functions that transform each element of a list, producing a new list of the same length, can be generalized using the map function.

Consider the following two examples:

increment_all [1; 2; 3; 4; 5] --> [2; 3; 4; 5; 6]
parity [1; 2; 3; 4; 5] --> [true; false; true; false; true]

let rec increment_all ls =
  match ls with
  | hd :: tl -> (hd + 1) :: increment_all tl
  | [] -> []

let rec parity ls =
  match ls with
  | hd :: tl -> (hd mod 2 <> 0) :: parity tl
  | [] -> []

If we introduce helper functions, the structure of the recursive functions are identical and differ only in the transformation function that is applied to each element of the list:

let inc x = x + 1
let par x = x mod 2 <> 0

let rec increment_all ls =
  match ls with
  | hd :: tl -> (inc hd) :: increment_all tl
  | [] -> []

let rec parity ls =
  match ls with
  | hd :: tl -> (par hd) :: parity tl
  | [] -> []

We can define a general function map that receives two parameters: a function f that can transform one element at a time, such as inc or par, and a list ls, which we want to manipulate:

let rec map f ls =
  match ls with
  | hd :: tl -> (f hd) :: map f tl
  | [] -> []

The full type signature of the function:

val map : ('a -> 'b) -> 'a list -> 'b list

It means that the function takes two parameters: a function that maps ('a → 'b) and a list of elements of type 'a; and outputs a list of elements of type 'b.

Then increment_all and parity are specific cases of map and can be implemented as:

let inc x = x + 1
let par x = x mod 2 <> 0

let increment_all ls = map inc ls
let parity ls = map par ls

Alternatively, we can pass anonymous functions for the first argument, achieving the same result:

let increment_all ls = map (fun x -> x + 1) ls
let parity ls = map (fun x -> x mod 2 <> 0) ls

Also see a Stack Overflow question on how to implement a tail recursive map, using rev function that reverses the accumulated list.

Filter

Function filter is a generalization of a function like

keep_even [1; 2; 3; 4; 5] --> [2; 4]

If the condition holds (number is even), you keep the element, and discard it otherwise:

let rec keep_even ls =
  match ls with
  | hd :: tl ->
      if hd mod 2 = 0 then
        hd :: keep_even tl
      else
        keep_even tl
  | [] -> []

A general filtering function that requires a predicate function f that tests whether the elements should be kept or not:

let rec filter f ls =
  match ls with
  | hd :: tl ->
      if f hd then
        hd :: filter f tl
      else
        filter f tl
  | [] -> []

The type signature of the function is

val filter : ('a -> bool) -> 'a list -> 'a list

The first argument is the predicate 'a → bool that checks if an element is good to keep, and the second argument is the actual list.

In programming, a function that returns bool is called predicate. This terminology is borrowed from logic.

Usage examples:

filter (fun x -> x > 0) [10; -15; 12; 13]
  --> [10; 12; 13]   (* keep positive *)

filter (fun x -> x mod 2 = 0) [10; -15; 12; 13]
  --> [10; 12] (* keep even *)

Iter

If we want to perform some side-effect computation on each element of the list, for example print it out:

let rec print ls =
  match ls with
  | hd :: tl ->
      Printf.printf "%i " hd;
      print tl
  | [] -> ()

# print [10; 30; 50; 28; 17] ;;
10 30 50 28 17 - : unit = ()

This computation is a specific case of the function iter:

let rec iter f ls =
  match ls with
  | hd :: tl ->
      f hd;
      iter f tl
  | [] -> ()

The type signature of the function is

val iter : ('a -> unit) -> 'a list -> unit

The first argument is a function 'a → unit that performs a side-effect computation on an element, and the second argument is the list. The return type is unit.

A usage example:

let print_one_pair (x,y) =
  Printf.printf "%i * %i = %i\n" x y (x*y)

let () =
  iter print_one_pair [(2,5); (3,-7); (11, 11)]

Prints out:

2 * 5 = 10
3 * -7 = -21
11 * 11 = 121

Exercise

Transform the string "12 -5 8 100 -12" into the list of integers keeping only positive numbers:

positive "12 -5 8 100 -12" --> [12; 8; 100]

Some functions you can use:

String.split_on_char c s that splits the string s on character c, returning a list of strings,
int_of_string s that converts the string s into integer

🐤 click to open

let positive s =
  filter (fun x -> x > 0) (map int_of_string (String.split_on_char ' ' s))

The same solution written in a more explicit way:

let positive s =
  let ls = String.split_on_char ' ' s in
  let nums = map int_of_string ls in
  filter (fun x -> x > 0) nums

Module List

Explore the functions in the module List. We already know many of them:

List.length
List.append
List.flatten
List.rev
List.mem (we called it contains)
List.map
List.filter
List.iter

There are more functions in this module, spend time exploring it.

You are ready, and please feel free, to use any of the functions from the module List.

Option type. Tail call optimization. Map, filter, iter

When a function cannot return an answer

Example 1: Maximum element

Example 2: Index of an element in a list

NULL References: the billion dollar mistake.

Option type

Exercises

Tail call optimization

No optimization (function sum)

Tail call optimization (function sum2)

Exercises

Higher-order functions: map, filter, iter

Map

Filter

Iter

Exercise

Module List

No optimization (function `sum`)

Tail call optimization (function `sum2`)