When a function cannot return an answer
Example 1: Maximum element
How to implement a function that finds the maximum element in a list?
One possible solution:
let rec list_max ls =
match ls with
| [a] -> a
| hd :: tl -> max hd (list_max tl)
let () =
let m = list_max [10; 12; 45; 34] in
Printf.printf "Largest element is %i\n" m
This function will crash with an exception |
Example 2: Index of an element in a list
Find the index of an element v
in a list ls
. Indices start at 0.
let index v ls =
let rec find i ls =
match ls with
| hd :: tl -> if hd = v then i else find (i+1) tl
| [] -> (-1) (* can we do better than returning -1? *)
in
find 0 ls
let () =
let v = 12 in
let i = index v [10; 12; 45; 34] in
if i = -1 then
Printf.printf "Not found\n"
else
Printf.printf "Found at the index %i\n" i
You can only rely on the programmer’s diligence to check for the error code |
NULL References: the billion dollar mistake.
Here is what Tony Hoare has to say about NULL
reference (or pointer), which he implemented in Algol W
programming language back in 1965:
(Watch his talk here: Null References: The Billion Dollar Mistake, it is very good.)
I call it my billion-dollar mistake. It was the invention of the null reference in 1965. At that time, I was designing the first comprehensive type system for references in an object oriented language (ALGOL W). My goal was to ensure that all use of references should be absolutely safe, with checking performed automatically by the compiler. But I couldn’t resist the temptation to put in a null reference, simply because it was so easy to implement. This has led to innumerable errors, vulnerabilities, and system crashes, which have probably caused a billion dollars of pain and damage in the last forty years.
Option type
The option type, which is defined as follows (we will talk about type declarations in the next lecture):
type 'a option = None | Some of 'a
can assume two possible values:
-
None
, which means: there is no answer, -
Some v
, which means: here is the answerv
, wherev
can be any OCaml value.
let index v ls =
let rec find i ls =
match ls with
| hd :: tl -> if hd = v then Some i else find (i+1) tl
| [] -> None
in
find 0 ls
let () =
let v = 12 in
match index v [10; 12; 45; 34] with
| Some i -> Printf.printf "Found at the index %i\n" i
| None -> Printf.printf "Not found\n"
If the solution is not found, you return None
, otherwise, if the solution is x
, you return Some x
.
Since the function returns int option
instead of simple int
,
its return values are not numbers anymore, they take a form of Some _
or None
and can be distinguished using pattern matching.
let rec list_max = function
| [] -> None
| hd::tl ->
begin match list_max tl with
| None -> Some hd
| Some m -> Some (max hd m)
end
let () =
match list_max [10; 12; 45; 34] with
| Some m -> Printf.printf "Largest element is %i\n" m
| None -> Printf.printf "The list is empty\n"
Since the return value is Some _
or None
, we have to match it in the recursive call, so need to use nested pattern matching.
Another variant of list_max
, where we use a helper function with an accumulator variable best
, which tracks what is the largest element so far:
let list_max ls =
match ls with
| [] -> None
| hd :: tl ->
let rec find_max best ls =
match ls with
| [] -> best
| hd :: tl -> find_max (max hd best) tl
in
Some (find_max hd tl)
Exercises
List of square roots.
square_roots [4.0; -9.0; 5.0; -8.0]
--> [Some 2.; None; Some 2.23606797749979; None]
(You can use the function sqrt : float → float
for computing actual square roots.)
🐤 click to open
let rec square_roots ls =
match ls with
| hd :: tl ->
let x =
if hd >= 0.0 then Some (sqrt hd) else None
in
x :: square_roots tl
| [] -> []
Max of two lists. Assume the function list_max
is provided, write another function that takes two lists ls1
and ls2
and finds the largest number among elements of both lists.
list_max2 [12; 35; 7] [18; 22] --> Some 35
list_max2 [] [22] --> Some 22
list_max2 [] [] --> None
🐤 click to open
let list_max2 ls1 ls2 =
match list_max ls1, list_max ls2 with
| Some x, Some y -> Some (max x y)
| Some x, None -> Some x
| None, Some y -> Some y
| None, None -> None
let list_max2 ls1 ls2 =
match list_max ls1, list_max ls2 with
| Some x, Some y -> Some (max x y)
| optx, None -> optx (* captures None,None and Some_,None *)
| None, opty -> opty (* captures None,Some_ *)
Maximum positive increase between two consecutive elements in a list. Return None
if there is no increase in the list, either
because the list is too short or because all consecutive changes are not positive.
max_incr [12; 35; 77; -72] --> Some 42 (* 35 -> 77 *)
max_incr [200; 100; 15] --> None (* no increase *)
max_incr [123; 123; 123] --> None (* no increase *)
max_incr [10] --> None (* no increase *)
🐤 click to open
let rec filter_pos_diff ls =
match ls with
| a :: b :: tl ->
let diff = b - a in
if diff > 0 then diff :: filter_pos_diff (b :: tl) else []
| _ -> []
let max_incr ls = list_max (filter_pos_diff ls)
let rec max_incr ls =
match ls with
| a :: b :: tl ->
let diff = b - a in
let tl_answer = max_incr (b::tl) in
if diff > 0 then
match tl_answer with
| Some mdiff -> Some (max diff mdiff)
| None -> Some diff
else
tl_answer
| _ -> None
Tail call optimization
Consider two functions:
let rec sum ls =
match ls with
| hd :: tl -> hd + sum tl
| [] -> 0
let rec sum2 acc ls =
match ls with
| hd :: tl -> sum2 (acc + hd) tl
| [] -> acc
# sum [10; 20; 30; 40] ;;
- : int = 100
# sum2 0 [10; 20; 30; 40] ;;
- : int = 100
A function is tail recursive if it calls itself recursively but does not perform any computation after the recursive call returns, and immediately returns to its caller the value of its recursive call. The recursive calls in such functions are called tail calls.
Which of the above two function is tail recursive, sum
or sum2
?
🐤 click to open
sum2
is tail recursive.
Functional languages like OCaml optimize tail calls so that
the execution does not create a new call stack frame for each call. Instead, each successive call is reusing the same call stack frame:
the caller’s stack-frame is popped before the call—the callee’s stack-frame just replaces the caller’s.
(Modern C and C++ compilers also implement this optimization, in gcc
it is turned on with the -O3
flag.)
No optimization (function sum
)
Tail call optimization (function sum2
)
With tail call optimization, the used space is O(1) instead of O(n). That’s a hugely useful feature, sometimes making a recursive program as efficient as a program written using a while loop in an imperative language.
When you have a choice between using a tail-recursive vs. non-tail-recursive function, you are likely better off using the tail-recursive function on really long lists to achieve space efficiency. For that reason, the List module documents which functions are tail recursive and which are not.
Tail recursive programs can also implement infinite loops, where non-tail-recursive programs would fail:
let rec iterate i =
Printf.printf "%i " i;
iterate (i+1);
()
# iterate 0 ;; 0 1 2 3 4 ... (skip) ... 262045 262046 262047 262048 262049 262050 262051 Stack overflow during evaluation (looping recursion?).
let rec iterate_tail i =
Printf.printf "%i " i;
iterate_tail (i+1)
# iterate_tail 0 ;; 0 1 2 3 4 ... (This is an infinite loop, it will run forever)
Tail recursive functions are conceptually similar to loops. They iterate through a sequence of steps. However, not all computation is iterative like that. Computationally hard problems require several recursive calls (like the naive Fibonacci number algorithm does). If the function must make more than one recursive call, it cannot be made tail recursive. (Which also implies exponential running time, such as O(2n).)
An example of such an inherently non-loop-like problem is the Partition problem: Given a list, you have to decide if it is possible to divide the list into two parts whose sums are equal
Can you design a recursive solution? |
Generally, try writing tail recursive code rather than non-tail-recursive whenever possible. Especially if processing large amounts of data.
Exercises
A problem with length
and list_of_range
.
let rec list_of_range l r =
if l <= r then
l :: list_of_range (l+1) r
else
[]
let rec length ls =
match ls with
| hd :: tl -> 1 + length tl
| [] -> 0
The following code crashes:
# length (list_of_range 1 1000000) ;; Stack overflow during evaluation (looping recursion?).
How to fix that?
Define both functions in the tail recursive manner 🐤 click to open
let list_of_range l r =
let rec next acc mid =
if l <= mid then
next (mid :: acc) (mid-1)
else
acc
in
next [] r
let length ls =
let rec next acc ls =
match ls with
| hd :: tl -> next (acc+1) tl
| [] -> acc
in
next 0 ls
# length (list_of_range 1 1000000) ;;
- : int = 1000000
Factorial: non-tail-recursive and tail-recursive.
Implement the factorial function in the non-tail-recursive and tail-recursive way.
fact 5 --> 120
fact_tail 5 --> 120
🐤 click to open
let rec fact n =
if n <= 1 then 1
else n * fact (n-1)
let fact_tail n =
let rec product acc n =
if n <= 1 then
acc
else
product (acc * n) (n - 1)
in
product 1 n
Higher-order functions: map, filter, iter
Map
All functions that transform each element of a list, producing a new list of the same length, can be generalized using the map
function.
Consider the following two examples:
increment_all [1; 2; 3; 4; 5] --> [2; 3; 4; 5; 6]
parity [1; 2; 3; 4; 5] --> [true; false; true; false; true]
let rec increment_all ls =
match ls with
| hd :: tl -> (hd + 1) :: increment_all tl
| [] -> []
let rec parity ls =
match ls with
| hd :: tl -> (hd mod 2 <> 0) :: parity tl
| [] -> []
If we introduce helper functions, the structure of the recursive functions are identical and differ only in the transformation function that is applied to each element of the list:
let inc x = x + 1
let par x = x mod 2 <> 0
let rec increment_all ls =
match ls with
| hd :: tl -> (inc hd) :: increment_all tl
| [] -> []
let rec parity ls =
match ls with
| hd :: tl -> (par hd) :: parity tl
| [] -> []
We can define a general function map
that receives two parameters:
a function f
that can transform one element at a time, such as inc
or par
,
and a list ls
, which we want to manipulate:
let rec map f ls =
match ls with
| hd :: tl -> (f hd) :: map f tl
| [] -> []
The full type signature of the function:
It means that the function takes two parameters: a function that maps |
Then increment_all
and parity
are specific cases of map
and can be implemented as:
let inc x = x + 1
let par x = x mod 2 <> 0
let increment_all ls = map inc ls
let parity ls = map par ls
Alternatively, we can pass anonymous functions for the first argument, achieving the same result:
let increment_all ls = map (fun x -> x + 1) ls
let parity ls = map (fun x -> x mod 2 <> 0) ls
Also see a Stack Overflow question on how to implement a tail recursive map
,
using rev
function that reverses the accumulated list.
Filter
Function filter
is a generalization of a function like
keep_even [1; 2; 3; 4; 5] --> [2; 4]
If the condition holds (number is even), you keep the element, and discard it otherwise:
let rec keep_even ls =
match ls with
| hd :: tl ->
if hd mod 2 = 0 then
hd :: keep_even tl
else
keep_even tl
| [] -> []
A general filtering function that requires a predicate function f
that tests whether the elements should be kept or not:
let rec filter f ls =
match ls with
| hd :: tl ->
if f hd then
hd :: filter f tl
else
filter f tl
| [] -> []
The type signature of the function is
The first argument is the predicate |
In programming, a function that returns |
Usage examples:
filter (fun x -> x > 0) [10; -15; 12; 13]
--> [10; 12; 13] (* keep positive *)
filter (fun x -> x mod 2 = 0) [10; -15; 12; 13]
--> [10; 12] (* keep even *)
Iter
If we want to perform some side-effect computation on each element of the list, for example print it out:
let rec print ls =
match ls with
| hd :: tl ->
Printf.printf "%i " hd;
print tl
| [] -> ()
# print [10; 30; 50; 28; 17] ;; 10 30 50 28 17 - : unit = ()
This computation is a specific case of the function iter
:
let rec iter f ls =
match ls with
| hd :: tl ->
f hd;
iter f tl
| [] -> ()
The type signature of the function is
The first argument is a function |
A usage example:
let print_one_pair (x,y) =
Printf.printf "%i * %i = %i\n" x y (x*y)
let () =
iter print_one_pair [(2,5); (3,-7); (11, 11)]
Prints out:
2 * 5 = 10 3 * -7 = -21 11 * 11 = 121
Exercise
Transform the string "12 -5 8 100 -12"
into the list of integers keeping only positive numbers:
positive "12 -5 8 100 -12" --> [12; 8; 100]
Some functions you can use:
-
String.split_on_char c s
that splits the strings
on characterc
, returning a list of strings, -
int_of_string s
that converts the strings
into integer
The same solution written in a more explicit way: 🐤 click to open
let positive s =
filter (fun x -> x > 0) (map int_of_string (String.split_on_char ' ' s))
let positive s =
let ls = String.split_on_char ' ' s in
let nums = map int_of_string ls in
filter (fun x -> x > 0) nums
Module List
Explore the functions in the module List. We already know many of them:
-
List.length
-
List.append
-
List.flatten
-
List.rev
-
List.mem
(we called itcontains
) -
List.map
-
List.filter
-
List.iter
There are more functions in this module, spend time exploring it.
You are ready, and please feel free, to use any of the functions from the module List.