Thunk and delayed computation
Delayed computation is a code put in an anonymous function:
(fun () -> ... )
Such anonymous function is commonly called thunk. It represents delayed computation, because the code it carries is not evaluated until you actually call it.
For example, one can define:
let t =
(fun () ->
let pi = 4.0 *. atan 1.0 in
let area = pi *. 15.0 *. 15.0 in
Printf.printf "log: computed area %g\n" area;
Some area
)
in
and then call it elsewhere:
t()
Sequence (Delayed functional iterator)
If you want to create a sequence of elements, but don’t want to generate the whole thing right away (maybe because it is too long, possibly even infinite, or because it involves a lot of computation), you can use lazy evaluation, or you can use delayed computation. We will consider delayed computation case.
Define two types, 'a seq
and 'a node
:
type 'a seq = unit -> 'a node
and 'a node =
Nil
| Cons of ('a * 'a seq)
These are mutually recursive types, so we use and
keyword to define them both simultaneously.
-
'a node
represents a node of our delayed sequence — (head and tail) or empty. -
'a seq
is the first node of the sequence stored as a delayed computation.
Iterate over a sequence of integers and print its elements:
let rec print seq =
match seq () with
| Cons (x, rest) ->
Printf.printf "%i " x;
print rest
| Nil -> ()
In the following examples we will use notation |
Infinite constant sequence
const n --> (n; n; n; ...)
let rec const n =
(fun () ->
Cons (n, const (n))
)
Infinite sequence of natural numbers (0; 1; 2; 3; …)
(not perfect, since it will eventually reach integer overflow):
nats --> (0; 1; 2; 3; ...)
🐤 click to open
let nats =
let rec seq_ints n =
(fun () ->
Cons (n, seq_ints (n+1))
)
in
seq_ints 0
Take n
first elements from seq
:
take 10 nats --> (0; 1; 2; 3; 4; 5; 6; 7; 8; 9)
🐤 click to open
let rec take n seq =
(fun () ->
if n > 0 then
match seq() with
| Cons (x, rest) -> Cons (x, take (n-1) rest)
| Nil -> Nil
else
Nil
)
Join two sequences
join (take 4 nats) (take 3 (const 0)) --> (0; 1; 2; 3; 0; 0; 0)
join nats nats |> take 5 --> (0; 1; 2; 3; 4)
(* note that joining two infinite sequence does not create infinite loop *)
🐤 click to open
let rec join seq1 seq2 =
(fun () ->
match seq1 () with
| Cons (x, rest) -> Cons(x, join rest seq2)
| Nil -> seq2 ()
)
Module Seq
In OCaml, delayed sequence type is implemented in the standard library in the module |
For example:
module M = Map.Make(String)
let () =
let fruits = M.empty
|> M.add "apple" 10
|> M.add "orange" 7
|> M.add "grapes" 18
|> M.add "pineapple" 41
|> M.add "cherry" 12
in
fruits
|> M.to_seq
|> Seq.iter (fun (k,v) ->
Printf.printf "%s %i\n" k v
);
print_newline();
fruits
|> M.to_seq
|> Seq.filter (fun (k,v) -> v > 15)
|> Seq.iter (fun (k,v) ->
Printf.printf "%s %i\n" k v
)
apple 10 cherry 12 grapes 18 orange 7 pineapple 41 grapes 18 pineapple 41
Note that the elements are sorted by the key (the fruit name). This is because the function M.to_seq
produces the sequence of its key-value pairs sorted by the key (in-order iteration).
Memoization
Memoized function caches its return values so that in the future, instead of recomputing it again, it simply looks up for the cached answer.
We will use hash tables (module
|
For example, cached Fibonacci numbers:
let fib =
(* make a hash table for caching *)
let ht = Hashtbl.create 10 in
Hashtbl.replace ht 0 0;
Hashtbl.replace ht 1 1;
let rec f n =
match Hashtbl.find_opt ht n with
| Some v -> v (* return answer it found *)
| None -> (* otherwise: *)
let v = f (n-1) + f (n-2) in (* compute *)
Hashtbl.replace ht n v; (* save *)
v (* return *)
in
f
fib 50 --> 12586269025
Higher-order functions for memoization
We can define higher order functions, mem
, and mem_rec
to turn any function into its memoized version.
Non-recursive memoization
let mem f =
let ht = Hashtbl.create 10 in
( fun x ->
try Hashtbl.find ht x with
| Not_found ->
let res = f x in
Hashtbl.replace ht x res;
res
)
Example: Memoized square roots of integer numbers:
let sqrt_of_int = mem (fun x -> sqrt (float x))
sqrt_of_int 123 --> 11.0905365064094177
Recursive memoization
let mem_rec f =
let ht = Hashtbl.create 10 in
let rec self =
( fun x ->
try Hashtbl.find ht x with
| Not_found ->
let res = f self x in
Hashtbl.replace ht x res;
res
)
in
self
Example: Memoized Fibonacci numbers:
let fib =
mem_rec
(fun self n ->
if n < 2 then n
else self (n-1) + self (n-2)
)
fib 50 --> 12586269025
Additional resources and reading:
-
Chapter 8.27
-
Real World OCaml: Imperative Programming: Memoization and Dynamic Programming
Monads
Consider the following program that reads the filename from the first command line argument, opens that file, and reads an integer number from it:
let get_arg n =
if Array.length Sys.argv <= n then None
else Some (Sys.argv.(n))
let open_file filename =
try
Some (open_in filename)
with
| Sys_error _ -> None
let input_int_opt chan =
try
let ib = Scanning.from_channel chan in
Some (Scanf.bscanf ib "%d" (fun x -> x))
with
| _ -> None
let read_int_from_file () =
match get_arg 1 with
| None -> None
| Some filename ->
begin match open_file filename with
| None -> None
| Some chan ->
begin match input_int_opt chan with
| None -> None
| Some int -> Some int
end
end
let () =
match read_int_from_file () with
| Some x -> Printf.printf "%i\n" x
| None -> Printf.printf "Error\n"
If the integer is read successfully, the program prints it out:
$ ./a.out good-file 1234
Otherwise, if something goes wrong (there is no command-line argument, the file does not exist,
or there is no integer in the file), the program prints Error
:
$ ./a.out bad-file Error
The program works fine, but the function read_int_from_file
contains repetitive pattern matchings, and that can be improved.
Define a helper wrapping type 'a m
It is implemented as an option type with two additional operations, return
and >>=
(also called "bind"):
type 'a m = 'a option
(*type: 'a -> 'a m *)
let return x = Some x
(*type: 'a m -> ('a -> 'b m) -> 'b m *)
let (>>=) am f =
match am with
| Some a -> f a
| None -> None
Using this type (called Maybe monad), the read_int_from_file
function reduces to a much cleaner program:
let read_int_from_file () =
get_arg 1 >>= open_file >>= input_int_opt
Or somewhat more verbosely:
let read_int_from_file () =
get_arg 1 >>= fun filename ->
open_file filename >>= fun chan ->
input_int_opt chan
According to monad laws,
|
Monads are very useful in pure languages such as Haskell, often for representing imperative computation (state, IO, etc). However, they are also employed in non-pure languages such as OCaml. Examples include (see link):
|
Additional resources and reading:
-
Chapters 8.22 - 8.26
Lambda calculus (Theoretical foundation of functional programming)
Lambda calculus is a model of computation introduced by Alonzo Church in 1930s. It describes computation based on anonymous functions and their application. It lays the foundation of functional programming and many other fields of theoretical computer science and mathematical logic.
Basics
Consider an OCaml anonymous function:
(fun x -> (fun y -> x y))
It takes argument x
, then argument y
, and then applies x
to y
.
If we apply this function to two functions (fun w → w)
and u
, we will evaluate the whole expression to u
:
(fun x -> (fun y -> x y)) (fun w -> w) u
--> (fun y -> (fun w -> w) y) u
--> (fun w -> w) u
--> u
If you understand how the above evaluation works, you understand how Lambda calculus works. It turns out, all computations can be expressed by using this type of function definition (abstraction) and function application (substitution).
To rewrite the OCaml program in the example above as a Lambda calculus term, we have to do simple syntactic substitution:
-
replace
fun
withλ
(greek lambda), and -
replace
→
with.
(period)
So we get:
(λ x . (λ y . x y))
Following the OCaml example above, we can apply this lambda term to (λ w . w)
and u
, and it will reduce to u
,
exactly the same way as the OCaml program did:
(λ x . (λ y . x y)) (λ w . w) u
→ (λ y . (λ w . w) y) u
→ (λ w . w) u
→ u
The interesting question is: How to express familiar computation primitives using only lambda terms (anonymous function and their application)?
Infinite loop
Consider the following lambda term (λ x . x x)
. If you apply it to any other term M
, you will get:
(λ x . x x) M
→ M M
Well, things get interesting if that term M
is (λ x . x x)
, so we get:
(λ x . x x) (λ x . x x)
→ (λ x . x x) (λ x . x x)
→ (λ x . x x) (λ x . x x)
→ … it will keep doing the same reduction indefinitely.
Such indefinite loop program cannot be run in OCaml, because the type checker will complain about the type of # (fun x -> x x) (fun x -> x x) ;; Error: This expression has type 'a -> 'b but an expression was expected of type 'a The type variable 'a occurs inside 'a -> 'b In untyped lambda calculus, this is not a problem, and |
Local name binding let in
as lambda term
Consider the following example of let-in name binding:
let x = y in
f x
I can be rewritten as
(fun x -> f x) y
So, the corresponding lambda term is:
(λ x . f x) y
Additional resources and reading:
-
Introduction to Functional Programming, John Harrison (1999)
-
Foundations of Functional Programming, Lawrence C Paulson (2000)
Functional Programming in other languages
Aspects of functional programming:
-
First-class functions: functions can be saved in a variable, passed as an argument, or returned from a function.
-
Referential transparency (pure functions): function producing no side-effects and returning the same result every time you call it.
-
Tail-call optimization
-
Immutable data: persistent data types
-
Pattern matching
-
Variant types: such as Option type
Other aspects that are important in functional programming, but not limited to functional paradigm:
-
Lexical scoping and closures
-
Garbage collection
Modern general-purpose languages such as Kotlin, Rust, Swift, Clojure, Scala use many functional programming concepts. Old programming languages, such as Java, and C++, acquire functional-programming features too.
JavaScript
-
Hey Underscore, You’re Doing It Wrong! - video on functional JS on why in functional code you put a collection argument last, currying, and point-free programming.
JavaScript was created as a mix of Scheme and Self with a Java-style syntax. It had first-class functions from the very beginning, however its type system is too permissive to develop safe and robust code. It resulted in development of better functional languages such as TypeScript and Clojure (ClojureScript) that compile into JS, and new libraries such as Immutable.js, lodash/fp, and Ramda to provide other functional programming features in JavaScript.
OCaml compiles to JS too, using compilers js_of_ocaml (also called JSOO) and BuckleScript. Programming language Reason ML is an alternative syntax for OCaml that is designed for better interoperability with JavaScript, relying on BuckleScript compiler.