Let’s get started with a code example
When learning a new programming language, it’s usually very helpful to see some example code in that language.
factorial.c - C program
|
factorial.ml — OCaml equivalent
|
How to compile and run the program?
$ ocamlopt -o factorial factorial.ml
$ ./factorial 3628800
The compilation command above uses the native code compiler ocamlopt
. There is also a byte code compiler ocamlc
, which is useful if you want to
debug your code, and an interactive REPL ocaml
, which is usually referred as toplevel. There is also a new community-developed toplevel that provides more
features called utop
.
|
interactive toplevel (REPL) (official documentation) |
|
byte code compiler |
|
native code compiler |
|
a new more feature-rich toplevel (REPL) |
In the beginning, we will be using a toplevel, ocaml
or utop
, for running our code interactively, and later we will switch to compilation.
However, it is good to know from the start that all of these options are available for building and running your programs.
Moreover, there is also a possibility to compile OCaml to JavaScript with js_of_ocaml
and BuckleScript compilers, and although discussing them in depth would probably go beyond the scope of this class,
you might want to use them in the future.
There are many powerful tools in the modern OCaml toolchain. For this course, in addition to the compilers, we will also be using the following programs: The textbook also describes how to use |
Expressions
Open terminal and start an interactive OCaml toplevel session.
(We use a Unix tool rlwrap
to make it more user-friendly, alternatively, if you have installed utop
, you can use it instead of the official toplevel,
then you don’t need rlwrap
.)
$ rlwrap ocaml
or
$ utop
We can use toplevel as a calculator, for instance here we compute several arithmetic expressions:
OCaml version 4.07.1
# 1 + 2 ;;
- : int = 3
# 10 + 3 * 5 ;;
- : int = 25
# (10 + 3) * 5 ;;
- : int = 65
Notice that we need to put double semicolon ;;
after each expression to tell the program process our input. This double semicolon
is not required in actual programs.
|
Think of a functional program as an expression, like (1 + 2) * (20 / 3)
, which gets evaluated
the same way as arithmetic expressions get computed in mathematics:
(1 + 2) * (20 / 3)
--> (3) * (20 / 3)
--> (3) * (6)
--> 18
There is no statements in functional programming, only expressions. Even if then else
conditions are expressions:
# 50 + (if 1 < 2 then 200 else 300) * 5 ;;
- : int = 1050
Parentheses are used to control the order of expression evaluation. Compare to:
# 50 + if 1 < 2 then 200 else 300 * 5 ;;
- : int = 250
# if 1 < 2 then
(if 100>20 then 5 else 10) + 15
else
145 ;;
- : int = 20
All comparison operators return a value of type |
In OCaml, there are also operators Instead, use |
Name binding
We define x
equal to 123
and then define a function square
that returns its argument squared:
# let n = 123 ;;
val n : int = 123
# n + n ;;
- : int = 246
# let square x = x * x ;;
val square : int -> int = <fun>
# square 12 ;;
- : int = 144
# square n ;;
- : int = 15129
Define a function max_of_two
that takes two arguments and returns the one that is larger:
# let max_of_two x y =
if x > y then
x
else
y ;;
val max_of_two : 'a -> 'a -> 'a = <fun>
# max_of_two 11 222 ;;
- : int = 222
# max_of_two 17 15 ;;
- : int = 17
# square (max_of_two 10 25) ;;
- : int = 625
Operator Precedence
As a rule of thumb, remember that in OCaml, the order of expression evaluation is as follows:
Because of that,
For details, see the operator precedence and associativity table in the reference manual. |
Keeping declarations in a file
Instead of typing declarations in the command line, let’s put them in a file, which we can call, for example, square_max.ml
:
let n = 123
let square x = x * x
let max_of_two x y =
if x > y then
x
else
y
Now, open toplevel again. The program we just wrote can be loaded in the toplevel using directive #use
:
$ rlwrap ocaml
OCaml version 4.07.1
# #use "square_max.ml" ;;
val n : int = 123
val square : int -> int = <fun>
val max_of_two : 'a -> 'a -> 'a = <fun>
# n + 50 ;;
- : int = 173
# square 12 ;;
- : int = 144
# square (5 + 2) ;;
- : int = 49
# max_of_two (square 5) (-4) ;;
- : int = 25
Type inference in OCaml
Why the type of the function square
is int → int
,
whereas for the function max_of_two
, it is 'a → 'a → 'a
?
In OCaml, types are inferred by the compiler if you don’t specify them yourself. The type inference algorithm is using the fact that each previously defined operator or function already has a type.
Since the multiplication operator *
has the type int → int → int
(takes two integers and returns an integer),
the square
function must have the type int → int
to conform to the type of *
.
On the other hand, the >
operator is polymorphic, which means it can compare not only integers, but also floats, strings, and any other
compound data types. The polymorphism of the >
operator is expressed in its type: 'a → 'a → bool
(here, 'a
is a type variable, which indicates that the operator arguments can be of any type).
Because of that, the type inference algorithm deduces that the function max_of_two
is also polymorphic,
and has the type 'a → 'a → 'a
.
Exercises
max_of_three x y z
-
the maximum of three arguments
is_even x
-
check if the argument is even
is_odd x
-
check if the argument is odd
|
is_leap_year y
-
check if
y
is a leap year. The logic of the decision is as follows:
if (year is not divisible by 4) then (it is a common year)
else if (year is not divisible by 100) then (it is a leap year)
else if (year is not divisible by 400) then (it is a common year)
else (it is a leap year)
Recursive declarations
Recursive function declarations must start with let rec
instead of simple let
:
let rec factorial n =
if n <= 1 then
1
else
factorial (n-1) * n
Exercises
sum_range a b
The sum of all integers in the range from a
to b
(that is, for all a
≤ x
≤ b
).
Example: sum_range 1 5
should be equal to 1 + 2 + 3 + 4 + 5 = 15
.
🐤 click to open
let rec sum_range a b =
if a > b then
0 (* base case: empty range *)
else
sum_range a (b-1) + b
prod_range a b
The product of all integers in the range from a
to b
🐤 click to open
let rec prod_range a b =
if a > b then
1
else
prod_range a (b-1) * b
num_digits x
The number of (decimal) digits in the integer x
. (Should work correctly for all integers.)
🐤 click to open
let rec num_digits n =
if abs n < 10 then
1
else
num_digits (n/10) + 1
sum_digits x
The sum of all (decimal) digits of x
Without using function 🐤 click to open
let rec sum_digits n =
if abs n < 10 then
abs n
else
sum_digits (n/10) + abs (n mod 10)
abs
:let rec sum_digits n =
if n < 0 then
sum_digits (-n)
else
if n < 10 then
n
else
sum_digits (n/10) + (n mod 10)
nth_digit i x
i
-th (decimal) digit of the integer x
. Assume digits are indexed left-to-right, starting at i
=1.
The function should return 0 for invalid values of i
.
Examples:
nth_digit 1 7025 = 7
nth_digit 2 7025 = 0
nth_digit 3 7025 = 2
nth_digit 4 7025 = 5
nth_digit 1 0 = 0
nth_digit 1 (-7025) = 7
nth_digit 3 (-7025) = 2
nth_digit (-1) 7025 = 0
nth_digit 0 7025 = 0
nth_digit 5 7025 = 0
binom n k
Binomial coefficient C(n,k), implemented recursively using Pascal’s identity.
Since this mathematical function is normally defined only for 0 ≤ k ≤ n,
make the function return zero for other values of n
and k
.
This extension of the definition is consistent with Pascal’s identity and the meaning of the coefficient
(for instance, there is 0 ways to select 10 items out of 9 available).
Examples:
binom 5 0 = 1
binom 5 1 = 5
binom 5 2 = 10
binom 5 100 = 0
A simpler solution that does not handle impossible cases of, e.g. k < 0 or k > n: A more complex solution that handles impossible cases: Or: 🐤 click to open
(* (n, k) = (n-1,k-1) + (n-1,k) *)
let rec binom n k =
if k = 0 then
1 (* Left edge of Pascal's tringle *)
else if k = n then
1 (* Right edge of Pascal's triangle *)
else
binom (n-1) (k-1) + binom (n-1) k
let rec binom n k =
if k < 0 || k > n then 0
else if k = 0 || k = n then 1
else binom (n-1) (k-1) + binom (n-1) k
let rec binom n k =
if k < 0 || k > n then 0
else if k = 0 && n = 0 then 1
else binom (n-1) (k-1) + binom (n-1) k
Functions are values. Anonymous functions
The following two function definitions are equivalent:
let square1 x = x * x
let square2 = (fun x -> x * x)
Functions are values, just like numbers.
(fun x -> x * x)
is a valid expression for a square function, and we save it in the variable square2
.
Example:
# let square2 = (fun x -> x * x) ;;
val square2 : int -> int = <fun>
# square2 12 ;;
- : int = 144
Moreover, we can apply (fun x -> x * x)
directly to an argument, and it will return that argument squared:
# (fun x -> x*x) 7 ;;
- : int = 49
Such functions that are created without giving them a name are called anonymous functions.
# (fun a b -> a < b) 100 200 ;;
- : bool = true
# (fun a b c -> a + b + c) 10 20 30 ;;
- : int = 60
# 5 + (fun a b c -> a + b + c) 10 20 30 * 2 ;;
- : int = 125
The computational model of functional programming (which is called lambda calculus) is based on simple substitution:
you plug in arguments, then simplify. Repeat until you are done
(you still have to pay attention to the order of evaluation, just like in arithmetic, *
goes before +
and -
,
and so on, according to the operator precedence order).
The example above gets evaluated as follows:
5 + (fun a b c -> a + b + c) 10 20 30 * 2
--> 5 + (10 + 20 + 30) * 2
--> 5 + 60 * 2
--> 5 + 120
--> 125
Since anonymous functions cannot refer to themselves by name, it seems that they should not be able to establish recursive computation on their own. This is not actually the case. It is possible to use so called fixed-point combinators, such as Y combinator, or a self-application U combinator to implement recursion using only anonymous functions. This is not a very practical approach for implementing recursion, but it does have some use cases (1) (2) besides its theoretical value in lambda calculus. A minimal example with Y combinator:
You can try to evaluate this expression on paper by doing substitutions. You will see that it works. |