Variant types
Many programming tasks deal with heterogeneous data
-
Directions: Left, Right, Up, Down.
-
Numbers that can be either an Int or a Float.
-
Geometric shapes: Square, Circle, Triangle, etc. Each such shape can have different properties (radius for the circle, but width and height for a rectangle).
-
Colors in various color models (RGB, CMYK, etc.)
-
UI events or possible actions available to a user.
-
Various map landmarks.
-
Types of food.
-
etc.
From a programming language, we need:
-
Convenient means for declaring such heterogeneous data types.
-
The ability to define operations on these data types.
-
Type-checking by the compiler.
In C, one can use A type for storing four directions:
A type for storing either
See this link on how union and struct types can be used to implement a variant type in C.
|
In object-oriented programming, one can use classes for similar purposes.
For example, a parent class |
Introducing variant types
In functional programming, the situations described above can be modeled with variant types.
A type representing four directions and a simple example of its usage:
type dir = Left | Right | Up | Down
let () =
let dir = Left in
match dir with
| Left -> print_endline "Going left"
| Right -> print_endline "Going right"
| _ -> print_endline "Going up or down"
A function reversing direction:
let reverse d =
match d with
| Left -> Right
| Right -> Left
| Up -> Down
| Down -> Up
A geometric shape type, and a function area
that computes the area of a shape:
type shape =
Square of float (** width *)
| Rectangle of (float*float) (** width and height *)
| Circle of float (** radius *)
let area sh =
match sh with
| Square w -> w *. w
| Rectangle (w,h) -> w *. h
| Circle r ->
let pi = 4.0 *. atan 1.0 in
pi *. r *. r
A variant number type that stores int
or float
values:
type number = Int of int | Float of float
let () =
let v = Int 15 in
match v with
| Int x -> Printf.printf "It is int %i\n" x
| Float x -> Printf.printf "It is float %g\n" x
A function that adds two variant numbers defined above. It automatically promotes int
to float
when necessary:
let sum n1 n2 =
match n1, n2 with
| Int x, Int y -> Int (x+y)
| Int x, Float y -> Float (float x +. y)
| Float x, Int y -> Float (x +. float y)
| Float x, float y -> Float (x +. y)
We are using function |
The option type is also a variant type. Its values can be either None
or Some v
, where v
is any OCaml values. It is defined as follows:
type 'a option = None | Some of 'a
Here, |
A more complex example with geometric shapes
Let us implement a condition that checks whether or not a shape can fit into another shape:
(** outer radius (circumradius) of a shape *)
let outer_radius sh =
match sh with
| Circle r -> r
| Square w -> w /. sqrt(2.0)
| Rectangle (w,h) -> 0.5 *. sqrt (w*.w +. h*.h)
(** inner radius *)
let inner_radius sh =
match sh with
| Circle r -> r
| Square w -> 0.5 *. w
| Rectangle (w,h) -> 0.5 *. (min w h)
(** [fits x y] if shape [x] can fit inside [y] *)
let rec fits x y =
match x, y with
| Square w1, Square w2 -> w1 <= w2
| Circle r, _ -> r <= inner_radius y
| _, Circle r -> outer_radius x <= r
| Rectangle (w1,h1), Rectangle (w2,h2) ->
w1 <= w2 && h1 <= h2 ||
w1 <= h2 && h1 <= w2
(* not counting possible diagonal placement *)
| Rectangle _, Square w -> fits x (Rectangle (w, w))
| Square w, Rectangle _ -> fits (Rectangle (w, w)) y
Note that rectangles can also fit diagonally, with the condition described here, however we only consider orthogonal placement of rectangles in our code.
Example: A calculator language (without input parsing)
type expr =
Add of expr * expr
| Sub of expr * expr
| Mul of expr * expr
| Div of expr * expr
| V of int
let rec eval e =
match e with
| V x -> x
| Add (e1, e2) -> eval e1 + eval e2
| Sub (e1, e2) -> eval e1 - eval e2
| Mul (e1, e2) -> eval e1 * eval e2
| Div (e1, e2) -> eval e1 / eval e2
let () =
let e = Mul (V 5, Sub (V 10, V 4)) in (* 5*(10-4) *)
Printf.printf "%i \n" (eval e) (* would print 30 *)
This type declarations is recursive: type |
Pretty printer for calculator expressions (the function should return a fully-parenthesized string corresponding to the expression).
string_of_expr (V 100) --> "100"
string_of_expr (Add (V 7, V 8))) --> "(7 + 8)"
string_of_expr (Mul (V 5, Sub (V 10, V 4))) --> "(5 * (10 - 4))"
🐤 click to open
let rec string_of_expr e =
match e with
| V x -> string_of_int x
| Add (x, y) -> "(" ^ string_of_expr x ^ " + " ^ string_of_expr y ^ ")"
| Sub (x, y) -> "(" ^ string_of_expr x ^ " - " ^ string_of_expr y ^ ")"
| Mul (x, y) -> "(" ^ string_of_expr x ^ " * " ^ string_of_expr y ^ ")"
| Div (x, y) -> "(" ^ string_of_expr x ^ " / " ^ string_of_expr y ^ ")"
Functional languages with algebraic datatypes (tuples and variants) are particularly well-suited for implementing other programming languages. See more on this topic, for example, here:
|
Reading a file
The following program reads the file file.txt
from the current folder and prints it out on the screen:
(** [get_line ic] reads a line from the input channel [ic] *)
let get_line ic =
try
Some (input_line ic)
with
| End_of_file -> None
let () =
(* open input channel *)
let ic = open_in "file.txt" in
(* read and print the file content line by line *)
let rec read () =
match get_line ic with
| Some line ->
Printf.printf "%s\n" line;
read ()
| None -> ()
in
read ();
(* close input channel *)
close_in ic
MiniMake
We are going to implement a small utility MiniMake that works similarly to Make and can be used for building programs with separate compilation.
Useful references:
MiniMakefile format
MiniMake will be reading the rules from the file called MiniMakefile
. This file looks a lot like a normal Makefile
(in fact, the most basic Makefiles will be compliant MiniMakefiles):
program : utils.o main.o
echo "#Linking#"
g++ -o program utils.o main.o
utils.o : utils.cpp utils.h # Comment
echo "#Compiling utils.o#"
g++ -c utils.cpp
main.o : main.cpp utils.h
echo "#Compiling main.o#"
g++ -c main.cpp
# Remove .o files and executable
clean :
rm *.o program
MiniMakefile |
is a sequence of rules |
rule |
consists one target line and 0 or more recipe lines |
target line |
consists of one target followed by a colon |
target |
string that does not contain whitespace or colon |
prerequisite |
string that does not contain whitespace |
recipe line |
starts with a whitespace ( |
comment |
starts with |
In the MiniMakefile
example above, the echo
commands are unnecessary for building the program, instead,
they demonstrate that a recipe can take several lines.
They also show that '#'
can occur in a recipe line and should not be discarded as a comment.
If we type minimake
, it will execute the commands for the first target from the MiniMakefile
(in the above example, the first target is called program
), building all the prerequisites along the way,
just like make
would do:
$ ./minimake echo "#Compiling utils.o#" #Compiling utils.o# g++ -c utils.cpp echo "#Compiling main.o#" #Compiling main.o# g++ -c main.cpp echo "#Linking#" #Linking# g++ -o program utils.o main.o
Parsing a MiniMakefile
When parsing a MiniMakefile, our first step is to scan every line and determine what type this line is:
type line =
Target of (string * (string list))
| Recipe of string
| Empty
| Error (* non-compliant, cannot be parsed *)
Target line (contains a target and a list of prerequisites), Recipe line (contains a shell command), Empty line, and Error line. We need the Error case for non-compliant lines that cannot be parsed.
Classifying a line:
-
If the line starts with a whitespace (
' '
or'\t'
) → recipe line -
Otherwise: Remove comments (
'#'
), and then:-
if
target : prerequisites
→ target line -
else if empty → empty line
-
else → error line
-
classify " g++ -o program utils.o main.o"
--> Recipe "g++ -o program utils.o main.o"
classify "program : utils.o main.o"
--> Target ("program", ["utils.o"; "main.o"])
classify "clean: #Some comment"
--> Target ("clean", [])
classify "clean"
--> Error
Let us implement this classify
function.
We are going to use regular expressions (module Str) for that.
let remove_comment line =
try
let i = String.index line '#' in
String.sub line 0 i
with
| Not_found -> line
let rx_recipe = Str.regexp "^[ \t]+\\(.*\\)$"
let rx_target = Str.regexp "^\\([^: \t]+\\)[ \t]*:\\(.*\\)$"
let rx_space = Str.regexp "[ \t]+"
let classify line =
if Str.string_match rx_recipe line 0 then
let command = Str.matched_group 1 line in
Recipe command
else begin
let line = String.trim (remove_comment line) in
if String.length line = 0 then
Empty
else if Str.string_match rx_target line 0 then
let tgt = Str.matched_group 1 line in
let prereqs = Str.matched_group 2 line in
let prereq_ls = Str.split rx_space prereqs in
Target (tgt, prereq_ls)
else
Error
end
Explanation of Str regular expressions:
-
[ \t]+
matches 1 or more whitespaces (' '
or'\t'
) -
.*
matches 0 or more characters of any kind. Used to match recipe commands and lists of prerequisites. -
\\(
and\\)
are brackets marking a matched groups we want to extract -
[^: \t]+
matches 1 or more character other than':'
,' '
, or'\t'
. Used to match targets -
^
at the front and$
at the end require matching from the beginning to the end of the line, so the entire line should match not just some part of it
The function can be tested in toplevel:
# #load "str.cma" ;; (* need to explicitly load Str library *)
# #use "minimake.ml" ;; (* load the function *)
# classify "program : utils.o main.o" ;;
- : line = Target ("program", ["utils.o"; "main.o"])
# classify " g++ -o program utils.o main.o" ;;
- : line = Recipe "g++ -o program utils.o main.o"
Let us write a simple program now that reads all lines from the file MiniMakefile
:
let get_line ic =
try
Some (input_line ic)
with
| End_of_file -> None
(* returns all lines from the file as a list of strings, arranged in the reverse order *)
let read_all filename =
let ic = open_in filename in
let rec read acc =
match get_line ic with
| Some line -> read (line :: acc)
| None ->
close_in ic; (* close input channel *)
acc (* return accumulator *)
in
read []
let () =
(* read the file and clasify each line *)
let ls = List.map classify (read_all "MiniMakefile") in
(* print how many lines is read *)
Printf.printf "%i lines read" (List.length ls)
If all the above declarations are combined in the file minimake.ml
, we can compile it as follows
(note the explicit mention of the library file str.cmxa
):
ocamlopt -o minimake str.cmxa minimake.ml
Alternatively, compiling with the byte-code compiler ocamlc
(need the byte-code library str.cma
):
ocamlc -o minimake str.cma minimake.ml
Alternatively, compiling with the ocamlfind
tool:
ocamlfind ocamlopt -o minimake -package str -linkpkg minimake.ml