Handling errors
Right now, here’s how we’re defining correctness for our compiler:
For all programs p, if the interpreter produces a value when run on p,
the compiler produces machine code that produces that same value.
But the interpreter doesn’t produce a value for every program! On
(add1 false), for instance, the interpreter throws an exception.
For these programs, we’re currently making no claims about our
compiler’s behavior. Maybe it will return an error of some kind: on
(add1 false), for instance, we get an error from the runtime because it
doesn’t know how to print the value. On totally invalid programs like
(hello hello), our compiler will raise the same error as our
interpreter, since we don’t know how to compile programs like that.
But on some of these programs, our compiler will actually produce a
value (or really, produce a machine-code program that produces a value).
(add1 (sub1 false)), for instance, produces false in the compiler even
though the interpreter doesn’t recognize it as a valid program.
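To see why this happens, here is a small sketch of the arithmetic on tagged values. The encoding below (numbers shifted left by 2, booleans shifted left by 7 with tag 0b0011111) and all of the helper names are assumptions made for illustration. The argument only relies on add1 and sub1 adding and subtracting the same constant without looking at the tag.

```ocaml
(* Assumed encoding, for illustration only. *)
let num_shift = 2
let bool_shift = 7
let bool_tag = 0b0011111

let encode_num (n : int) : int = n lsl num_shift

let encode_bool (b : bool) : int =
  ((if b then 1 else 0) lsl bool_shift) lor bool_tag

(* What add1 and sub1 do to the raw bits in rax when nothing checks the tag. *)
let add1_unchecked (v : int) : int = v + encode_num 1
let sub1_unchecked (v : int) : int = v - encode_num 1

let () =
  let f = encode_bool false in
  (* The intermediate bits are not a valid value, but subtracting and then
     adding the same constant lands back on the encoding of false. *)
  assert (add1_unchecked (sub1_unchecked f) = f);
  (* A single add1 does not: the result is no longer the encoding of false. *)
  assert (add1_unchecked f <> f)
```

The compiled program never notices that the intermediate value is meaningless, which is why it ends up printing false.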
Today, we’ll fix this issue, modifying our compiler to handle these errors.
Modifying the runtime
First, we’ll add an error-handling function to the runtime. We’ll call this function from our compiled programs when an error occurs.
void error() {
  printf("ERROR");
  exit(1);
}
As usual, we’ll need to recompile the runtime:
gcc -c runtime.c -o runtime.o
Modifying the compiler
First, we’ll need to modify our compiler’s output so that we can call
our new error function:
let compile (program : s_exp) : string =
  [Global "entry"; Extern "error"; Label "entry"]
  @ compile_exp Symtab.empty (-8) program
  @ [Ret]
  |> List.map string_of_directive
  |> String.concat "\n"
That Extern "error" directive is sort of the inverse of Global: it
tells the assembler that our program will be linked against a program
that includes a definition for the error label.
We’ll jump to this label whenever we want to signal an error at
runtime. For instance, add1 should raise an error if its argument
isn’t a number:
let rec compile_exp (tab : int symtab) (stack_index : int) (exp : s_exp)
    : directive list =
  match exp with
  (* some cases elided ... *)
  | Lst [Sym "add1"; arg] ->
      compile_exp tab stack_index arg
      @ [ Mov (Reg R8, Reg Rax)
        ; And (Reg R8, Imm num_mask)
        ; Cmp (Reg R8, Imm num_tag)
        ; Jnz "error" ]
      @ [Add (Reg Rax, operand_of_num 1)]
We raise an error by jumping to our error function. In general,
calling C functions will be more complex than this, since we want to
preserve our heap pointer and the values on our stack. But since the
error function stops execution, we don’t need to worry about any of
that.
We can extract these directives into a helper function:
let ensure_num (op : operand) : directive list =
  [ Mov (Reg R8, op)
  ; And (Reg R8, Imm num_mask)
  ; Cmp (Reg R8, Imm num_tag)
  ; Jnz "error" ]
(We should only call ensure_num when we’re not using the value in r8,
since the check overwrites that register!)
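It can help to model what those four instructions decide, at the level of values rather than registers. The sketch below assumes num_mask is 0b11 and num_tag is 0b00 (the two low bits of a number are zero); these constants and the name passes_num_check are assumptions for illustration, so substitute whatever your compiler actually defines.

```ocaml
(* Assumed constants: the low two bits hold the tag, and numbers use 0b00. *)
let num_mask = 0b11
let num_tag = 0b00

(* Mirrors the check: copy the value, mask off everything but the tag bits,
   compare against the number tag. A mismatch corresponds to taking the
   jnz branch to the error label. *)
let passes_num_check (v : int) : bool = v land num_mask = num_tag

let () =
  (* 5 encoded as a number (shifted left by 2) passes the check. *)
  assert (passes_num_check (5 lsl 2));
  (* 0b0011111, a boolean under the assumed encoding, does not. *)
  assert (not (passes_num_check 0b0011111))
```

Because the real check works on a copy in r8, the original value in rax is left untouched for the arithmetic that follows.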
We can use this to add error handling to functions that should take numbers:
let rec compile_exp (tab : int symtab) (stack_index : int) (exp : s_exp)
    : directive list =
  match exp with
  (* some cases elided ... *)
  | Lst [Sym "add1"; arg] ->
      compile_exp tab stack_index arg
      @ ensure_num (Reg Rax)
      @ [Add (Reg Rax, operand_of_num 1)]
  | Lst [Sym "+"; e1; e2] ->
      compile_exp tab stack_index e1
      @ ensure_num (Reg Rax)
      @ [Mov (stack_address stack_index, Reg Rax)]
      @ compile_exp tab (stack_index - 8) e2
      @ ensure_num (Reg Rax)
      @ [Mov (Reg R8, stack_address stack_index)]
      @ [Add (Reg Rax, Reg R8)]
and so on. We can write a similar function for pairs:
let ensure_pair (op : operand) : directive list =
  [ Mov (Reg R8, op)
  ; And (Reg R8, Imm heap_mask)
  ; Cmp (Reg R8, Imm pair_tag)
  ; Jnz "error" ]
Compiler correctness revisited
We can now make a stronger statement about compiler correctness:
For all programs p, if the interpreter produces a value when run on p,
the compiler produces machine code that produces that same value. If
the interpreter produces an error, the compiler will either produce an
error or produce a program that produces an error.
We can add support for erroring programs to our tester:
let interp_err (program : string) : string =
  try interp program with BadExpression _ -> "ERROR"

let compile_and_run_err (program : string) : string =
  try compile_and_run program with BadExpression _ -> "ERROR"

let difftest (examples : string list) =
  let results =
    List.map
      (fun ex -> (compile_and_run_err ex, Interp.interp_err ex))
      examples
  in
  List.for_all (fun (r1, r2) -> r1 = r2) results
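To see how the two kinds of compiler-side error line up with the interpreter’s exception, here is a self-contained sketch with toy stand-ins. The stub bodies of interp and compile_and_run, and the choice of a string payload for BadExpression, are assumptions made so the example runs on its own; the real functions do actual work.

```ocaml
(* Assumed payload type; the real exception may carry something else. *)
exception BadExpression of string

(* Hypothetical stand-in for the interpreter. *)
let interp (program : string) : string =
  match program with
  | "(add1 1)" -> "2"
  | _ -> raise (BadExpression program)

(* Hypothetical stand-in for compiling, linking, and running a program. *)
let compile_and_run (program : string) : string =
  match program with
  | "(add1 1)" -> "2"
  (* Compiles fine; the runtime's error function prints this when it runs. *)
  | "(add1 false)" -> "ERROR"
  (* Rejected at compile time. *)
  | _ -> raise (BadExpression program)

let interp_err (program : string) : string =
  try interp program with BadExpression _ -> "ERROR"

let compile_and_run_err (program : string) : string =
  try compile_and_run program with BadExpression _ -> "ERROR"

let difftest (examples : string list) : bool =
  List.for_all (fun ex -> compile_and_run_err ex = interp_err ex) examples

let () =
  assert (difftest ["(add1 1)"; "(add1 false)"; "(hello hello)"])
```

Both a compile-time exception and a runtime "ERROR" printout collapse to the same string, so the tester treats them as agreeing with an interpreter error.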
We have one lingering problem: there are some programs that produce
an error in our compiler but not in our interpreter. An example is
(if true 1 (hello hello)). Since the interpreter never evaluates
(hello hello), it happily produces the value 1. The compiler, however,
will throw an error at compile-time. We could fix this by adding a
check to the interpreter to ensure that the programs it’s trying to
interpret are well-formed (i.e., don’t contain expressions like
(hello hello)) even if they aren’t type-correct.
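One possible shape for such a check is sketched below. The s_exp type is redefined locally so the example stands alone, and the set of accepted forms (numbers, booleans, add1, sub1, +, if) is only an assumed subset of the language; well_formed is a hypothetical name, and a real version would cover every form the compiler accepts.

```ocaml
(* Local copy of the s-expression type, so this sketch is self-contained. *)
type s_exp = Num of int | Sym of string | Lst of s_exp list

(* Checks the shape of every subexpression, including branches that would
   never be evaluated. It does not check types. *)
let rec well_formed (e : s_exp) : bool =
  match e with
  | Num _ -> true
  | Sym ("true" | "false") -> true
  | Lst [Sym ("add1" | "sub1"); arg] -> well_formed arg
  | Lst [Sym "+"; e1; e2] -> well_formed e1 && well_formed e2
  | Lst [Sym "if"; c; t; f] -> well_formed c && well_formed t && well_formed f
  | _ -> false

let () =
  (* (add1 false) is well-formed even though it is not type-correct. *)
  assert (well_formed (Lst [Sym "add1"; Sym "false"]));
  (* (if true 1 (hello hello)) is rejected because of the unused branch. *)
  assert (
    not
      (well_formed
         (Lst [Sym "if"; Sym "true"; Num 1; Lst [Sym "hello"; Sym "hello"]])))
```

An interpreter that ran this check first and raised its usual exception on failure would then agree with the compiler on programs like (if true 1 (hello hello)).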