Correctness and testing

If we take the long view, programmers have–to put it gently–a pretty mixed track record vis-a-vis bugs; writing bug-free software is really hard! At the same time, it’s quite important to ensure that compilers are correct. If your compiler has a bug, it could impact anyone writing programs in your language and is likely to be very hard for end-users to track down. So: what can we do to make sure that our software is correct?

We have a number of options, all of which have been applied (to varying degrees of success) to compilers:

For this class, our approach is going to be as follows. As we grow our language, we’re going to add features both to our compiler and to a definitional interpreter. This is an interpreter for our language where we don’t care about performance at all; all we care about is that it’s correct. Indeed, this interpreter will serve as the definition of our language.

Here’s such an interpreter for the language we have so far:

let rec interp_exp (exp : s_exp) : int =
  match exp with
  | Num n -> n
  | Lst [Sym "add1"; arg] ->
    (interp_exp arg) + 1
  | Lst [Sym "sub1"; arg] ->
      (interp_exp arg) - 1
  | _ -> raise (BadExpression exp)

let interp (program : string) = string_of_int (interp_exp (parse program))

Notice that our interpreter has a pretty similar structure to our compiler. We’ll see this throughout the semester.

Now that we have an interpreter, we can test our compiler against it:

let difftest (examples : string list) =
  let results = List.map (fun ex -> (compile_and_run ex, interp ex)) examples in
  List.for_all (fun (r1, r2) -> r1 = r2) results

let test () =
  difftest ["43"; "(add1 (add1 3))"; "(sub1 4)"]