Important: These notes are not designed to subsitute for class sessions. Importantly, they do not cover all content required for the exams or for implementing HWs. These are just an aid for re-implementing the class compiler, after you've already come to class and completed the in-class activities!


Last time we added input — read-num — and now we’re ready to add output! Here’s what we’re adding today:

We’ll be making one other big change. Up until this point, we’ve always had our compiler and interpreter print the value we get from evaluating the whole program. Now that we have a way for the programmer to control what gets printed and when, we’ll stop printing the final value by default. Just remember that this has changed! It’s easy to forget this and get confused about why you’re not getting output. Remember to add print as needed!

Adding output to the interpreter

As usual, we’ll start with our interpreter, and we’ll take advantage of OCaml’s own features to implement ours. In this case, we use output_string to print both the argument to print and the newline for newline.

Our print and newline expressions have to evaluate to something, but what? Here we’ve chosen Boolean true.

let output_channel = ref stdout

let rec interp_exp env (exp : s_exp) : value =
  match exp with
  | Lst (Sym "do" :: exps) when List.length exps > 0 ->
      exps |> List.rev_map (interp_exp env) |> List.hd
  | Lst [Sym "print"; e] ->
      interp_exp env e |> string_of_value |> output_string !output_channel ;
      Boolean true
  | Lst [Sym "newline"] ->
      output_string !output_channel "\n" ;
      Boolean true

Notice that we use rev_map to implement do. This works because rev_map evalutes expressions in the input list from left to right. Now that we have side effects, it’s important to know the order in which we’re making our recursive calls to interp_exp!

One other note—we’ve used an output_channel ref rather than directly using stdout because it’s going to make our differential testing infrastructure easier in a moment.

Adding output to the compiler

Next let’s start adjusting our compiler. First things first—let’s get rid of that old print_value call that used to wrap our call to entry!

We’ll also want a C function for print_newline, just like we added last time for read-num. Fortunately we don’t have to add any fresh C code for print. After all, we already have print_value for printing out values!

int main(int argc, char **argv) {
  void *heap = (void *)malloc(4096);
  return 0;

void print_newline() {

First let’s tell the assembler that it can count on us to provide print_value and print_newline via our Extern directives.

let compile (program : s_exp) : string =
  [ Global "entry"
  ; Extern "error"
  ; Extern "read_num"
  ; Extern "print_value"
  ; Extern "print_newline"
  ; Label "entry" ]
  @ compile_exp Symtab.empty (-8) program
  @ [Ret]
  |> string_of_directive
  |> String.concat "\n"

And now we can go ahead and start using print_value and print_newline. This looks a lot like when we called our runtime’s read_num function. There are a couple interesting differences though. First, our read-num expression could rely on our runtime’s read_num function to put the appropriate value in rax. This time we’ll have to handle that ourselves. We’ll just put our runtime representation of true into rax.

We have one more obstacle to handle. Unlike the other C functions we’ve called from our assembly code, print_value takes an argument! Fortunately we already know where the C calling convention expects to find a function’s first argument. We took advantage of this information to get access to our heap address. Recall that we find our heap address in rdi because we passed a pointer to the heap as the first argument into the call to entry. So we know C functions are going to look for the first argument in rdi. For our print, we’ll first save our current heap address on the stack, then copy the contents of rax into rdi, so that print_value can find it.

let rec compile_exp (tab : int symtab) (stack_index : int) (exp : s_exp) :
    directive list =
  match exp with
  | Lst (Sym "do" :: exps) when List.length exps > 0 ->
      List.concat_map (compile_exp tab stack_index) exps
  | Lst [Sym "print"; e] ->
      compile_exp tab stack_index e
      @ [ Mov (stack_address stack_index, Reg Rdi)
        ; Mov (Reg Rdi, Reg Rax)
        ; Add (Reg Rsp, Imm (align_stack_index stack_index))
        ; Call "print_value"
        ; Sub (Reg Rsp, Imm (align_stack_index stack_index))
        ; Mov (Reg Rdi, stack_address stack_index)
        ; Mov (Reg Rax, operand_of_bool true) ]
  | Lst [Sym "newline"] ->
      [ Mov (stack_address stack_index, Reg Rdi)
      ; Add (Reg Rsp, Imm (align_stack_index stack_index))
      ; Call "print_newline"
      ; Sub (Reg Rsp, Imm (align_stack_index stack_index))
      ; Mov (Reg Rdi, stack_address stack_index)
      ; Mov (Reg Rax, operand_of_bool true) ]

Updating the difftest infrastructure

Now we’re ready to update our differential testing infrastrcuture. We’ll make two big changes here. We’ll start comparing all input that the compiler writes to the output channel with all input that the interpreter writes to the output channel. We’ll also make it easy to test programs with pre-supplied input data.

Don’t bother diving into interp_io. It’s just making it easy for us to pipe input into our programs and grab output out of them. What’s interesting for us is that interp_err now accepts a new argument, input, which is the “user input” we’ll feed to the program.

let interp (program : string) : unit =
  interp_exp Symtab.empty (parse program) |> ignore

let interp_io (program : string) (input : string) =
  let input_pipe_ex, input_pipe_en = Unix.pipe () in
  let output_pipe_ex, output_pipe_en = Unix.pipe () in
  input_channel := Unix.in_channel_of_descr input_pipe_ex ;
  set_binary_mode_in !input_channel false ;
  output_channel := Unix.out_channel_of_descr output_pipe_en ;
  set_binary_mode_out !output_channel false ;
  let write_input_channel = Unix.out_channel_of_descr input_pipe_en in
  set_binary_mode_out write_input_channel false ;
  let read_output_channel = Unix.in_channel_of_descr output_pipe_ex in
  set_binary_mode_in read_output_channel false ;
  output_string write_input_channel input ;
  close_out write_input_channel ;
  interp program ;
  close_out !output_channel ;
  let r = input_all read_output_channel in
  input_channel := stdin ;
  output_channel := stdout ;

let interp_err (program : string) (input : string) : string =
  try interp_io program input with BadExpression _ -> "ERROR"

Next let’s tackle the compiler. We’ll make the same changes, adding compile_and_run_io, which handles submitting input and accepting output. And compile_and_run_err now accepts the new input parameter.

let compile_and_run_io (program : string) (input : string) : string =
  compile_to_file program ;
  ignore (Unix.system "nasm program.s -f macho64 -o program.o") ;
  ignore (Unix.system "gcc program.o runtime.c -o program") ;
  let inp, outp = Unix.open_process "./program" in
  output_string outp input ;
  close_out outp ;
  let r = input_all inp in
  close_in inp ; r

let compile_and_run_err (program : string) (input : string) : string =
  try compile_and_run_io program input with BadExpression _ -> "ERROR"

One last change. Previously, difftest took a list of strings as input, with each string representing a program. Now we’ll want it to take a list of (string * string) pairs, where the first string represents the program and the second string represents the input we’ll feed to the program. And we’ll update our calls to compile_and_run_err and interp_err to pass in those inputs. And of course, we’ll replace our old list of string examples with (string * string) examples. Here, our one example is the program (print (read-num)), and we feed it input 1, so the test program will print 1.

let difftest (examples : (string * string) list) =
  let results =
      (fun (ex, i) -> (compile_and_run_err ex i, Interp.interp_err ex i))
  List.for_all (fun (r1, r2) -> r1 = r2) results

let test () = difftest [("(print (read-num))", "1")]