Output
Last time we added input — read-num
— and now
we’re ready to add output! Here’s what we’re adding
today:
-
(print e)
prints the value of expressione
. (newline)
prints a newline.-
(do e1 ... en)
evaluates expressionse1
throughen
and evaluates to the value of the last expression,en
. Now that we haveprint
andnewline
, which are useful mostly for their side effects, we might want to do some sequencing! We’ll usedo
for that.
We’ll be making one other big change. Up until this point,
we’ve always had our compiler and interpreter print the value we
get from evaluating the whole program. Now that we have a way for the
programmer to control what gets printed and when, we’ll stop
printing the final value by default. Just remember that this has
changed! It’s easy to forget this and get confused about why
you’re not getting output. Remember to add print
as
needed!
Adding output to the interpreter
As usual, we’ll start with our interpreter, and we’ll
take advantage of OCaml’s own features to implement ours. In
this case, we use output_string
to print both the
argument to print
and the newline for
newline
.
Our print
and newline
expressions have to
evaluate to something, but what? Here we’ve chosen
Boolean true
.
let output_channel = ref stdout let rec interp_exp env (exp : s_exp) : value = match exp with | Lst (Sym "do" :: exps) when List.length exps > 0 -> exps |> List.rev_map (interp_exp env) |> List.hd | Lst [Sym "print"; e] -> interp_exp env e |> string_of_value |> output_string !output_channel ; Boolean true | Lst [Sym "newline"] -> output_string !output_channel "\n" ; Boolean true
Notice that we use rev_map
to implement
do
. This works because rev_map
evalutes
expressions in the input list from left to right. Now that we have
side effects, it’s important to know the order in which
we’re making our recursive calls to interp_exp
!
One other note—we’ve used an
output_channel
ref rather than directly using
stdout
because it’s going to make our
differential testing infrastructure easier in a moment.
Adding output to the compiler
Next let’s start adjusting our compiler. First things
first—let’s get rid of that old
print_value
call that used to wrap our call to
entry
!
We’ll also want a C function for print_newline
,
just like we added last time for read-num
. Fortunately
we don’t have to add any fresh C code for print
.
After all, we already have print_value
for printing out
values!
int main(int argc, char **argv) { void *heap = (void *)malloc(4096); entry(heap); return 0; } void print_newline() { printf("\n"); }
First let’s tell the assembler that it can count on us to
provide print_value
and print_newline
via
our Extern
directives.
let compile (program : s_exp) : string = [ Global "entry" ; Extern "error" ; Extern "read_num" ; Extern "print_value" ; Extern "print_newline" ; Label "entry" ] @ compile_exp Symtab.empty (-8) program @ [Ret] |> List.map string_of_directive |> String.concat "\n"
And now we can go ahead and start using print_value
and
print_newline
. This looks a lot like when we called our
runtime’s read_num
function. There are a couple
interesting differences though. First, our
read-num
expression could rely on our runtime’s
read_num
function to put the appropriate value in
rax
. This time we’ll have to handle that
ourselves. We’ll just put our runtime representation of
true
into rax
.
We have one more obstacle to handle. Unlike the other C functions
we’ve called from our assembly code,
print_value
takes an argument! Fortunately we already
know where the C calling convention expects to find a
function’s first argument. We took advantage of this
information to get access to our heap address. Recall that we find
our heap address in rdi
because we passed a pointer to
the heap as the first argument into the call to entry
.
So we know C functions are going to look for the first argument in
rdi
. For our print
, we’ll first save
our current heap address on the stack, then copy the contents of
rax
into rdi
, so that
print_value
can find it.
let rec compile_exp (tab : int symtab) (stack_index : int) (exp : s_exp) : directive list = match exp with | Lst (Sym "do" :: exps) when List.length exps > 0 -> List.concat_map (compile_exp tab stack_index) exps | Lst [Sym "print"; e] -> compile_exp tab stack_index e @ [ Mov (stack_address stack_index, Reg Rdi) ; Mov (Reg Rdi, Reg Rax) ; Add (Reg Rsp, Imm (align_stack_index stack_index)) ; Call "print_value" ; Sub (Reg Rsp, Imm (align_stack_index stack_index)) ; Mov (Reg Rdi, stack_address stack_index) ; Mov (Reg Rax, operand_of_bool true) ] | Lst [Sym "newline"] -> [ Mov (stack_address stack_index, Reg Rdi) ; Add (Reg Rsp, Imm (align_stack_index stack_index)) ; Call "print_newline" ; Sub (Reg Rsp, Imm (align_stack_index stack_index)) ; Mov (Reg Rdi, stack_address stack_index) ; Mov (Reg Rax, operand_of_bool true) ]
Updating the difftest
infrastructure
Now we’re ready to update our differential testing infrastrcuture. We’ll make two big changes here. We’ll start comparing all input that the compiler writes to the output channel with all input that the interpreter writes to the output channel. We’ll also make it easy to test programs with pre-supplied input data.
Don’t bother diving into interp_io
. It’s
just making it easy for us to pipe input into our programs and grab
output out of them. What’s interesting for us is that
interp_err
now accepts a new argument,
input
, which is the “user input”
we’ll feed to the program.
let interp (program : string) : unit = interp_exp Symtab.empty (parse program) |> ignore let interp_io (program : string) (input : string) = let input_pipe_ex, input_pipe_en = Unix.pipe () in let output_pipe_ex, output_pipe_en = Unix.pipe () in input_channel := Unix.in_channel_of_descr input_pipe_ex ; set_binary_mode_in !input_channel false ; output_channel := Unix.out_channel_of_descr output_pipe_en ; set_binary_mode_out !output_channel false ; let write_input_channel = Unix.out_channel_of_descr input_pipe_en in set_binary_mode_out write_input_channel false ; let read_output_channel = Unix.in_channel_of_descr output_pipe_ex in set_binary_mode_in read_output_channel false ; output_string write_input_channel input ; close_out write_input_channel ; interp program ; close_out !output_channel ; let r = input_all read_output_channel in input_channel := stdin ; output_channel := stdout ; r let interp_err (program : string) (input : string) : string = try interp_io program input with BadExpression _ -> "ERROR"
Next let’s tackle the compiler. We’ll make the same
changes, adding compile_and_run_io
, which handles
submitting input and accepting output. And
compile_and_run_err
now accepts the new
input
parameter.
let compile_and_run_io (program : string) (input : string) : string = compile_to_file program ; ignore (Unix.system "nasm program.s -f macho64 -o program.o") ; ignore (Unix.system "gcc program.o runtime.c -o program") ; let inp, outp = Unix.open_process "./program" in output_string outp input ; close_out outp ; let r = input_all inp in close_in inp ; r let compile_and_run_err (program : string) (input : string) : string = try compile_and_run_io program input with BadExpression _ -> "ERROR"
One last change. Previously, difftest
took a list of
strings as input, with each string representing a program. Now
we’ll want it to take a list of
(string * string)
pairs, where the first string
represents the program and the second string represents the input
we’ll feed to the program. And we’ll update our calls to
compile_and_run_err
and interp_err
to pass
in those inputs. And of course, we’ll replace our old list of
string
examples with
(string * string)
examples. Here, our one example is
the program (print (read-num))
, and we feed it input
1
, so the test program will print 1
.
let difftest (examples : (string * string) list) = let results = List.map (fun (ex, i) -> (compile_and_run_err ex i, Interp.interp_err ex i)) examples in List.for_all (fun (r1, r2) -> r1 = r2) results let test () = difftest [("(print (read-num))", "1")]