Interacting with the environment

Up until this point, our compiler has had a curious property. Here’s another compiler we could have written for our language:

```let rec compile_value (stack_index : int) (v : Interp.value) =
match v with
| Number n ->
[Mov (Reg Rax, operand_of_num n)]
| Boolean b ->
[Mov (Reg Rax, operand_of_bool b)]
| Pair (v1, v2) ->
compile_value stack_index v1
@ [Mov (stack_address stack_index, Reg Rax)]
@ compile_value (stack_index - 8) v2
@ [ Mov (Reg R8, stack_address stack_index)
; Mov (MemOffset (Reg Rdi, Imm 0), Reg R8)
; Mov (MemOffset (Reg Rdi, Imm 8), Reg Rax)
; Mov (Reg Rax, Reg Rdi)
; Or (Reg Rax, Imm pair_tag)
; Add (Reg Rdi, Imm 16) ]

let compile (program : s_exp) : string =
[Global "entry"; Label "entry"]
@ compile_value (-8) (Interp.interp_exp Symtab.empty program)
@ [Ret]
|> List.map string_of_directive
|> String.concat "\n"
```

(Go ahead and put it into our usual compile.ml file to see that it works.)

That’s quite a bit simpler! It’s even a better compiler–the programs it outputs are short and execute very efficiently. Given that it has been much easier to write our interpreter than to write our compiler, why not just do this? Why does anyone work on compilers?

The answer, of course, is that most programs don’t just compute the answer to a fixed expression! We usually write programs because we want to do the same operation multiple times on different inputs. However, in our current language, there’s no way to get an input from the outside world. Let’s fix that now!

```let input_channel = ref stdin

let rec interp_exp env (exp : s_exp) : value =
match exp with
Number (input_line !input_channel |> int_of_string)
(* ... *)
```

Simple enough! We get slightly weird behavior if we do this, though (at least on my machine):

```> interp "(pair (read-num) (read-num))";;
1
2
(pair 2 1)
```

What’s going on here? As it turns out, the problem isn’t with `read-num`–it’s with `pair`!

```let rec interp_exp env (exp : s_exp) : value =
match exp with
(* ... *)
| Lst [Sym "pair"; e1; e2] ->
Pair (interp_exp env e1, interp_exp env e2)
(* ... *)
```

We’re calling `interp_exp` twice, and each of those ends up reading input. But it seems like the second one is happening first!

The order in which OCaml evaluates arguments to functions (in this case, the function that constructs a pair) is actually unspecified: the implementation can evaluate them in whatever order it likes. This often doesn’t really matter–for pure expressions like most of the ones we’ve been dealing with, it doesn’t matter when they are evaluated! Now, though, we’ve introduced a side-effect: reading input. It really does matter now when those calls to `interp_exp` get evaluated!

Our compiler evaluates arguments to binary operations like `pair` in left to right order. We’ll do the same in the interpreter:

```let rec interp_exp env (exp : s_exp) : value =
match exp with
(* ... *)
| Lst [Sym "pair"; e1; e2] ->
let l = interp_exp env e1 in
let r = interp_exp env e2 in
Pair (l, r)
(* ... *)

let interp (program : string) : string =
interp_exp Symtab.empty (parse program)

```

First, we’ll add a function to the C runtime to read a number:

```uint64_t read_num() {
int r;
scanf("%d", &r);
return (uint64_t)(r) << num_shift;
}
```

We’ll need to let the assembler know that the `read_num` label is defined in the runtime:

```let compile (program : s_exp) : string =
[ Global "entry"
; Extern "error"
; Label "entry" ]
@ compile_exp Symtab.empty (-8) program
@ [Ret]
|> List.map string_of_directive
|> String.concat "\n"

```

Now we need to actually compile the `(read-num)` form into a call to this function. In our error-handling code, we were able to “call” C functions just by jumping to the right label. Fundamentally that’s still what’s going to happen, but we’re going to need to do some additional work to make sure our program can keep executing after the function call:

• `read_num` might use the `rdi` register where our heap pointer lives. We’ll need to save the heap pointer to the stack before `read_num` starts executing.
• Speaking of the stack, we’ve been writing to stack locations like `[rsp - 8]`, `[rsp - 16]`, etc. `read_num` might also want to write to `[rsp - 8]`, which will be a problem since we might have data there! The solution is to adjust `rsp` and restore it after the call.
• We haven’t modified memory at `[rsp + 0]` because that’s where our function’s return address is–when we call `ret`, the processor resumes operations at that address. Instead of calling `read_num` using `jmp`, we’ll use `call`, which will put the new return address on the top of the stack.
• Finally: C functions on x86-64 require that when they start executing, `rsp` (the location of the return address) is a multiple of 16. So, we’ll need to further adjust `rsp` to make sure this property holds.

Here’s the code we end up with:

```let align_stack_index (stack_index : int) : int =
if stack_index mod 16 = -8 then stack_index else stack_index - 8

let rec compile_exp (tab : int symtab) (stack_index : int) (exp : s_exp) :
directive list =
match exp with