Unary operations

Today we’ll work on extending the language supported by our compiler with a couple of unary operations. In particular:

We’ll also learn a bit more about the semantics of x86-64 assembly.

Adding operations

Here’s the compiler we ended up with after last class:

let compile (program : s_exp) : string =
  match program with
  | Num n ->
      String.concat "\n"
        [
          "global _entry"; "_entry:"; Printf.sprintf "\tmov rax, %d" n; "\tret";
        ]
  | e -> raise (BadExpression e)

First, let’s go over exactly what it is that the code this compiler produces does right now. See lecture capture for details.

Before we extend the compiler to support new operations, we’re going to make a couple of changes. First: right now, the compiler will only work on MacOS, not Linux; on Linux, labels shouldn’t have underscores, whereas on MacOS they must. Relatedly: right now, it’s sort of hard to read! The mov and ret instructions both have tab characters, there’s that sprintf call, and so on.

Just as we introduced a more usable representation for s-expressions, we’ll use an OCaml type for assembly language as well. See the lecture capture for the details of how our Asm library works.

Using the Asm library, we can rewrite our compiler:

open Asm

(* some definitions elided... *)

let compile (exp: s_exp) : string =
  match exp with
  | Num n ->
    [Global "entry"; Label "entry"; Mov (Reg Rax, Imm n); Ret] 
    |> List.map string_of_directive |> String.concat "\n"
  | _ -> raise (BadExpression exp)

This code uses the “pipeline” operator |>. This operator gives us a cleaner way of calling functions in OCaml; x |> f is exactly the same as f x. It lets us write expressions from “inner” to “outer” rather than the other way around; the pipeline expression in the code above could be replaced with

String.concat "\n" (List.map string_of_directive [Global "entry"; Label "entry"; Mov (Reg Rax, Imm n); Ret])

Notice the other thing we’re doing here: partial application. We can provide some of the arguments to an OCaml function and get back a function we can apply to the rest of the arguments. For instance, since

List.map : ('a -> 'b) -> 'a list -> 'b list

we know that

List.map string_of_directive : directive list -> string list

OK: now we’re ready to add support for add1 and sub1. These are both unary operations: they take one argument. Each of them takes in an integer and returns an integer. (Integers are still the only type our language supports.) We’re starting with unary operations because we’ll be able to implement them using only a single register (namely our friend rax).

Let’s start with the add1 operation. What code should our compiler produce for, say, (add1 50)?. How about something like this:

global entry
entry:
        mov rax, 50
        add rax, 1
        ret

So, let’s write some code. We know we’re going to need to match against these new operations.

let compile (exp: s_exp) : string =
  match exp with
  | Num n ->
    [Global "entry"; Label "entry"; Mov (Reg Rax, Imm n); Ret] 
    |> List.map string_of_directive |> String.concat "\n"
  | Lst [Sym "add1"; arg] ->
    (* what should we put here? *) 
  | _ -> raise (BadExpression exp)

Once we’ve done that, we get a bit stuck. We’re going to need to include the “global entry”, “entry:”, and “ret” directives again, which seems repetitive. We’re also going to need to do something with the argument to our operation.

Let’s reorganize our compiler one more time:

let compile_exp (exp: s_exp) : directive list =
  match exp with
  | Num n ->
    [Mov (Reg Rax, Imm n)]
  | Lst [Sym "add1"; arg] ->
    (* what should we put here? *) 
  | _ -> raise (BadExpression exp)

let compile (program : s_exp) : string =
  [Global "entry"; Label "entry"] @
  compile_exp program @
  [Ret] |> List.map string_of_directive |> String.concat "\n"

We now have a helper function compile_exp in which we don’t need to worry about “boilerplate” like our label and our return instruction. Now, we can complete our support for add1:

let rec compile_exp (exp: s_exp) : directive list =
  match exp with
  | Num n ->
    [Mov (Reg Rax, Imm n)]
  | Lst [Sym "add1"; arg] ->
    compile_exp arg @
    [Add (Reg Rax, Imm 1)]
  | _ -> raise (BadExpression exp)

In the add1 case, we first recursively compile the argument to the operation. We don’t know what it is–it could be a number, or another unary operation. Regardless, we know what our compiler is going to do: it’s going to produce a sequence of instructions that will put the value of that expression in rax.

We can add sub1 support similarly:

let rec compile_exp (exp: s_exp) : directive list =
  match exp with
  | Num n ->
    [Mov (Reg Rax, Imm n)]
  | Lst [Sym "add1"; arg] ->
    compile_exp arg @
    [Add (Reg Rax, Imm 1)]
  | Lst [Sym "sub1"; arg] ->
    compile_exp arg @
    [Sub (Reg Rax, Imm 1)]
  | _ -> raise (BadExpression exp)

OK–this looks reasonable! But how do we know it’s right?

Think about this for the next couple days, and we’ll talk about it next session!