Reflections On Binary Operations

A note about stack depth

Take a moment to work through this program in our language:

(+ (+ 1 (+ 2 3)) (+ 4 (+ 5 6)))

Given our approach for implementing addition in the compiler, what’s the biggest offset from the stack pointer at which we will store a value?

Once you’re ready to check your guess, go ahead and look inside program.s. What’s the largest offset in the assembly?

Think about why there’s only one value placed at this offset, how the fist and second operands are treated differently, and the difference between calling compile_exp with a given stack_index and actually placing a value at that stack_index. Why is there a difference?

A note about the stack pointer

For our purposes in compiler-land, we’re taking a simplified view of how the stack works–in particular, you may notice we’re not updating the stack pointer to prevent interrupts from clobbering our data. See lecture capture for a quick chat about how we’d change our binary operation approach to update the stack pointer and a note about why we won’t bother to do that bookkeeping.

A note about undefined behavior

What should this expression evaluate to?

(+ 1 false)

Our interpreter gives us the answer: just like a nonsense expression like (hello cs164), (+ 1 false) isn’t part of our language, so the interpreter raises an exception. What will our compiler do on this program?

$ compile_and_run "(+ 1 false)";;
"BAD 35"

Our runtime indicates that we’ve produced a bad value (specifically, 35)–it doesn’t correspond to anything in our tagging scheme. So, OK–the compiler and the interpreter both end up producing errors on this program.

How about this program?

(+ 32 false)

Our interpreter, of course, still throws an exception. But our compiler does something pretty weird:

$ compile_and_run "(+ 32 false)";;
"true"

Weird, right? It makes sense, though: since false is represented as 0b00011111 and 32 is represented as 0b10000000, false + 32 is 0b10011111, the runtime representation of true.

So: is this a bug in our compiler? Maybe not! (+ 32 false) isn’t a valid program in the language supported by our compiler. There are lots of these invalid programs! Different ones result in different things:

  • Programs like (hello cs164) result in an exception at compile time
  • Programs like (+ 1 false) result in a runtime error
  • Programs like (+ 32 false) result in a strange value

The behavior of our compiler on these programs is undefined. We can error our at compile-time, error out at runtime, produce a reasonable-looking value, or anything else. Some real-world programming languages include undefined behavior as part of the language standard; for instance, dereferencing a null pointer in C is undefined.

Many modern languages, however, eschew undefined behavior–as we have just seen, it’s quite confusing for programmers! A reasonable specification for programs like (+ 1 true) is that they should result in errors. In a couple of weeks, we will see how to add error-handling to our compiler and get rid of this strange behavior. For now, we’ll leave our compiler’s behavior specified only on valid expressions.