Discussion 10: Interpreters, Macros

disc10.pdf

This is an online worksheet that you can work on during discussions. Your work is not graded and you do not need to submit anything. The last section of most worksheets is Exam Prep, which will typically only be taught by your TA if you are in an Exam Prep section. You are of course more than welcome to work on Exam Prep problems on your own.

Calculator

An interpreter is a program that understands other programs. Today, we will explore how to build an interpreter for Calculator, a simple language that uses a subset of Scheme syntax.

The Calculator language includes only the four basic arithmetic operations: +, -, *, and /. These operations can be nested and can take any numbers of arguments. A few examples of calculator expressions and their corresponding values are shown below.

calc> (+ 2 2)
4

calc> (- 5)
-5

calc> (* (+ 1 2) (+ 2 3))
15

The reader component of an interpreter parses input strings and represents them as data structures in the implementing language. In this case, we need to represent Calculator expressions as Python objects. To represent numbers, we can just use Python numbers. To represent the names of the arithmetic procedures, we can use Python strings (e.g. '+').

Call expressions are a bit more complicated. Like Scheme call expressions, Calculator call expressions look just like Scheme lists. For example, to construct the expression (+ 2 3) in Scheme, we would do the following:

scm> (cons '+ (cons 2 (cons 3 nil)))
(+ 2 3)

To represent Scheme lists in Python, we will use the Pair class. A Pair instance holds exactly two elements. Accordingly, the Pair constructor takes in two arguments, and to make a list we must nest calls to the constructor and pass in nil as the second element of the last pair. Note that in our implementation, nil is bound to a special user-defined object that represents an empty list, whereas nil in Scheme is actually an empty list.

>>> Pair('+', Pair(2, Pair(3, nil)))
Pair('+', Pair(2, Pair(3, nil)))

Each Pair instance has two instance attributes: first and rest, which are bound to the first and second elements of the pair respectively.

>>> p = Pair('+', Pair(2, Pair(3, nil)))
>>> p.first
'+'
>>> p.rest
Pair(2, Pair(3, nil))
>>> p.rest.first
2

Pair is very similar to Link, the class we developed for representing linked lists -- they have the same attribute names first and rest and are represented very similarly. Here's an implementation of what we described:

class Pair:
    """Represents the built-in pair data structure in Scheme."""
    def __init__(self, first, rest):
        self.first = first
        if not scheme_valid_cdrp(rest):
            raise SchemeError("cdr can only be a pair, nil, or a promise but was {}".format(rest))
        self.rest = rest

    def map(self, fn):
        """Maps fn to every element in a list, returning a new
        Pair.

        >>> Pair(1, Pair(2, Pair(3, nil))).map(lambda x: x * x)
        Pair(1, Pair(4, Pair(9, nil)))
        """
        assert isinstance(self.rest, Pair) or self.rest is nil, \
            "rest element in pair must be another pair or nil"
        return Pair(fn(self.first), self.rest.map(fn))

    def __repr__(self):
        return 'Pair({}, {})'.format(self.first, self.rest)

class nil:
    """Represents the special empty pair nil in Scheme."""
    def map(self, fn):
        return nil
    def __getitem__(self, i):
        raise IndexError('Index out of range')
    def __repr__(self):
        return 'nil'

nil = nil() # this hides the nil class *forever*

Questions

Q1: From Pair to Calculator

Write out the Calculator expression with proper syntax that corresponds to the following Pair constructor calls. Also, draw out a box and pointer diagram corresponding to each input.

>>> Pair('+', Pair(1, Pair(2, Pair(3, Pair(4, nil)))))

>>> Pair('+', Pair(1, Pair(Pair('*', Pair(2, Pair(3, nil))), nil)))

Q2: Using Pair

Answer the following questions about a Pair instance representing the Calculator expression (+ (- 2 4) 6 8).

Write out the Python expression that returns a Pair representing the given expression:

Draw a box and pointer diagram corresponding to the Pair:

.
.
.
.

(Unfortunately, our website does not yet support drawing box-and-pointer diagrams, but you or your TA may wish to draw in the above space with Zoom or other annotation software)

What is the operator of the call expression?

If the Pair you constructed in the previous part was bound to the name p, how would you retrieve the operator?

What are the operands of the call expression?

If the Pair you constructed was bound to the name p, how would you retrieve a list containing all of the operands?

How would you retrieve only the first operand?

Evaluation

The evaluation component of an interpreter determines the type of an expression and executes corresponding evaluation rules.

Here are the evaluation rules for the three types of Calculator expressions:

Numbers are self-evaluating, and can be thought of as primitives. For example, the numbers 3.14 and 165 just evaluate to themselves.
Names are looked up in the OPERATORS dictionary. In this dictionary, each name (e.g. '+') is bound to a corresponding function in Python that does the appropriate operation on a list of numbers (e.g. sum).
Call expressions are evaluated the same way you've been doing them all semester:
1. Evaluate the operator, which evaluates to a function.
2. Evaluate the operands from left to right.
3. Apply the function to the value of the operands.

The function calc_eval takes in a Calculator expression represented in Python and implements each of these rules:

def calc_eval(exp):
    """Evaluates a Calculator expression represented as a Pair."""
    if isinstance(exp, Pair): # Call expressions
        fn = calc_eval(exp.first)
        args = list(exp.rest.map(calc_eval))
        return calc_apply(fn, args)
    elif exp in OPERATORS:    # Names
        return OPERATORS[exp]
    else:                     # Numbers
        return exp

calc_eval is recursive! In order to evaluate call expressions, we must call calc_eval on the operator and each of the operands.

The apply step in the Calculator language is straight-forward since we only have primitive procedures. This step will be more complex in the Scheme project since the procedures may include user-defined procedures.

Given the Python function that implements the appropriate Calculator operation and a Python list of numbers, the calc_apply function simply calls the function on the arguments, and regular Python evaluation rules take place.

def calc_apply(fn, args):
    """Applies a Calculator operation to a list of numbers."""
    return fn(args)

Questions

Q3: Counting Eval and Apply

How many calls to calc_eval and calc_apply would it take to evaluate each of the following Calculator expressions?

> (+ 2 4 6 8)

> (+ 2 (* 4 (- 6 8)))

Q4: New Operators

Suppose we want to add handling for comparison operators >, <, and = as well as and expressions to our Calculator interpreter. These should work the same way they do in Scheme.

calc> (and (= 1 1) 3)
3
calc> (and (+ 1 0) (< 1 0) (/ 1 0))
#f

i. Are we able to handle expressions containing the comparison operators (such as <, >, or =) with the existing implementation of calc_eval? Why or why not?

ii. Are we able to handle and expressions with the existing implementation of calc_eval? Why or why not?

Hint: Think about the rules of evaluation we've implemented in calc_eval. Is anything different about and?

iii. Now, complete the implementation below to handle and expressions. You may assume the conditional operators (e.g. <, >, =, etc) have already been implemented for you.

Run in 61A Code

Python Programs as Data

Imagine if we wanted to write a function in Python that would create and evaluate the following line of code:

[<expr> for i in range(5)]

Where we could pass in any arbitrary expression <expr>, that would then be evaluated 5 times and have the results listed. This is the kind of problem that macros were made for!

A macro is a procedure that operates on unevaluated code. In the example above, <expr> is not necessarily an object, but could be any piece of code, such as print("Hello"), [j ** 2 for j in range(i)], or i >= 5. For these expressions it is important to our problem statement that they not be evaluated until after they have been inserted into our list comprehension, either because they have side effects that will be apparent in the global frame, or because they depend on i in some way (there can be other reasons too!). So, we need a procedure that, instead of manipulating objects, manipulates code itself. This is where macros come in.

Unfortunately for us, Python does not have the ability to make true macros. Let's see what happens if we try to solve this problem with a traditional function.

def list_5(expr):
	return [expr for i  in range(5)]

>>> lst = list_5(print(10))
10
>>> lst
[None, None, None, None, None]

This isn't quite what we want. Because of Python evaluation rules, instead of evaluating a list of 5 print(10) statements, our expr was evaluated before the function was ever called, meaning 10 was only printed once.

The issue here is order of evaluation---we don't want our expression parameter to be evaluated until after it is inserted into our list comprehension. Even though we don't have the ability to make real macros when writing Python, we can write a function in the spirit of macros by returning a string representing the return expression, and evaluating this string after it has been returned.

def list_5(expr):
	return f"[{expr} for i  in range(5)]"

>>> lst = eval(list_5("print(10)"))
10
10
10
10
10
>>> lst
[None, None, None, None, None]

Now we see the number 10 printed 5 times as a side effect of our function, just like we want! We circumvented Python's evaluate-operands-before-evaluating-function body rule by passing in our desired expression as a string. Then, after we constructed our list comprehension in string form and we've returned it, we used the eval function to force evaluation of the output after the last step in the execution of this function. We were able to write code that took code as a parameter and restructured it in the form of new code!

Note: Although this is a cool hack we can do with Python, its usage is unfortunately rather limited. The eval function will throw a SyntaxError if it is passed any "compound" blocks such as if: ..., while: ..., for: ..., etc. (this does not include list comprehensions, or one-line if-statements such as 1 if a > b else 0.)

Q5: Make Lambda Python

Write a "macro" in Python called make_lambda that takes in two parameters: params, a string containing a comma-separated list of variable names; and body, a Python expression in string form, and returns a string that, when evaluated, will return a lambda function that takes params as its parameters and body as its body.

A string with an f (which is called an f-String) is like a quasiquoted expression in Scheme, and anything in that string in braces will be treated as though it has been unquoted. Try to write make_lambda using this syntax!

For an example of f-Strings, the following two pieces of code are equivalent:

>>> x = 2
>>> "The value of x is: " + x
'The value of x is 2'
>>> f"The value of x is {x}"
'The value of x is 2'

Hint: Try constructing the string that would define the lambda, and return that.

Run in 61A Code

Q6: If "Macro" Python

In this problem, we will write a "macro" in Python called if_macro that takes in three arguments:

condition: a string that will evaluate to a truth-y or false-y value
true_result: a string that will be evaluated and returned if condition is truth-y
false_result: a string that will be evaluated and returned if condition is false-y.

and returns a string that, when evaluated, will return the result of this if function.

if_macro should only evaluate true_result if condition is truth-y, and will only evaluate false_result if condition is false-y.

Hint: You can write a one-line if statement with the following syntax: <value_when_true> if <condition> else <value_when_false>

Hint: Using f-Strings might make this problem easier too!

Run in 61A Code

Quasiquoting

scm> (define a 1)
a
scm> '(cons a nil)
(cons a nil)

Recall that the quote special form prevents the Scheme interpreter from executing a following expression. We saw that this helps us create complex lists more easily than repeatedly calling cons or trying to get the structure right with list. It seems like this form would come in handy if we are trying to construct complex Scheme expressions with many nested lists.

Consider that we rewrite the twice macro as follows:

(define-macro (twice f)
  '(begin f f))

This seems like it would have the same effect, but since the quote form prevents any evaluation, the resulting expression we create would actually be (begin f f), which is not what we want.

The quasiquote allows us to construct literal lists in a similar way as quote, but also lets us specify if any sub-expression within the list should be evaluated.

At first glance, the quasiquote (which can be invoked with the backtick ` or the quasiquote special form) behaves exactly the same as ' or quote. However, using quasiquotes gives you the ability to unquote (which can be invoked with the comma , or the unquote special form). This removes an expression from the quoted context, evaluates it, and places it back in.

scm> `(cons a nil)
(cons a nil)
scm> `(cons ,a nil)
(cons 1 nil)

By combining quasiquotes and unquoting, we can often save ourselves a lot of trouble when building macro expressions.

Questions

Q7: Writing Quasiquote Expressions

Given that the following code has been executed in your Scheme environment:

scm> (define a +)
a
scm> (define b 1)
b
scm> (define c 2)
c

Write out the expressions using quasiquote, unquote, a, b, c, and parentheses to fill in the blanks that would lead to each of the following being displayed:

For example,

scm> ___________
(a b c)

Answer:

`(a b c)

a) Fill in the following blank:

scm> __________________
(a 1 c)

b) Fill in the following blank:

scm> __________________
3

c) Fill in the following blank:

scm> __________________
(a (b 1) c)

d) Fill in the following blank:

scm> __________________
(a 3 c)

Additional Practice Now, let the following be defined:

scm> (define condition '(= 1 1))
condition
scm> (define if-true '(print 3))
if-true
scm> (define if-false '(print 5))
if-false

Write the Scheme expression using the variables condition, if-true, and if-false along with quasiquoting, that would evaluate to the following:

scm> ________________________
(if (= 1 1) (print 3) (print 5))

Macros

Previously, we've seen how we can create macro-like functions in Python using f-strings. Now let's talk about real macros, in Scheme. So far we've been able to define our own procedures in Scheme using the define special form. When we call these procedures, we have to follow the rules for evaluating call expressions, which involve evaluating all the operands.

Scheme has special form expressions (if, lambda, let, etc.) that do not follow the evaluation rules of call expressions. Instead, each special form has its own rules of evaluation, which may include not evaluating all the operands. Wouldn't it be cool if we could define our own special forms where we decide which operands are evaluated? Consider the following example where we attempt to write a function that evaluates a given expression twice:

scm> (define (twice f) (begin f f))
twice
scm> (twice (print 'woof))
woof

Since twice is a regular procedure, a call to twice will follow the same rules of evaluation as regular call expressions; first we evaluate the operator and then we evaluate the operands. That means that woof was printed when we evaluated the operand (print 'woof). Inside the body of twice, the name f is bound to the value undefined, so the expression (begin f f) does nothing at all!

We have a problem here: we need to prevent the given expression from evaluating until we're inside the body of the procedure. This is where the define-macro special form, which has identical syntax to the regular define form, comes in:

scm> (define-macro (twice f) (list 'begin f f))
twice

define-macro allows us to define what's known as a macro, which is simply a way for us to combine unevaluated input expressions together into another expression. When we call macros, the operands are not evaluated, but rather are treated as Scheme data. This means that any operands that are call expressions or special form expression are treated like lists.

If we call (twice (print 'woof)), f will actually be bound to the list (print 'woof) instead of the value undefined. Inside the body of define-macro, we can insert these expressions into a larger Scheme expression. In our case, we would want a begin expression that looks like the following:

(begin (print 'woof) (print 'woof))

As Scheme data, this expression is really just a list containing three elements: begin and (print 'woof) twice, which is exactly what (list 'begin f f) returns. Now, when we call twice, this list is evaluated as an expression and (print 'woof) is evaluated twice.

scm> (twice (print 'woof))
woof
woof

To recap, macros are called similarly to regular procedures, but the rules for evaluating them are different. We evaluated lambda procedures in the following way:

Evaluate operator
Evaluate operands
Apply operator to operands, evaluating the body of the procedure

However, the rules for evaluating calls to macro procedures are:

Evaluate operator
Apply operator to unevaluated operands
Evaluate the expression returned by the macro in the frame it was called in.

Questions

Q8: If Macro Scheme

Now let's try to write out the if macro in Scheme! There are quite a few similarities between Python and Scheme, but we do have to make a few adjustments when converting our code over to Scheme. We'll start out by writing a Scheme function using the define form we use for normal functions.

First, let's consider the following questions:

In the Python If macro, we returned a string that, when evaluated, would execute an if statement with the correct parameters. In Scheme, we won't return a string, but rather we'll return something else that represents an unevaluated expression. What type will we return for Scheme? Here, what "type" refers to what data type -- i.e. function, list, integer, string, etc..

In the Python If macro, our parameters were all strings representing the condition, return value if true, and return value if false. What type will each of our parameters be in Scheme? Give an example of an acceptable parameter for the condition.

Let's start writing this out!

Write a function if-function using the define form (not the define-macro form), which will take in the following parameters:

condition : a quoted expression which will evaluate to the condition in our if expression
if-true : a quoted expression which will evaluate to the value we want to return if true
if-false : a quoted expression which will evaluate to the value we want to return if false

and returns a Scheme list representing the expression that, when evaluated, will evaluate to the result of our if expression.

Here are some doctests to show this:

scm> (if-function '(= 0 0) '2 '3)
(if (= 0 0) 2 3)
scm> (eval (if-function '(= 0 0) '2 '3))
2
scm> (if-function '(= 1 0) '(print 3) '(print 5))
(if (= 1 0) (print 3) (print 5))
scm> (eval (if-function '(= 1 0) '(print 3) '(print 5)))
5

Run in 61A Code

That felt a bit overly complicated just to create a function that emulates the if special form. We had to quote parameters, and we had to do an extra call to eval at the end to actually get our answer. To make things easier, we can use the define-macro special form to simplify this process.

As a reminder, the define-macro form changes the order of evaluation to be:

Evaluate operator
Apply operator to unevaluated operands
Evaluate the expression returned by the macro in the frame it was called in.

As a comparison, here are differences between define-macro and define:

The operands are not evaluated immediately. Instead, we can think of the operands as being quoted, similar to what we did in the previous part.
The return value gets evaled at the very end, after the entire return expression is constructed. This means that we no longer have to call eval on the return value of our if-function.

Now, use the define-macro special form to define a macro, if-macro, that will have functionality shown by the following doctests (notice how the operands are no longer quoted, and that the final eval is gone):

scm> (if-macro (= 0 0) 2 3)
2
scm> (if-macro (= 1 0) (print 3) (print 5))
5

Run in 61A Code

Q9: Or Macro

Implement or-macro, which takes in two expressions and or's them together (applying short-circuiting rules). However, do this without using the or special form. You may also assume the name v1 doesn't appear anywhere outside this macro. For a quick reminder on the short-circuiting rules for or take a look at slide 18 of Lecture 3 on Control.

The behavior of the or macro procedure is specified by the following doctests:

scm> (or-macro (print 'bork) (/ 1 0))
bork
scm> (or-macro (= 1 0) (+ 1 2))
3

Run in 61A Code

Q10: Replicate

Write a macro that takes an expression expr and a number n and repeats the expression n times. For example, (repeat-n expr 2) should behave the same as (twice expr) from the introduction section of this worksheet. It's possible to pass in a combination as the second argument (e.g. (+ 1 2)) as long as it evaluates to a number. Be sure that you evaluate this expression in your macro so that you don't treat it as a list.

Complete the implementation for repeat-n so that its behavior matches the doctests below.

scm> (repeat-n (print '(resistance is futile)) 3)
(resistance is futile)
(resistance is futile)
(resistance is futile)

scm> (repeat-n (print (+ 3 3)) (+ 1 1))  ; Pass a call expression in as n
6
6

You may find it useful to implement the replicate procedure, which takes in a value x and a number n and returns a list with x repeated n times. We've provided some examples for how replicate should function here:

scm> (replicate '(+ 1 2) 3)
((+ 1 2) (+ 1 2) (+ 1 2))

scm> (replicate (+ 1 2) 1)
(3)

scm> (replicate 'hi 0)
()

Hint: Feel free to check out Discussion 11 and copy over your implementation of replicate!

Hint 2: Consider which method to build your list (list, cons, or quasiquote) might help make your implementation simpler.

Run in 61A Code

Q11: When Macro

Using macros, let's make a new special form, when, that has the following structure:

(when <condition>
      (<expr1> <expr2> <expr3> ...))

If the condition is not false (a truthy expression), all the subsequent operands are evaluated in order and the value of the last expression is returned. Otherwise, the entire when expression evaluates to okay.

scm> (when (= 1 0) ((/ 1 0) 'error))
okay

scm> (when (= 1 1) ((print 6) (print 1) 'a))
6
1
a

Run in 61A Code

Exam Prep

Q12: Fixing Scheme with Infix Notation

Adapted From Summer 2018 Final Q8 and Fall 2020 Examprep 8 Q3

Important: Scroll down for the function signatures, skeletons, and doctests for all parts.

Difficulty: ⭐

Part A: First, write the helper function skip, which skips the first n items in a list, returning the rest. For full credit, your solution must be tail recursive. You may assume that n is non-negative.

scm> (skip 2 '(1 2 3 4))
(3 4)
scm> (skip 10 '(1 2 3 4))
()

Difficulty: ⭐⭐⭐

Part B: One annoying thing about Scheme is that it can only understand arithmetic operations that are written in prefix notation. That is, if I want to evaluate an expression, the arithmetic operator must come first, which is different than in math class.

Let’s leverage our skills to define a Scheme procedure infix that accepts arithmetic operations with infix notation, which places operators between operands as you are used to. You only need to support the addition and multiplication operators * and +, but you need to support order of operations.

HINT: You may find it useful to use skip, but it is not required!

scm> (infix '(1 + 2))
3
scm> (infix '(1 * 2))
2
scm> (infix '(3 + 2 * 5 + 4))
17
scm> (infix '(1 + 2 * (3 + 4)))
15

Difficulty: ⭐⭐

Part C: Update your infix scheme interpreter such that it works with names as well as numeric literals.

HINT: You will need to modify the given code!

scm> (define x 3)
scm> (infix '(x + 2))
5
scm> (infix '(1 * x))
3
scm> (infix '((x + x) * (x + x)))
36

Note: The skeleton code is just a suggestion; feel free to use your own structure if you prefer.

Run in 61A Code

Q13: Slice It!

Difficulty: ⭐⭐

Implement the get-slicer procedure, which takes integers a and b and returns an a-b slicing function. An a-b slicing function takes in a list as input and outputs a new list with the values of the original list from index a (inclusive) to index b (exclusive).

Your implementation should behave like Python slicing, but should assume a step size of one with no negative slicing indices. Indices start at zero.

Note: the skeleton code is just a suggestion. Feel free to use your own structure if you prefer.

Run in 61A Code

Q14: Find It!

Difficulty: ⭐⭐

Implement the find-in-tree procedure, which takes a Scheme tree t, a target value goal and returns a list containing the node values on a path from the root of t to a node containing goal. If no such path exists, find-in-tree returns an empty list.

Complete find-in-branches for ease of implementation. find-in-branches takes a list of branches bs and a value goal and returns a list containing the node values on a path from any branch to a node containing goal. If no such path exists, find-in-branches returns an empty list.

The Scheme tree ADT is specified at the top of the skeleton code.

Note: The skeleton code is just a suggestion; feel free to use your own structure if you prefer.

Run in 61A Code

Q15: Partition Options

Difficulty: ⭐⭐

First, write the procedure combine, which takes in lists lst1 and lst2, as well as a value x, and outputs a new list with x before each item of lst1, and then all elements of lst2 unmodified. For examples, see the doctests.

Then write the procedure partition-options, which takes in positive integers total and biggest and outputs a list of lists, in which each inner list contains numbers no larger than biggest that sum to total.

Note: The skeleton code is just a suggestion; feel free to use your own structure if you prefer.

Run in 61A Code