Lab 11: Interpreters, Scheme Data Abstraction

lab11.zip

Due by 11:59pm on Tuesday, August 1.

Starter Files

Download lab11.zip. Inside the archive, you will find starter files for the questions in this lab, along with a copy of the Ok autograder.

Topics

Consult this section if you need a refresher on the material for this lab. It's okay to skip directly to the questions and refer back here should you get stuck.

Interpreters

An interpreter is a program that allows you to interact with the computer in a certain language. It understands the expressions that you type in through that language, and performs the corresponding actions in some way, usually using an underlying language.

In Project 4, you will use Python to implement an interpreter for Scheme. The Python interpreter that you've been using all semester is written (mostly) in the C programming language. The computer itself uses hardware to interpret machine code (a series of ones and zeros that represent basic operations like adding numbers, loading information from memory, etc).

When we talk about an interpreter, there are two languages at work:

The language being interpreted/implemented. For Project 4, we are interpreting the Scheme language.
The underlying implementation language. For Project 4, we will implement an interpreter for Scheme using Python.

REPL

Many interpreters use a Read-Eval-Print Loop (REPL). This loop waits for user input, and then processes it in three steps:

Read: The interpreter takes the user input (a string) and passes it through a parser. The parser processes the input in two steps:
- The lexical analysis step turns the user input string into tokens that are like "words" of the implemented language. Tokens represent the smallest units of information.
- The syntactic analysis step takes the tokens from the previous step and organizes them into a data structure that the underlying language can understand. For our Scheme interpreter, we create a Pair object (similar to a Linked List) from the tokens to represent the original call expression.
  - The first item in the Pair represents the operator of the call expression. The subsequent items are the operands of the call expression, or the arguments that the operator will be applied to. Note that operands themselves can also be nested call expressions.

Below is a summary of the read process for a Scheme expression input:

And here is a summary of how syntactic analysis converts input tokens into Pair objects:

Eval: Mutual recursion between eval and apply evaluate the expression to obtain a value.
- eval takes an expression and evaluates it according to the rules of the language. Evaluating a call expression involves calling apply to apply an evaluated operator to its evaluated operands.
- apply takes an evaluated operator, i.e., a function, and applies it to the call expression's arguments. Apply may call eval to do more work in the body of the function, so eval and apply are mutually recursive.
Print: Display the result of evaluating the user input.

Here's how all the pieces fit together:

Required Questions

Getting Started Videos

These videos may provide some helpful direction for tackling the coding problems on this assignment.

To see these videos, you should be logged into your berkeley.edu email. Only the last 3 videos are relevant for this week's lab.

YouTube link

Calculator

An interpreter is a program that understands other programs. Today, we will explore how to build an interpreter for Calculator, a simple made-up language that uses a subset of Scheme syntax!

The Calculator language includes only the four basic arithmetic operations: +, -, *, and /. These operations can be nested and can take any numbers of arguments. A few examples of calculator expressions and their corresponding values are shown below.

Your task today is to write a Python program that can comphrehend inputs written in the Calculator language to produce the desired outputs (arithmetic operations)!

Note Hopefully, the sort of tasks you're handling today remind you those from the Scheme project. We've specifically set up this lab (handling a small subset of Scheme inputs) to introduce interpreter fundamentals that you will need in the project (handling all Scheme inputs)!

 calc> (+ 2 2)
 4

 calc> (- 5)
 -5

 calc> (* (+ 1 2) (+ 2 3))
 15

The reader component of an interpreter parses input strings from the base language and represents them as data structures in the implementing language. In this case, we need to represent Calculator expressions as Python objects.

To represent numbers, we can just use Python numbers.
To represent the names of the arithmetic procedures, we can use Python strings (e.g. '+').

Pair Class

To represent Scheme lists in Python, we will use the Pair class. A Pair instance holds exactly two elements. Accordingly, the Pair constructor takes in two arguments. To make a list we must nest calls to the constructor and pass in nil as the second element of the last pair.

Look familiar? The Pair is very similar to Link, the class we developed for representing linked lists -- they have the same attribute names first and rest and are represented very similarly!

Note In the Python code, nil is bound to a special user-defined object that represents an empty list, whereas nil in Scheme is actually an empty list.

For example, once our interpreter reads the Scheme expression (+ 2 3), we will tokenize each operator and operand to return the equivalent Pair representation, Pair('+', Pair(2, Pair(3, nil))).

Each Pair instance has two instance attributes: first and rest, which are bound to the first and second elements of the pair respectively.

>>> p = Pair('+', Pair(2, Pair(3, nil)))
>>> p.first
'+'
>>> p.rest
Pair(2, Pair(3, nil))
>>> p.rest.first
2

Here's an implementation of what we described:

class Pair:
    """Represents the built-in pair data structure in Scheme."""
    def __init__(self, first, rest):
        self.first = first
        if not scheme_valid_cdrp(rest):
            raise SchemeError("cdr can only be a pair, nil, or a promise but was {}".format(rest))
        self.rest = rest

    def map(self, fn):
        """Maps fn to every element in a list, returning a new
        Pair.

        >>> Pair(1, Pair(2, Pair(3, nil))).map(lambda x: x * x)
        Pair(1, Pair(4, Pair(9, nil)))
        """
        assert isinstance(self.rest, Pair) or self.rest is nil, \
            "rest element in pair must be another pair or nil"
        return Pair(fn(self.first), self.rest.map(fn))

    def __repr__(self):
        return 'Pair({}, {})'.format(self.first, self.rest)

class nil:
    """Represents the special empty pair nil in Scheme."""
    def map(self, fn):
        return nil
    def __getitem__(self, i):
         raise IndexError('Index out of range')
    def __repr__(self):
        return 'nil'

nil = nil() # this hides the nil class *forever*

Calculator Evaluation

For Question 2 (New Procedure) and Question 4 (Saving Values), you'll need to update the calc_eval function as listed below. This function performs the evaluations for each token with our Pairs. For instance, we've included the evaluations for numbers and booleans for you (they'll automatically be returned as is since they're atomic Scheme expressions). For Question 2, you'll be tasked to determine what the operator and operands for a funtion call in Scheme as well as actually applying the function in the calc_apply line. For Question 4, you'll be tasked to determing how to look up variables that we've previously stored.

def calc_eval(exp):
    """
    >>> calc_eval(Pair("define", Pair("a", Pair(1, nil))))
    'a'
    >>> calc_eval("a")
    1
    >>> calc_eval(Pair("+", Pair(1, Pair(2, nil))))
    3
    """
    if isinstance(exp, Pair):
        operator = ____________ # UPDATE THIS FOR Q2
        operands = ____________ # UPDATE THIS FOR Q2
        if operator == 'and': # and expressions
            return eval_and(operands)
        elif operator == 'define': # define expressions
            return eval_define(operands)
        else: # Call expressions
            return calc_apply(___________, ___________) # UPDATE THIS FOR Q2
    elif exp in OPERATORS:   # Looking up procedures
        return OPERATORS[exp]
    elif isinstance(exp, int) or isinstance(exp, bool):   # Numbers and booleans
        return exp
    elif _________________: # CHANGE THIS CONDITION FOR Q4
        return _________________ # UPDATE THIS FOR Q4

Q1: Using Pair

Answer the following questions about a Pair instance representing the Calculator expression (+ (- 2 4) 6 8).

Use Ok to test your understanding:

python3 ok -q using_pair -u

Q2: New Procedure

Suppose we want to add the // operation to our Calculator interpreter. Recall from Python that // is the floor division operation, so we are looking to add a built-in procedure // in our interpreter such that (// dividend divisor) returns dividend // divisor. Similarly we handle multiple inputs as illustrated in the following example (// dividend divisor1 divisor2 divisor3) evaluates to (((dividend // divisor1) // divisor2) // divisor3). For this problem you can assume you are always given at least 1 divisor.

Hint: You will need to modify both the calc_eval and floor_div methods for this question!

calc> (// 1 1)
1
calc> (// 5 2)
2
calc> (// 28 (+ 1 1) 1)
14

Hint: Make sure that every element in a Pair (the operator and all operands) will be calc_eval-uated once, so that we can correctly apply the relevant Python operator to operands! You may find the map method of the Pair class useful for this.

def floor_div(expr):
    """
    >>> floor_div(Pair(100, Pair(10, nil)))
    10
    >>> floor_div(Pair(5, Pair(3, nil)))
    1
    >>> floor_div(Pair(1, Pair(1, nil)))
    1
    >>> floor_div(Pair(5, Pair(2, nil)))
    2
    >>> floor_div(Pair(23, Pair(2, Pair(5, nil))))
    2
    >>> calc_eval(Pair("//", Pair(4, Pair(2, nil))))
    2
    >>> calc_eval(Pair("//", Pair(100, Pair(2, Pair(2, Pair(2, Pair(2, Pair(2, nil))))))))
    3
    >>> calc_eval(Pair("//", Pair(100, Pair(Pair("+", Pair(2, Pair(3, nil))), nil))))
    20
    """ 
    # BEGIN SOLUTION Q2

Use Ok to test your code:

python3 ok -q floor_div

Q3: New Form

Suppose we want to add handling for and expressions to our Calculator interpreter as well as introduce the Scheme boolean expressions #t and #f. These should work the same way they do in Scheme. (The examples below assumes conditional operators (e.g. <, >, =, etc) have already been implemented, but you do not have to worry about them for this question.)

calc> (and (= 1 1) 3)
3
calc> (and (+ 1 0) (< 1 0) (/ 1 0))
#f
calc> (and #f (+ 1 0))
#f
calc> (and 0 1 (+ 5 1)) ; 0 is truthy in Scheme!
6

In a typical call expression, we first evaluate the operator, then evaluate the operands, and finally apply the operator to the evaluated operands (just like you did for floor_div in the previous question). However, since and is a special form that short circuits on the first false-y operand, we cannot handle these expressions the same way we handle regular call expressions. We need to add special handling for combinations that don't evaluate all the operands. Complete the implementation below to handle and expressions.

Hint: Even though our calculator is interpreting Scheme expressions (with Pairs), we're performing evaluations in Python, which means we'll want to compare our calc_evaled expressions to the Python False, not the Scheme #f!

def eval_and(operands):
    """
    >>> calc_eval(Pair("and", Pair(1, nil)))
    1
    >>> calc_eval(Pair("and", Pair(False, Pair("1", nil))))
    False
    >>> calc_eval(Pair("and", Pair(1, Pair(Pair("//", Pair(5, Pair(2, nil))), nil))))
    2
    >>> calc_eval(Pair("and", Pair(Pair('+', Pair(1, Pair(1, nil))), Pair(3, nil))))
    3
    >>> calc_eval(Pair("and", Pair(Pair('-', Pair(1, Pair(0, nil))), Pair(Pair('/', Pair(5, Pair(2, nil))), nil))))
    2.5
    >>> calc_eval(Pair("and", Pair(0, Pair(1, nil))))
    1
    >>> calc_eval(Pair("and", nil))
    True
    """
    # BEGIN SOLUTION Q3

Use Ok to test your code:

python3 ok -q eval_and

Q4: Saving Values

In the last few questions we went through a lot of effort to add operations so we can do most arithmetic operations easily. However it's a real shame we can't store these values. So for this question let's implement a define special form that saves values to variable names. This should work like variable assignment in Scheme; this means that you should expect inputs of the form(define <variable_name> <value>) and these inputs should return the symbol corresponding to the variable name.

calc> (define a 1)
a
calc> a
1

This is a more involved change. Here are the 4 steps involved:

Add a bindings dictionary that will store the names and correspondings values of variables as key-value pairs of the dictionary.
Identify when the define form is given to calc_eval.
Allow variables to be looked up in calc_eval.
Write the function eval_define which should actually handle adding names and values to the bindings dictionary.

We've done step 1 for you. Now you'll do the remaining steps.

bindings = {}

def eval_define(expr):
    """
    >>> eval_define(Pair("a", Pair(1, nil)))
    'a'
    >>> eval_define(Pair("b", Pair(3, nil)))
    'b'
    >>> eval_define(Pair("c", Pair("a", nil)))
    'c'
    >>> calc_eval("c")
    1
    >>> calc_eval(Pair("define", Pair("d", Pair("//", nil))))
    'd'
    >>> calc_eval(Pair("d", Pair(4, Pair(2, nil))))
    2
    """
    # BEGIN SOLUTION Q4

Use Ok to test your code:

python3 ok -q eval_define

Check Your Score Locally

You can locally check your score on each question of this assignment by running

python3 ok --score

This does NOT submit the assignment! When you are satisfied with your score, submit the assignment to Gradescope to receive credit for it.

Submit

Make sure to submit this assignment by uploading any files you've edited to the appropriate Gradescope assignment. For a refresher on how to do this, refer to Lab 00.

Optional Questions

These questions are optional, but you must complete them in order to be checked off before the end of the lab period. They are also useful practice!

Scheme Data Abstraction

Q5: Make Even

We've solved this problem with the Tree class. Now let's try it in Scheme!

Define a procedure make_even which takes in a tree t whose values are integers, and returns a new tree such that all the odd integers are increased by 1 and all the even integers remain the same.

(define (make-even t)
    'YOUR-CODE-HERE
)

Use Ok to test your code:

python3 ok -q make-even

Q6: Find

Implement find which takes in a tree t whose values are integers and returns whether there exists any instances of integer x within the tree.

Hint: It may feel natural to use Python's all or any operator in this solution, but those operators unfortunately do not exist in scheme. For 7 and 8, is there some clever way we can emulate all andany using Scheme's filter and length operators?

(define (find t x)
    'YOUR-CODE-HERE
)

Use Ok to test your code:

python3 ok -q find

Q7: N-ary Tree

We will define an N-ary tree to be a tree where each non-leaf subtree or node has exactly n branches. Implement a procedure n-ary which returns whether a tree t is an N-ary tree.

Note: This N-ary definition is contrived. In computer science, the actual definition of an N-ary tree is a tree where a node has any arbitrary number of branches - aka the type of trees we usually work with!

(define (n-ary t n)
    'YOUR-CODE-HERE
)

Use Ok to test your code:

python3 ok -q n-ary