*Due by 11:59pm on Thursday, 7/17*

**Readings:** You might find the following references
useful:

This is a review of all of the material you have seen so far, so there are no new readings. If you haven't already, you should do the readings for hierarchical data structures:

**Submission:** See the online submission
instructions. We have provided a hw6.py starter file for the questions
below.

In addition to the hw6.py starter file, you will also need the ucb.py file, although you will only need to edit and submit hw6.py.

We're going to build a calculator! It will use the same syntax as Scheme, the language that we will learn in a few weeks, but for now it will just be able to do basic arithmetic.

Once you are done, you can interact with the calculator language:

```
python3 hw6.py
minicalc> 1
1
minicalc> (+ 1 3)
4
```

To exit out of the Calculator language, type Ctrl-C or Ctrl-D.

*The goal of this homework is to review the material so far and
introduce you to the idea of interpretation, that is, a program which
can understand other programs.* This is of course not a very useful
program, because we could just use Python itself as a calculator.
However, it serves as a nice toy example of an interpreter and
introduces you to the ideas you will need in order to do project 4 (in
which we will build a interpreter for Scheme).

The input to Calculator is a program, in the form of a string. The
first step is to take this string and break it up into **tokens** -
this is done by `tokenize`

, which is written for you.

Then we **parse** those tokens to create an **AST** (abstract syntax
tree), which we represent using a deep linked list. This is the job of
`read_exp`

and `read_until_close`

. The purpose of this stage is to
expose the structure of the parentheses. For example, given the
tokens `['(', '+', '(', '-', '2', ')', '3', ')']`

, `read_exp`

will
make sure to put the `( - 2 )`

as a smaller linked list - this is how
the structure of the parentheses is shown in the AST.

Then, we **evaluate** the AST in order to get the answer, which is
then printed. This is done by `calc_eval`

, which uses `calc_apply`

.

The syntax of the Calculator language is the same as for the Scheme programming language. The key design principle for Scheme is simplicity and elegance. While Python has lots of syntax for many different cases, Scheme has just one form of syntax - the function call notation.

In Python, the function call notation is:

`function(arg1, arg2, ...)`

The only difference in Scheme is that the parenthesis comes earlier and there are no commas:

`(function arg1 arg2 ...)`

Most math expressions you've seen are in what's called *infix notation*,
where the operators appear between the numbers they act on. These
expressions look like `3 + 4`

or `27 / 3`

.

However, Scheme and Calculator both use *prefix notation*, where the
operators come before the expressions they act upon. Examples are ```
(+
3 4)
```

and `(/ 27 3)`

.

Of course, these expressions can be nested inside of each other. The following expression uses infix notation and has a nested expression:

`(12 / 3) * 5`

In the programs you'll be working with in this homework, the same expression would look like:

`(* (/ 12 3) 5)`

`scheme_exp`

which returns a string containing the
prefix notation for the expression `48/(2*(9+3))`

if n is 1, and the
prefix notation for `(48/2)*(9+3)`

otherwise.
```
def scheme_exp(n):
"""If n == 1, returns a Scheme program that computes 48/(2*(9+3))
Otherwise, returns a Scheme program that computes (48/2)*(9+3)
NOTE: These doctests use eval_string, which will not work until
you have completed the entire homework.
>>> first = scheme_exp(1)
>>> assert '/' in first and '*' in first and '+' in first
>>> assert '48' in first and '2' in first
>>> assert '9' in first and '3' in first
>>> eval_string(first)
2.0
>>> second = scheme_exp(2)
>>> assert '/' in second and '*' in second and '+' in second
>>> assert '48' in second and '2' in second
>>> assert '9' in second and '3' in second
>>> eval_string(second)
288.0
"""
if n == 1:
"*** YOUR CODE HERE ***"
else:
"*** YOUR CODE HERE ***"
```

The first step is to take the program (which will be a simple string),
and to **tokenize** it. This just means to identify the various
"things" in the program. In Calculator, the "things" are parentheses
(such as `'('`

and `')'`

), numbers (such as `'123'`

and `'74.2'`

), and
operators (such as `'+'`

and `'*'`

).

We will use a `tokenize`

function, which will take a program
(represented as a string) and return a list of tokens. Since this is
very similar to `extract_words`

from the Trends project, we have
filled it out for you:

```
def tokenize(s):
"""Splits the provided string into a list of tokens.
>>> tokenize('(* (+ 12 3) 5)')
['(', '*', '(', '+', '12', '3', ')', '5', ')']
"""
s = s.replace('(', ' ( ')
s = s.replace(')', ' ) ')
return s.split()
```

The next step is to take this list of tokens and turn it into a
structure that is easier to understand. This structure is called an
Abstract Syntax Tree (AST). In particular, in Calculator, the
parentheses show us the structure of the program. We want to take in
the list of tokens, and produce a **deep** linked list that shows the
structure of the program. Each subexpression in the Calculator
expression should correspond to a smaller linked list in the deep
linked list.

```
def numberize(atomic_exp):
"""Converts atomic_exp to a number if possible, otherwise leaves
it alone. atomic_exp is guaranteed to be a single token, and will
not be a parenthesis.
>>> numberize('123')
123
>>> numberize('3.14159')
3.14159
>>> numberize('+')
'+'
"""
# Consider doing int(atomic_exp) and float(atomic_exp).
# What does int('123') do? What does int('+') do?
"*** YOUR CODE HERE ***"
```

Write `read_exp`

, a function which parses the program for our
calculator as described above.

`read_exp`

takes in a list of tokens, and returns the first calculator
expression encountered along with the unevaluated tokens.

The first calculator expression could be a combination expression, an
operator, or simply a number. If the first expression is a
combination expression (that is, the first token is `'('`

), then it
returns a (possibly deep) linked list representing that expression.
If the first token represents a number, that is a single Calculator
expression and so it just returns that number. Note that tokens such
as `'2'`

are converted into numbers in the returned AST. If the first
token is an operator, it just returns that operator.

`read_exp`

also returns the rest of the tokens not included in the
first calculator expression (see doctests for examples). This will be
helpful for the recursion.

**To accomplish this task, read_exp mutually recurses with the
helper function read_until_close.** This means that when
implementing

`read_exp`

, you may want to use `read_until_close`

, and
when you implement `read_until_close`

, you may want to use `read_exp`

,
and so you should not start writing code until you understand what
`read_until_close`

also takes
in a list of tokens, but only reads up to and includes the first
mismatched closed parenthesis. `read_until_close`

returns a linked
list of all the expressions read up to that point, as well as the rest
of the tokens not read (the tokens following the first unmatched
parentheses).You've already been given some code in the beginning of
`read_exp`

. Read this code carefully to understand what constitutes an
illegal input to `read_exp`

. Notice the keyword `raise`

- this is how
we create our own errors. See the
documentation for
more details.

**Make sure you read and understand all the doctests for both
functions before you start writing code.**

```
def read_exp(tokens):
"""Given a list of tokens, returns the first calculator expression
(either a number, operator, or combination), and the rest of the
expression that has not been turned into an expression.
>>> exp, unevaled = read_exp(['(', '+', '2', '3', '6', ')'])
>>> print_linked_list(exp)
< '+' 2 3 6 >
>>> unevaled
[]
>>> exp, unevaled = read_exp(['2', '3'])
>>> exp
2
>>> unevaled
['3']
>>> exp, unevaled = read_exp(['(', '/', '6', '2', ')', '(', '-', '2', ')'])
>>> print_linked_list(exp)
< '/' 6 2 >
>>> unevaled
['(', '-', '2', ')']
>>> exp, unevaled = read_exp(['(','*','4','(','-','12','8',')',')','3','2'])
>>> print_linked_list(exp)
< '*' 4 < '-' 12 8 > >
>>> unevaled
['3', '2']
"""
if tokens == []:
raise SyntaxError('unexpected end of input')
token, rest = tokens[0], tokens[1:]
if token == ')':
raise SyntaxError('unexpected )')
elif token == '(':
if rest == []:
raise SyntaxError('mismatched parentheses')
elif rest[0] == ')':
raise SyntaxError('empty combination')
"*** YOUR CODE HERE ***"
else:
"*** YOUR CODE HERE ***"
def read_until_close(tokens):
"""Reads up to and including the first mismatched close
parenthesis, then forms a combination out all of the values read
up to that point.
>>> exp, unevaled = read_until_close(['+', '2', '3', ')', '4', ')'])
>>> print_linked_list(exp)
< '+' 2 3 >
>>> unevaled
['4', ')']
>>> exp, unevaled = read_until_close(['+', '(', '-', '2', ')', '3', ')', ')'])
>>> print_linked_list(exp)
< '+' < '-' 2 > 3 >
>>> unevaled
[')']
"""
"*** YOUR CODE HERE ***"
```

Now that we have our AST, we can turn to the problem of *evaluating*
the AST, that is, we figure out what the answer to the program is.

First, we need to understand `calc_apply`

. `calc_apply`

takes an
operator and a linked list of numbers. Note that the linked list is
**not** a deep linked list. `calc_apply`

applies the operator to
those numbers. It checks what the operator is, and based on that, it
calls a function that can do that specific operation.

```
def calc_apply(op, args):
"""Applies an operator to a linked list of arguments.
>>> calc_apply('+', link(12, link(34, empty)))
46
>>> calc_apply('-', link(10, link(2, link(3, link(4, empty)))))
1
>>> calc_apply('-', link(3, empty))
-3
>>> calc_apply('*', empty)
1
"""
if op == '+':
return do_addition(args)
elif op == '*':
return do_multiplication(args)
elif op == '-':
return do_subtraction(args)
elif op == '/':
return do_division(args)
else:
raise NameError('unknown operator {}'.format(op))
```

Now, we need to implement the functions that do specific
operations. Addition, multiplication, and subtraction have been
implemented for you. They use the `reduce_linked_list`

function which
is defined near the end of the starter file. Your job is to define
`do_division`

, which does the division operation.

When an operation `op`

is given more than two arguments, it is *left
associative*. For example,

`(op a b c d)`

would do that same thing as

`(op (op (op a b) c) d)`

So, `(/ 60 2 3 5)`

is equivalent to `(/ (/ (/ 60 2) 3) 5)`

which is
equivalent to `(/ 60 (* 2 3 5))`

```
def do_division(args):
"""Applies the division operation to args.
args must have a length of at least 1, as in do_subtraction.
>>> do_division(link(4, empty))
0.25
>>> do_division(link(7, link(2, empty)))
3.5
>>> do_division(link(60, link(2, link(3, link(5, empty)))))
2.0
"""
"*** YOUR CODE HERE ***"
```

`calc_apply`

, let's figure out how to use
`calc_apply`

. We'll call `calc_apply`

inside of
`calc_eval`

. `calc_eval`

takes in a calculator expression. A
calculator expression can either be a number or a linked list that can
contain multiple items where the first item is a operator and rest are
more calculator expressions. Note that this linked list Remember that we have two different types of calculator expressions -
numbers and function calls. For which type would we want to use
`calc_apply`

? Don't forget that `calc_apply`

takes in a linked list
of *numbers*, but the subexpressions of `exp`

can themselves be linked
lists - how could you turn these subexpressions into numbers? (Hint:
Trust the recursion! Domain and range will be helpful here.) You may
find the `map_linked_list`

function helpful. It is defined near the
end of the starter file.

The other type of expressions is just plain numbers. What should we do for these expressions?

```
def calc_eval(exp):
"""Evaluates a calculator expression.
>>> calc_eval(5)
5
>>> calc_eval(link('+', link(12, link(3, empty))))
15
>>> subexp1 = link('*', link(3, link(4, empty)))
>>> subexp2 = link('-', link(12, link(9, empty)))
>>> exp = link('+', link(subexp1, link(subexp2, empty)))
>>> print_linked_list(exp)
< '+' < '*' 3 4 > < '-' 12 9 > >
>>> calc_eval(exp)
15
"""
"*** YOUR CODE HERE ***"
```