Scheme Interpreter Extensions

There are many ways to extend our Scheme interpreter. A few suggestions are below, and the Wikipedia page for Scheme is a good place to start to learn more about the language itself.

We recommend that you submit the final version of your project before attempting any of these extensions. We are not responsible if you break the required components of your interpreter and Scheme code!

Variable Arguments

Many Scheme implementations allow procedures to take in a variable number of arguments, which get placed into a list before being bound to a parameter. (Compare to Python, where variable arguments are placed into a tuple.) This is specified by preceding the last parameter with a dot (much like Python's *):

    scm> (define (add . args) (apply + args))
    add
    scm> (add 3 7 2 1)
    13
  

Recall that a dot is used to specifiy the second component of a pair. Thus, formal lists can now be ill-formed.

In order to implement this, you will need to change the following:

Quasiquote and Unquote

Quoting prevents the interpreter from evaluating an expression. Often times, we might want to evaluate part of an expression but not the rest. For example, the following constructs a list containing the symbol a and its value:

    scm> (define a 3)
    a
    scm> `(a , a)
    (a 3)
  

The backquote (`) specifies a quasiquote, which can evaluate parts of an expression. A comma (,) is an unquote, which specifies that the next expression should be evaluated. Finally, a comma followed by an at symbol (,@) is an unquote-splicing, meaning that it evaluates the next expression, which must evaluate to a list, and then splices that list into the result:

    scm> `(a ,@ '(1 2 3) 4)
    (a 1 2 3 4)
  

Quasiquotes can be nested, and an unquoted expression should only be evaluated if it is at the same nesting level as the outermost quasiquote. The nesting level increases by one in each quasiquotation and decreases by one in each unquote/unquote-splicing.

Here are some examples from the Scheme R5RS reference manual:

    scm> `(list  ,(+ 1 2)  4)
    (list 3 4)

    scm> (let ((name 'a)) `(list ,name ',name))
    (list a (quote a))

    scm> `(( foo ,(- 10 3)) ,@(cdr '(c)) . ,(car '(cons)))
    ((foo 7) . cons)

    scm> `(a `(b ,(+ 1 2)  ,(foo ,(+ 1 3)  d)  e)  f)
    (a (quasiquote (b (unquote (+ 1 2)) (unquote (foo 4 d)) e)) f)

    scm> (let ((name1 'x)
               (name2 'y))
           `(a `(b ,,name1  ,',name2 d)  e))
    (a (quasiquote (b (unquote x) (unquote (quote y)) d)) e)
  

The tokenizer already handles quasiquotes and unquotes. However, you will need to modify scheme_read to handle them as well, as you did for normal quoting. Use the strings "quasiquote", "unquote", and "unquote-splicing", respectively.

In addition, the following changes are required:

Macro Definition

Macros allow the language itself to be extended by the user. Simple macros can be provided with the define-macro special form. This must be used like a function definition, and it creates a procedure just like define. However, this procedure has a special evaluation rule: it is applied to its arguments without first evaluating them. Then the result of this application is evaluated.

Here is a simple example:

    scm> (define-macro (when test . branch)
           (list 'if test (cons 'begin branch)))
    when
    scm> (when (< 3 4)
            (print 1)
            (print 2))
    1
    2
    okay
  

The code above defines a macro when that evaluates all the expressions in its body when the predicate expression is true. (You'll need to have implemented variable argument lists for this particular example to work.)

In order to implement define-macro, introduce a new subtype (say Macro) of Procedure or LambdaProcedure to represent them. Add another do_*_form function to handle macro definitions. To avoid duplicating code, you can add an optional parameter to do_define_form that indicates that a macro is being defined, with a default value of False. If this parameter is true, raise an error if function definition shorthand is not used, and create a Macro.

Finally, you'll have to work out how to define your new Macro so that (1) it doesn't evaluate its arguments and (2) the result that it computes is treated as a program and evaluated in the environment of the call.

Mutation

Like Python, Scheme has non-local assignment. In particular, the set! special form takes in a name and another expression and rebinds that name to the value of the expression in the first frame in which the name exists. Unlike Python, this frame can be the local or global frame.

In order to implement set!, add a method to Frame that rebinds a name to a new value in the first frame in which the name is found. An error should be raised if the name does not exist in any frame. You will also need to add a do_set_form function and a case for set! in scheme_eval.

Pairs can be mutated using set-car! and set-cdr! These can be easily implemented as primitive procedures in scheme_primitives.py.

If you are really ambitious, you might try getting set! to work when the target variable is a call-by-name parameter. This should only be allowed when the actual parameter is a simple variable. This takes a bit of doing (one reason that call-by-name didn't catch on in other programming languages after Algol). First, the interpreter will have to look at the prior value of the target variable before assigning to it, which is not otherwise necessary. You'll have to enhance your Macro class so that it supports an assignment operation when the actual parameter is a variable. It's all quite a bit of work, so you might instead simply choose to have assignment to a by-name parameter convert that parameter to an ordinary value, or to cause an error.

Library Code

Many standard Scheme procedures can be implemented in Scheme itself, as library code. Add a mechanism to your interpreter to load up a library file on startup (e.g. scheme_lib.scm). Then provide useful procedures in scheme_lib.scm, such as map, filter, and c*r variants up to four applications of car or cdr.

Error Handling

Currently, when an error occurs while attempting to evaluate an expression, the interpreter only prints out an error message, with no backtrace. This makes it difficult to determine the source of an error.

In order to provide a useful backtrace, start by adding names to primitive procedures and procedures defined using the special define syntax. Use default names, such as [lambda], for procedures with unknown names.

Now write a new function to handle an exception and call it from the first except clause in read_eval_print_loop. A Python exception contains information about every frame between the one that raised the exception and the one that handled it. If e is an exception, then e.__traceback__ is a traceback object that contains this information. A traceback is a recursive list of frames. Read more about traceback, frame, and code objects here.

A Python exception contains information at the Python level, but a user is interested in information at the Scheme level. So you should translate the Python-level information to Scheme-level information by extracting the latter from a frame. You can read the local variables in a frame, and you can obtain its associated code object to get the name of the Python function for that frame.

Some suggestions on what to do with a Python frame:

Here are some sample traces without the tail recursion optimization:

    scm> (define (foo x) (/ 1 x))
    foo
    scm> (define (bar x) (foo x) 3)
    bar
    scm> (define (baz x) (if (= x 0) (bar x) (baz (- x 1))))
    baz
    scm> (foo 0)
    Traceback (most recent call last):
      0     (foo 0)
      1     (/ 1 0)
    Error: division by zero
    scm> (bar 0)
    Traceback (most recent call last):
      0     (bar 0)
      1     (#begin (foo x) 3)
      2     (foo 0)
      3     (/ 1 0)
    Error: division by zero
    scm> (baz 3)
    Traceback (most recent call last):
      0     (baz 3)
      1     (baz 2)
      2     (baz 1)
      3     (baz 0)
      4     (bar 0)
      5     (#begin (foo x) 3)
      6     (foo 0)
      7     (/ 1 0)
    Error: division by zero
  

With the tail recursion optimization:

    scm> (foo 0)
    Traceback (most recent call last):
      0     (foo 0)
      1     (/ 1 0)
    Error: division by zero
    scm> (bar 0)
    Traceback (most recent call last):
      0     (bar 0)
      1     (#begin (foo x) 3)
      2     (foo 0)
      3     (/ 1 0)
    Error: division by zero
    scm> (baz 3)
    Traceback (most recent call last):
      0     (bar 0)
      1     (#begin (foo x) 3)
      2     (foo 0)
      3     (/ 1 0)
    Error: division by zero
  

Further Extensions

Feel free to implement any other features of Scheme that you want. You can read the full reference manual here. Examples include named lets, let*, letrec, do loops, strings, and vectors. (If you really want a challenge, then try to implement call-with-current-continuation, which isn't even handled correctly by STk.) How close can you get to what STk provides?