Description of dbindel's MJ interpreter, Sep 2005
---

I wrote this MiniJava interpreter this summer, before the revised
versions of the assignments were written.  Consequently, I may have
used slightly different internal representations from the
representations you will be asked to use; for example, my lexical
analyzer only produces line numbers rather than line and column
numbers (when was the last time you looked at a column number in an
error message?).  Also, I didn't realize at the time that MJ comments
could be nested.  There will also be times when my lexer may be
generous about errors (e.g. allowing invalid characters); those errors
will be caught at parse time.  I *think* my ASTs will look very much
like the ASTs requested in the appropriate assignment, though, and the
interpreter output should certainly match the MiniJava spec.

To get started quickly, load the interpreter FASL file for your platform,
e.g.

  (load "mj.fasl")

The FASL files for different platforms may be found in subdirectories
under the build tree.  Now you can run a MJ program foo.java like this:

  (run-java "foo.java")

The "test" subdirectory contains examples taken from Appel's web site.
For example, you can run the Factorial.java program,

  (run-java "Factorial.java")

if you have a burning curiosity about the value of 10!.  The MiniJava
interpreter consists of tokenization, parsing, and interpretation phases;
the public interfaces to each phase are described below.


A note on FASL files
---

Many Lisps have fast-load file formats which are simple enough that they
can be concatenated.  For example, the mj.fasl file consists of the
concatenation of util.fasl, make-ll1.fasl, simple-lex.fasl,
simple-parse.fasl, and simple-interp.fasl.  Another way to load all the
FASL files would be one-by-one (in the correct order) using a loader file:

  (load "mj-dsb.lisp")

However, I would recommend using the all-in-one option.


Tokenizer interface (simple-lex):
---

I represent the tokenizer as a Common Lisp struct, which looks like

(defstruct token-stream
  (line nil)        ; List of characters in this line
  (line-number 0)   ; Current line number
  (stream nil)      ; Stream value
  (lookahead nil))  ; Lookahead token

The tokenizer processes input incrementally, which has the great
advantage that you can enter your program line-by-line through stdin,
and find out as you go whether you've made an error.  At the end of
file, I read the symbol eof.

The interface functions for the lexer are:

(test-tokenize-file fname) --
  Tokenizes the file named by the string fname, and returns a list of
token representations.

(open-token-stream fname) --
  Returns a newly-initialized token-stream, attached to an input stream
for the file named by fname.

(close-token-stream ts) --
  Closes an open token string.

(read-token ts) --
  Reads a single token from the input

(lookahead-token ts) --
  Looks ahead one token into the input.

(lookahead-token-type ts) --
  Returns a type identifier for the next token (e.g. num, id, ...)


Parser (simple-parse):
---

The parser takes an MJ program file and produces an abstract syntax tree
with a format similar to the one described in Appel.  The parser requires
the simple-lex lexer and the file make-ll1 (my home-brewed LL(1) parser
generator).

The interface functions for the parser are:

(parse-java fname) --
  Parses the Java file fname and returns an abstract syntax tree.

(print-java fname) --
  Pretty-prints the AST for the named file using Lisp's pprint.

(java-to-ast basename) --
  Pretty-prints the AST for basename.java to basename-AST.lisp, to 
make it easier to look at the parser output and to debug the parser and
the interpreter separately.  The file basename-AST.lisp saves the AST
into a global variable called *ast*.


Interpreter (simple-interp):
---

The interpreter is probably the piece that you care about most; it
interprets an MJ program (oddly enough).  The interpreter requires
simple-parse.

The interface functions for the interpreter are:

(run-java fname) --
  Runs the named MiniJava file.

(mj-run ast) --
  Runs the MiniJava program represented by the given abstract syntax tree.