Description of dbindel's MJ interpreter, Sep 2005 --- I wrote this MiniJava interpreter this summer, before the revised versions of the assignments were written. Consequently, I may have used slightly different internal representations from the representations you will be asked to use; for example, my lexical analyzer only produces line numbers rather than line and column numbers (when was the last time you looked at a column number in an error message?). Also, I didn't realize at the time that MJ comments could be nested. There will also be times when my lexer may be generous about errors (e.g. allowing invalid characters); those errors will be caught at parse time. I *think* my ASTs will look very much like the ASTs requested in the appropriate assignment, though, and the interpreter output should certainly match the MiniJava spec. To get started quickly, load the interpreter FASL file for your platform, e.g. (load "mj.fasl") The FASL files for different platforms may be found in subdirectories under the build tree. Now you can run a MJ program foo.java like this: (run-java "foo.java") The "test" subdirectory contains examples taken from Appel's web site. For example, you can run the Factorial.java program, (run-java "Factorial.java") if you have a burning curiosity about the value of 10!. The MiniJava interpreter consists of tokenization, parsing, and interpretation phases; the public interfaces to each phase are described below. A note on FASL files --- Many Lisps have fast-load file formats which are simple enough that they can be concatenated. For example, the mj.fasl file consists of the concatenation of util.fasl, make-ll1.fasl, simple-lex.fasl, simple-parse.fasl, and simple-interp.fasl. Another way to load all the FASL files would be one-by-one (in the correct order) using a loader file: (load "mj-dsb.lisp") However, I would recommend using the all-in-one option. Tokenizer interface (simple-lex): --- I represent the tokenizer as a Common Lisp struct, which looks like (defstruct token-stream (line nil) ; List of characters in this line (line-number 0) ; Current line number (stream nil) ; Stream value (lookahead nil)) ; Lookahead token The tokenizer processes input incrementally, which has the great advantage that you can enter your program line-by-line through stdin, and find out as you go whether you've made an error. At the end of file, I read the symbol eof. The interface functions for the lexer are: (test-tokenize-file fname) -- Tokenizes the file named by the string fname, and returns a list of token representations. (open-token-stream fname) -- Returns a newly-initialized token-stream, attached to an input stream for the file named by fname. (close-token-stream ts) -- Closes an open token string. (read-token ts) -- Reads a single token from the input (lookahead-token ts) -- Looks ahead one token into the input. (lookahead-token-type ts) -- Returns a type identifier for the next token (e.g. num, id, ...) Parser (simple-parse): --- The parser takes an MJ program file and produces an abstract syntax tree with a format similar to the one described in Appel. The parser requires the simple-lex lexer and the file make-ll1 (my home-brewed LL(1) parser generator). The interface functions for the parser are: (parse-java fname) -- Parses the Java file fname and returns an abstract syntax tree. (print-java fname) -- Pretty-prints the AST for the named file using Lisp's pprint. (java-to-ast basename) -- Pretty-prints the AST for basename.java to basename-AST.lisp, to make it easier to look at the parser output and to debug the parser and the interpreter separately. The file basename-AST.lisp saves the AST into a global variable called *ast*. Interpreter (simple-interp): --- The interpreter is probably the piece that you care about most; it interprets an MJ program (oddly enough). The interpreter requires simple-parse. The interface functions for the interpreter are: (run-java fname) -- Runs the named MiniJava file. (mj-run ast) -- Runs the MiniJava program represented by the given abstract syntax tree.