In this assignment, you will implement, in Java, a system that learns
right-regular grammars using a simplified version of the model merging algorithm.
This involves several steps; make sure to read through the entire assignment
before you begin coding.
--> THE ASSIGNMENT: [PDF] <-- (right... open up the PDF file...)
The starter code: [.tar.gz] [.zip]
If you have a lot of trouble with the starter classes, post to the newsgroup.
We think the provided classes and methods are necessary, but perhaps not
sufficient, for the assignment, so let us know if you find something missing.
- Please check back to this page for possible updates to the assignment.
- As noted in the assignment, you should show your algorithm working on some
sample data.
- The starter code includes some test data and an example run (with alpha = 5) on that data. If your solution is correct, its output on that data should be identical to a6-training-data.alpha5.txt.
- You should feel free to make up your own data set also. If you want, I can run the solution on that data so you can compare.
- Here is some help based on earlier feedback, including links to the javadoc and a description of the tree data structure you will want to use.
- As usual, if you encounter problems in getting your algorithm to work, you
should try to identify where the problems arise. If you are having problems
debugging something, it's better to turn in a run with debugging print
statements that expose/describe the problem than to turn in something that
either can't run or runs without working.
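
One way to check that a run matches the example output is to compare the two files line by line and report the first mismatch. The sketch below is only an illustration, not part of the starter code: the class name OutputDiff, the method firstDifference, and the file name my-output.txt are all made up for this example, and it assumes you have saved your program's output to a text file.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.List;

// Hypothetical helper (not in the starter code): compares two text files
// line by line and reports where they first differ.
class OutputDiff {

    /** Returns the 1-based number of the first differing line, or -1 if the lists are identical. */
    static int firstDifference(List<String> expected, List<String> actual) {
        int shared = Math.min(expected.size(), actual.size());
        for (int i = 0; i < shared; i++) {
            if (!expected.get(i).equals(actual.get(i))) {
                return i + 1;
            }
        }
        // Same length means identical; otherwise the first extra line is the difference.
        return expected.size() == actual.size() ? -1 : shared + 1;
    }

    /** Returns the given 1-based line, or a marker if the list is shorter than that. */
    static String lineOrEnd(List<String> lines, int lineNumber) {
        return lineNumber <= lines.size() ? lines.get(lineNumber - 1) : "<end of file>";
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 2) {
            System.err.println("usage: java OutputDiff <expected-file> <actual-file>");
            System.exit(2);
        }
        List<String> expected = Files.readAllLines(Paths.get(args[0]));
        List<String> actual = Files.readAllLines(Paths.get(args[1]));

        int line = firstDifference(expected, actual);
        if (line == -1) {
            System.out.println("Outputs are identical.");
        } else {
            System.out.println("First difference at line " + line);
            System.out.println("  expected: " + lineOrEnd(expected, line));
            System.out.println("  actual:   " + lineOrEnd(actual, line));
        }
    }
}
```

A run might look like `java OutputDiff a6-training-data.alpha5.txt my-output.txt`. The comparison is exact, so differences in whitespace within a line count as mismatches.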
You should submit your assignment using the submit program available on the
EECS instructional system. Instructions for doing this can be found here.
Be sure to include all files mentioned in the assignment (Java files,
answers.txt, test.txt).
Reader: O2 (Bailey et al.)
Note again: If you cheat, we will catch you. Don't cheat!