Implementing backpropagation:
For this part you will complete a simple neural network with two input nodes and a single output node.
For non-Java instructions, see below.
The provided code already sets up a three-node net for you; your task is to implement
feedforward activation and backpropagation learning. You should make the following assumptions:
- Input and output patterns will be contained in an int[][] (an array of patterns, each of which is an array of integers).
- The output unit uses the sigmoid activation function and should also have a bias.
- The network has some training parameters that you may wish to vary during
experimentation as appropriate. Be sure to include both learningRate and momentum in your weight updates.
- Weights should be updated on a per-pattern basis (i.e., you needn't implement batch learning).
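A minimal sketch of how these assumptions fit together: a sigmoid output unit whose bias is treated as an extra input fixed at 1.0, and a per-pattern weight update that uses both learningRate and momentum. The class, field, and method names below are illustrative only, not the starter code's actual ones.

```java
// Illustrative sketch only; not the starter code's Unit/Net classes.
public class SigmoidSketch {
    // sigmoid(net) = 1 / (1 + e^(-net))
    static double sigmoid(double net) {
        return 1.0 / (1.0 + Math.exp(-net));
    }

    // Weighted sum of inputs, with the bias weight last (inputs[last] = 1.0).
    static double activation(double[] inputs, double[] weights) {
        double net = 0.0;
        for (int i = 0; i < inputs.length; i++) {
            net += inputs[i] * weights[i];
        }
        return sigmoid(net);
    }

    // Per-pattern delta-rule update with momentum:
    //   deltaW(t) = learningRate * errorTerm * input + momentum * deltaW(t-1)
    static void updateWeights(double[] weights, double[] lastDeltas,
                              double[] inputs, double errorTerm,
                              double learningRate, double momentum) {
        for (int i = 0; i < weights.length; i++) {
            double delta = learningRate * errorTerm * inputs[i]
                         + momentum * lastDeltas[i];
            weights[i] += delta;
            lastDeltas[i] = delta;  // remembered for the next momentum term
        }
    }

    public static void main(String[] args) {
        double[] inputs = { 1.0, 1.0, 1.0 };  // two inputs plus the bias
        double[] weights = { 0.5, 0.5, 0.5 };
        // sigmoid(1.5) = 0.8175744761936437
        System.out.println(activation(inputs, weights));
    }
}
```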
- Copy the starter code, in particular Net.java, Unit.java, TesterPart1.java, and runTest.sh.
Take some time to examine the Net, Unit, and TesterPart1 classes these files define. Each contains
a number of methods, most of which are marked as ** to be filled in **.
You are required to implement the methods in Net.java and Unit.java.
The third file, TesterPart1.java, is provided to help you test and debug the project, as well
as to produce the output for the autograder.
Note: all code must work with Java 1.5 (i.e., don't use features introduced in Java 6).
- Study the provided code and understand how the network and data structures are being used.
Net.java already creates and links 4 units in the appropriate fashion: 2 input nodes,
an output node, and a bias node.
When testing how your network learns, the autograder makes only two calls to your code.
The first is Net n = new Net(data, targets); to create the network,
followed by n.train(); to train the network.
The functions setTrainingParameters(...) , logNetwork() , logWeights() , logActivationCalculation() , and logWeightUpdates() may
also be called, but these are already written for you.
Notice that logWeights() and logWeightUpdates() assume that outUnit.inWeights[0] is the weight from inUnit1 to
outUnit, while outUnit.inWeights[2] is the weight from the bias to outUnit.
- Implement the backpropagation algorithm.
There are 8 functions that need to be filled in. If you don't know where to start, try
completing the functions in the same order that the test file checks them:
in Unit.java:
- initialize(): Randomize all incoming weights to values chosen uniformly between -1 and 1 (the values of Net.MIN_WEIGHT and Net.MAX_WEIGHT).
- computeActivation(): Apply the sigmoid function to the weighted sum of the inputs.
- computeError(): Compute the error for the output node.
- computeWeightChange(): Calculate the current weight change.
- updateWeights(): Apply the weight changes for this pattern.
in Net.java:
- feedforward(): Present a pattern and compute the activations for the rest of the net.
- computeError(): Present all patterns to the network and calculate the current error.
- train(): Train the net according to the current training parameters, and output important information as you train:
  - First, output the initial state of the network by calling logNetwork().
  - Then, for each of the first 10 epochs (an epoch is one cycle through the training data), call logActivationCalculation() and logWeightUpdates() for every data point to show the progress of training. With 4 training data points, each of these functions is therefore called 40 times (4 per epoch for the first 10 epochs). Only 10 epochs are required because that is long enough to verify that your program performs the computations correctly.
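The control flow described above can be sketched as follows. This shows only the loop structure and where the logging calls belong; train(), the log methods, and the commented-out steps stand in for the starter code's real methods, and no actual learning happens here.

```java
// Structural sketch of the train() loop only; names are stand-ins for
// the starter code's methods, and the learning steps are comments.
public class TrainLoopSketch {
    static final int NUM_PATTERNS = 4;  // e.g. the four rows of and.data
    static int logCalls = 0;            // counts log calls, for illustration

    static void logActivationCalculation() { logCalls++; }
    static void logWeightUpdates() { logCalls++; }

    static void train(int numEpochs, double errorCriterion) {
        // logNetwork() would be called once here, before any training.
        for (int epoch = 0; epoch < numEpochs; epoch++) {
            for (int p = 0; p < NUM_PATTERNS; p++) {
                // feedforward(pattern p); then compute the error, the
                // weight change, and update the weights for this pattern.
                if (epoch < 10) {
                    logActivationCalculation(); // state for this pattern
                    logWeightUpdates();         // changes for this pattern
                }
            }
            // if (computeError() < errorCriterion) break;  // stop early
        }
    }

    public static void main(String[] args) {
        train(500, 0.1);
        // Each log function fires 4 patterns x 10 epochs = 40 times.
        System.out.println(logCalls); // 40 calls to each, 80 total
    }
}
```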
This output will be used to grade your assignment, so be careful to get it right!
If you need to see the equations (from the slides in class and section), they are highlighted in the following set of slides: (pdf) (powerpoint)
- Test and comment your program. Make sure to comment your code clearly. If
your code is wrong but your comments are specific enough (don't write a book),
at least we will know your intent.
There are two easy ways to test your program after you have compiled all of your code ( % javac -g *.java ).
- Call TesterPart1 with
% java TesterPart1 to run a few very elementary tests on your various functions.
Although they can't guarantee your functions are correct, they will at least make sure you are not making a silly error.
If you fail one of the earlier tests, it will also cause you to fail later tests.
- Train your network with various parameters, making sure that changing the momentum,
learning rate, etc... has the expected effect. Make the call:
% java TesterPart1 training_file ne lr mom ec
where:
training_file is the training file containing the function to be learned (either "and.data", "or.data", "xor.data", or "same.data") -
of course, xor and same won't be learnable by this network yet
ne is the number of epochs
lr is the learning rate
mom is the momentum
ec is the error criterion
for example (with arbitrary numbers for ne, lr, mom, ec):
% java TesterPart1 and.data 500 .1 .5 .1
What to submit for Part 1
You should submit your assignment using the submit program
available on the EECS instructional system; instructions for doing this can
be found here. This assignment is a3-1. Be sure to
include:
- Your completed code, including
Net.java and Unit.java .
Remember to submit any other code files you have created. Do
not submit the .class files.
-
The output (output.tgz) resulting from running the shell script
runTest.sh . This will run your program on and.data, or.data, xor.data, and same.data. It will test the network on each of these with 1000 epochs, 3 different learning rates (0.02, 0.1, and 0.5), and 3 different momentum settings (0.0, 0.5, and 0.9), with an error criterion of 0.1.
You call this script by running bash runTest.sh , and it will produce the output file output.tgz .
If you are not using Java, you must run the tests yourself and produce a tarball of the resulting output. You should name your output files [train]-[lr]-[mom].out, where [train] is the training set, [lr] is the learning rate, and [mom] is the momentum. Create the tarball by running tar -czf output.tgz *.out .
- Answers to these questions, in a file called
answers.txt .
Make sure this includes the usual information.
- Under what conditions is a network with hidden layers computationally identical to one without any hidden layers?
- How would you change Part 1 of this assignment? was it too difficult? too easy?
- (If relevant) What kinds of problems did you encounter?
- Describe any design problems you may have had.
- If your network doesn't work, where is the problem?
- If you weren't able to get everything working or ran out of time, explain your
approach, what you know, and what you might do differently next time.
Note: If you cheat, we will catch you. Don't cheat!
Non-Java instructions
If you are not using Java, then your program must load the input files and produce output of the right sort. The format is as follows.
Input
Input files will look like this:
DATA_DESCRIPTION
[number of data points] [number of inputs per datum] [number of outputs per datum]
DATA
[input data separated by spaces] ; [output data separated by spaces]
[input data separated by spaces] ; [output data separated by spaces]
...
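The format above can be loaded with a few lines of Java. Here is a rough sketch; the class and method names are invented for illustration, and the starter code's own loader may differ.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.StringReader;

// Illustrative loader for the training-file format; not the starter code.
public class DataFileSketch {
    // Returns { data, targets }, each an int[][] of patterns.
    static int[][][] parse(BufferedReader in) throws IOException {
        in.readLine();                                   // "DATA_DESCRIPTION"
        String[] desc = in.readLine().trim().split("\\s+");
        int n    = Integer.parseInt(desc[0]);            // data points
        int nIn  = Integer.parseInt(desc[1]);            // inputs per datum
        int nOut = Integer.parseInt(desc[2]);            // outputs per datum
        in.readLine();                                   // "DATA"
        int[][] data = new int[n][nIn];
        int[][] targets = new int[n][nOut];
        for (int i = 0; i < n; i++) {
            String[] halves = in.readLine().split(";");  // inputs ; outputs
            String[] ins  = halves[0].trim().split("\\s+");
            String[] outs = halves[1].trim().split("\\s+");
            for (int j = 0; j < nIn; j++)  data[i][j]    = Integer.parseInt(ins[j]);
            for (int j = 0; j < nOut; j++) targets[i][j] = Integer.parseInt(outs[j]);
        }
        return new int[][][] { data, targets };
    }

    public static void main(String[] args) throws IOException {
        String and = "DATA_DESCRIPTION\n4 2 1\nDATA\n"
                   + "0 0 ; 0\n0 1 ; 0\n1 0 ; 0\n1 1 ; 1\n";
        int[][][] dt = parse(new BufferedReader(new StringReader(and)));
        System.out.println(dt[0].length + " patterns; last target = " + dt[1][3][0]);
    }
}
```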
Output
Output files should look like this:
NETWORK
[max number of epochs] [learning rate] [momentum] [error cutoff]
WEIGHTS
[output neuron weights in order, with bias unit last, separated by spaces]
[then, for each datum for the first 10 epochs, you should output the following:]
WEIGHTS
[output neuron weights in order, with bias unit last, separated by spaces]
ACTIVATION
[activation of input neurons, separated by spaces]
[activation of output neuron]
MOMENTUM
[momentum terms for all weights of output neuron, separated by spaces]
WEIGHT CHANGE
[change in weights for all weights of output neuron, separated by spaces]
[...]
for example:
NETWORK
10000 0.1 0.1 0.1
WEIGHTS
0.5 0.5 0.5
WEIGHTS
0.5 0.5 0.5
ACTIVATION
1.0 1.0
0.8175744761936437
MOMENTUM
0.0 0.0 0.0
WEIGHT CHANGE
0.0027208119642790096 0.0027208119642790096 0.0027208119642790096
WEIGHTS
[...]
If you are not using Java, you must run the tests yourself (see above) and produce a tarball of the resulting output. You should name your output files [train]-[lr]-[mom].out, where [train] is the training set, [lr] is the learning rate, and [mom] is the momentum. Create the tarball by running tar -czf output.tgz *.out .
Note: If you do not use Java, you may be asked to demo your program for grading. You also may not receive partial credit if your program does not produce output as it should. I recommend that you use Java!