This assignment again involves using fannExplorer, which can be accessed with a web browser at
http://ilinux1.eecs.berkeley.edu:2718/fannExplorer.html.
In this assignment, we will experiment with a standard backpropagation
example, the auto-encoder. (Yes, it has issues, and we apologize.)
If you are running FANN at home, you can get the necessary data files here.
The basic idea of this assignment is very simple: get fannExplorer to
produce output that is as close as possible to its input. As before, we
will restrict ourselves to binary strings. The catch is that the network will
have a hidden layer that is significantly smaller than the input and output
layers.
The basic case will have 4 binary input units, 2 hidden units and 4 output
units. This is called the 4-2-4 encoder. Only one input unit will be
"on" (set to 1) at a time. We want the network to learn weights that
will cause the corresponding output unit to turn on. That is, the network
should learn to produce output that is exactly the same as its input, with only
1 neuron on; but it must encode this using only the 2 neurons in the hidden
layer.
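Concretely, the training patterns are just the four one-hot vectors, with each pattern serving as its own target. A quick sketch in numpy (this shows the patterns only, not the actual file format of 424.train):

```python
import numpy as np

# The 4-2-4 training set: each pattern has exactly one unit "on",
# and the desired output is identical to the input.
patterns = np.eye(4)             # rows (1,0,0,0), (0,1,0,0), ...
inputs, targets = patterns, patterns.copy()

for x, t in zip(inputs, targets):
    print(x, "->", t)
```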
We can envision the network as doing a kind of compact encoding. For example,
if some language had only 4 phonemes, the auditory system could get by with just
2 fibers for transmitting one phoneme at a time. More realistically, a phoneme
from a language with 64 phonemes could be transmitted by just 6 nerve fibers
from one brain region to another. One could imagine a complex neural structure
that computed which phoneme was most likely at each moment and another complex
structure that made use of phonemes to make up words. Since each phoneme has
different uses, we would need a separate unit for each one at the receiving end,
but the transmission could be done more compactly using the idea above.
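The arithmetic behind that claim is just binary coding: n fibers can carry 2**n distinct patterns, so 6 suffice for 64 phonemes. A quick check (the phoneme-to-code assignment here is, of course, hypothetical):

```python
from math import log2

# Assign each of 64 hypothetical phonemes a distinct 6-bit code.
codes = [format(i, "06b") for i in range(64)]

assert len(set(codes)) == 64            # every phoneme gets a unique code
assert all(len(c) == 6 for c in codes)  # each code needs only 6 fibers
assert log2(64) == 6                    # 6 fibers are exactly enough
```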
The assignment is to experiment with how well backpropagation learning can do
at finding weights that will produce a good encoding.
The first part of this assignment is to analyze how the system does on the 4-2-4
encoder problem.
In fannExplorer,
create a network with 4 inputs, 4 outputs, and 1 hidden layer with 2 nodes in it;
load 424.train and 424.test.
You may want to look at the data (the training and test sets are identical) to
see what the desired patterns look like. (Remember that the output should match
the input, and only one neuron should be on.)
Now experiment
with different values for the learning rate and momentum, using Incremental Training.
Don't forget
random initialization. Again, please don't train for longer than necessary
so that the server does not get bogged down.
- Explain how the parameter values affected the final error rate and number
of trials needed.
- Notice that separate runs may have wildly different outcomes, since the
  initial weights are random. How might this relate to behavioral differences
  among individual animals?
- How *might* the network's hidden layer encode the input? (hint: what do you know about binary representations?)
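For reference, the learning rate scales the gradient step and the momentum term adds a fraction of the previous weight change. A sketch of the standard incremental update rule (parameter values here are illustrative; FANN's internals may differ in detail):

```python
# Standard incremental weight update with a momentum term:
#   delta = -lr * gradient + momentum * previous_delta
def update(w, grad, prev_delta, lr=0.7, momentum=0.9):
    delta = -lr * grad + momentum * prev_delta
    return w + delta, delta

# One step from w = 0.5 with gradient 0.2 and no previous step:
w, d = update(0.5, grad=0.2, prev_delta=0.0)
# w is now 0.36 and d is -0.14
```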
Now expand the task to solve the 8-3-8 encoder problem.
(Create a network with 8 inputs, 8 outputs, and 1 hidden layer with 3 nodes in it;
load 838.train and 838.test.)
- Experiment with and report on the effects of different learning rate and momentum values.
- Do you notice any differences in moving to a larger task?
Finally, try fannExplorer on the 9-3-9 encoder problem.
(Create a network with 9 inputs, 9 outputs, and 1 hidden layer with 3 nodes in it;
load 939.train and 939.test.)
- Experiment with and report on the effects of different learning rate and momentum values.
- Why can it learn this despite the fact that 3 bits is not enough to encode
9 values?
- What pattern of activation accomplishes the mapping from
(1,0,0,0,0,0,0,0,0) to (1,0,0,0,0,0,0,0,0) in your network?
- Is this the only pattern that could give you the right result, in this
network?
- Would your answer to the previous question be different if the hidden
layer was much larger?
fannExplorer uses the backpropagation algorithm that we studied in class. The goal of
this problem is to produce a hand simulation of one step in the learning of
the 4-2-4 encoder.
- Pick a time about halfway through the training and copy down the (approximate)
  values for the weights learned up to that time.
- Compute the output values for the next teaching input and compare these
with the appropriate training values.
- Illustrate the calculations that fannExplorer uses to update the weights for one iteration
  with one test input. Don't forget the momentum factor: show where in the calculations
  it would be applied, though it will contribute 0 here, because there is no previous
  learning step to provide momentum.
The assigned reading has further discussion as well as numerical values that may
be of use.
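As a worked reference for the hand simulation, here is one forward pass and one weight update for a 4-2-4 net, using the standard incremental backprop rule from class. The weights are made-up stand-ins for the ones you copy from fannExplorer, biases are omitted for brevity, and this is a sketch of the textbook algorithm rather than FANN's exact internals:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# Made-up "mid-training" weights for a 4-2-4 net (biases omitted).
# W1 maps input -> hidden (2x4); W2 maps hidden -> output (4x2).
rng = np.random.default_rng(0)
W1 = rng.normal(scale=0.5, size=(2, 4))
W2 = rng.normal(scale=0.5, size=(4, 2))

x = np.array([1.0, 0.0, 0.0, 0.0])   # the next teaching input
t = x.copy()                         # for an encoder, target == input

# Forward pass.
h = sigmoid(W1 @ x)
y = sigmoid(W2 @ h)
err_before = 0.5 * np.sum((y - t) ** 2)

# Backward pass (squared error; the sigmoid derivative is a * (1 - a)).
delta_out = (y - t) * y * (1 - y)               # output-layer deltas
delta_hid = (W2.T @ delta_out) * h * (1 - h)    # hidden-layer deltas

lr, momentum = 0.7, 0.9
prev_dW1 = np.zeros_like(W1)   # no earlier step in this hand simulation,
prev_dW2 = np.zeros_like(W2)   # so the momentum term contributes 0 here

dW2 = -lr * np.outer(delta_out, h) + momentum * prev_dW2
dW1 = -lr * np.outer(delta_hid, x) + momentum * prev_dW1
W2 = W2 + dW2
W1 = W1 + dW1

# The error on this pattern should shrink after the step.
err_after = 0.5 * np.sum((sigmoid(W2 @ sigmoid(W1 @ x)) - t) ** 2)
```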
- In Problem 4, you calculated a lot of numbers. What, if anything, do these
numbers mean in neural terms?
- What aspects of real neural systems are modeled by PDP systems?
- What aspects of PDP systems do NOT correspond to neural systems?
- Imagine a PDP system with the following structure. You have 7 input units
  (i1, i2, i3, i4, i5, i6, i7), an arbitrarily large hidden layer, and two
  output units (o1, o2). Two input units will be "on" (set to 1) at
  any given time.
If the two active input nodes are next to each other, o1 should fire.
Otherwise, o2 should fire. Thus, if i3 and i4 are active, then o1 should be
active. If i3 and i5 are active, o2 should be active.
Can the system learn this?
- Assume the system can, and has, learned this. Now imagine moving the input
nodes around. The connections and weights are all still the same, but now
the spatial ordering of the input units is (i4, i2, i7, i1, i3, i6, i5).
So, the weights between numbered units are the same (o1 still has the same
input weight from i1, and so forth), but the desired outputs have been
altered (for instance, we want o1 to fire when i4 and i2 are active,
because they are now adjacent).
Does the response of the output units to the input change at all?
- Does the response of the outputs tell you anything about the spatial
ordering of the input nodes?
- How does this compare to human neural systems?