In this assignment, you will explore the capabilities of some very simple neural networks
using the tlearn software package.
See our Computing Resources page for information
about downloading the Tlearn software for use at home, or the
Instructional Computing Tlearn help
page for how to use Tlearn in the Soda clusters. The assignment description below assumes that you
will use a version of Tlearn with a GUI—i.e., xtlearn or the Windows or Mac version—though
everything can be done with "vanilla" Tlearn alone. Note, however, that we recommend against the
Windows version: Tlearn running under Windows XP after the Service Pack 2 upgrade has been known
to produce mysterious bugs.
It's fine to run xtlearn remotely from a Windows machine, i.e. through a
secure shell, but you need an
x-windows client in order to
view the graphics.
(Here are some
directions for configuring SSH to run with the x-windows client Exceed.)
Refer to R7 in your reader (Plunkett and Elman: Ch. 1 and Appendix B) for general instructions on how to
run Tlearn, or to the Tlearn manual.
The first task is the logical AND function:
AND(0,0)=AND(0,1)=AND(1,0)=0; AND(1,1)=1. This function is also shown in the table below:
Input 1 | Input 2 | Output
--------|---------|-------
   0    |    0    |   0
   0    |    1    |   0
   1    |    0    |   0
   1    |    1    |   1
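Tlearn reads the input patterns from and.data and the corresponding target outputs from and.teach. Assuming the standard Tlearn file layout described in Appendix B (a `distributed` header, a pattern count, then one pattern per line), and.data should look roughly like:

```
distributed
4
0 0
0 1
1 0
1 1
```

and and.teach, with one teacher value per pattern in the same order:

```
distributed
4
0
0
0
1
```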
- Start the tlearn application. Inside the (local) Tlearn folder,
open tlearn.
- Get the project files. Either download the files here: and.data,
and.teach, and.cf; or copy them from
cs182/public_html/sp06/a2-data/and.cf, etc. They should live in your local Tlearn directory.
- Open the AND project. If using the Windows or Mac version:
From the "Network" menu, use "New Project" to start a project called "and".
The files and.cf, and.data, and and.teach should open automatically as part of the project. If using
xtlearn, use the "set project" command from the "File" menu.
(For somewhat obscure reasons, you may have to do this twice.)
- Examine the network architecture and activation displays.
Do this by selecting the appropriate items in the "Display" menu.
- Set the network to compute the AND function shown above. Find an
appropriate setting of the weights and the activation output function such
that the 4 possible AND inputs produce the correct AND
outputs. You can make all the necessary changes by modifying the and.cf
file:
- Keep in mind that the bias node can act as a threshold for the output
units; assign its weight accordingly.
- Fix the weights at the values you think will work best. To set a fixed
weight (one that will not change), use the fixed notation for the
connection and specify the desired weight as the minimum and maximum.
For instance, the following line fixes the specified weight at 1:
1 from i2 = 1 & 1 fixed
With smaller weights, you may need to use minimum and
maximum weights that differ slightly:
1 from i2 = 0.6 & 0.7 fixed
- Note that this is also how you fix the weight from the bias node
(i.e., any connections from node 0).
- Try using both the sigmoid (default) and the linear output functions for
propagating activation. To use the sigmoid activation function, you can
simply remove any lines in the SPECIAL: section that specify linear nodes.
- Initialize the network. Set the number of training sweeps (in
"Training Options") to 1, and then run the "Train the
Network" command. (Tlearn requires a dummy training run for no good
reason.) You may have to do this twice, especially if you make changes
to the configuration file.
- Check the network output. Run the "Verify the Network has
learned" command to evaluate your solution for the AND function.
You can also check this on a per-pattern basis by cycling through the
patterns in the Node Activation display.
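Putting these pieces together, a hand-designed and.cf could look like the sketch below. The NODES:/CONNECTIONS:/SPECIAL: layout follows the standard Tlearn format (see Appendix B); the particular fixed weights (5.0 from each input, -7.5 from the bias node) are just one workable assumption, not the only solution.

```
NODES:
nodes = 1
inputs = 2
outputs = 1
output node is 1
CONNECTIONS:
groups = 0
1 from i1 = 5.0 & 5.0 fixed
1 from i2 = 5.0 & 5.0 fixed
1 from 0 = -7.5 & -7.5 fixed
SPECIAL:
selected = 1
```

With the sigmoid output function, these values put the net input well below zero for the three 0-patterns and well above zero for (1,1).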
To hand in:
- Show the network architecture (weights, output function)
you designed and its output on the 4 AND input patterns. A screenshot of the network architecture with weights indicated is recommended, though a .cf file and an accompanying written explanation of the architecture will be accepted if a screenshot isn't possible for you.
- Show how the network
works, by illustrating the calculations that produce the output for each
pattern.
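The per-pattern calculations can be checked outside Tlearn. The sketch below assumes input-to-output weights of 5.0 and a bias weight of -7.5 (one workable hand-set choice, not the only one) and a sigmoid output function:

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

# Hypothetical hand-set weights; any values that push the net input
# well below 0 for the three 0-patterns and well above 0 for (1,1) work.
w1, w2 = 5.0, 5.0    # weights from input 1 and input 2
w0 = -7.5            # weight from the bias node (node 0)

for x1, x2 in [(0, 0), (0, 1), (1, 0), (1, 1)]:
    net = w1 * x1 + w2 * x2 + w0
    out = sigmoid(net)
    print(f"AND({x1},{x2}): net = {net:+.1f}, output = {out:.3f}")
```

With these weights the three 0-patterns come out below 0.1 and the (1,1) pattern above 0.9, which is close enough to count as computing AND.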
Now try to make tlearn learn the AND
function. Instead of defining the network by hand, you will set the system with
random initial weights that will be adjusted during training.
Although you probably don't know yet how exactly the program learns
these weights, you will by the end of next week: the algorithm is called
"back-propagation," and a short, non-mathematical description can be found
here.
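Although the details wait until next week, the flavor of what the learning procedure does can be seen in a minimal sketch: with no hidden units, back-propagation reduces to repeatedly nudging each weight in the direction that shrinks the output error on the current pattern. This is an illustrative re-implementation under assumed settings (learning rate 0.5, 5000 sweeps), not Tlearn's actual code:

```python
import math
import random

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

random.seed(0)  # arbitrary seed, for reproducibility
patterns = [((0, 0), 0), ((0, 1), 0), ((1, 0), 0), ((1, 1), 1)]

# Random initial weights in [-1, 1], like weight_limit = 1.0 in the .cf file.
w1, w2, w0 = (random.uniform(-1, 1) for _ in range(3))
rate = 0.5  # learning rate

for sweep in range(5000):  # one pattern presentation per sweep, as in Tlearn
    (x1, x2), target = patterns[sweep % 4]
    out = sigmoid(w1 * x1 + w2 * x2 + w0)
    # Error signal, scaled by the sigmoid's slope at the current output:
    delta = (target - out) * out * (1 - out)
    w1 += rate * delta * x1
    w2 += rate * delta * x2
    w0 += rate * delta

for (x1, x2), target in patterns:
    out = sigmoid(w1 * x1 + w2 * x2 + w0)
    print(f"AND({x1},{x2}) -> {out:.2f} (target {target})")
```

After training, all four patterns should round to the correct AND output, with the bias weight ending up negative and the two input weights positive.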
2a. Learning without hidden nodes
- Reconfigure the files for learning.
- "Unfix" the weights. Remove any fixed settings and
ranges from the and.cf file. You can set the weight range from -1 to 1
by adding weight_limit = 1.0 in the SPECIAL: section of the and.cf file.
This means that unless otherwise specified, weights will be randomly
initialized between -1 and 1.
- Set the number of training sweeps to 5000.
- Set the output function to sigmoid. You can do this by removing
the line assigning a linear function, if one is present in the and.cf
file.
- Train the network. When you are done configuring the files, open the
"Error" display and run the "Train the Network" command.
- Experiment with training options. tlearn allows you to change parameters
like the learning rate, momentum, etc. (in "Training Options").
See if you can find settings that will enable the network to learn AND
essentially every time (within a fixed number of training runs). Keep a
record of the parameters you try and how they affect your results. Note what
kind of weights are learned by the network.
A few notes:
- You will need to choose a consistent criterion for what it means to learn in
this network. You might choose something based on the error determined by tlearn
(see below); this would mean that the error falls below a particular threshold
each time. Alternatively, you could require that each pattern produce the correct
output to within some fixed tolerance.
- For whatever error criterion you choose, you should require that the network
satisfy it within a fixed number of training sweeps (like 10,000). Thus, failing
to learn means not meeting that criterion within that many sweeps.
- It may be helpful to examine the files produced by the system during training.
Files with the .wts extension, for instance, contain the weights (see Appendix B
for a description), and the .err file contains a record of error (if you select
the option on training or testing that lets you log error). As explained in the
Tlearn User manual, this error is actually the RMS or Root Mean Squared error,
which is the square root of the average (over all patterns) of the squared
error. Note that if you square this number and then multiply the result by the
number of patterns, you get something closer to the Sum of Squared Error measure
that's been mentioned in class.
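As a concrete check on that arithmetic (the network outputs here are made up for illustration):

```python
import math

targets = [0, 0, 0, 1]               # the four AND teacher values
outputs = [0.05, 0.10, 0.08, 0.90]   # hypothetical network outputs

squared = [(t - o) ** 2 for t, o in zip(targets, outputs)]
rms = math.sqrt(sum(squared) / len(squared))  # what the .err file logs
sse = rms ** 2 * len(squared)                 # recovers the sum-squared error
print(f"RMS = {rms:.4f}, SSE = {sse:.4f}")    # RMS = 0.0850, SSE = 0.0289
```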
To hand in:
- Briefly describe the learning criterion you used.
- Turn in a record of the parameters that you tried as well
as an account of what happened. Include an example solution.
- For what range of settings does the network reliably learn the AND
function?
- For what range of settings does the network learn about 75% of the time?
(That is, for about 75% of your training runs with new initial weights.)
(Hint: you may want to look for correlations between the randomly
initialized weights you get and the resulting learning behavior.)
2b. Learning with hidden nodes
- Reconfigure the network architecture to add two hidden nodes. Try
learning the same AND function starting with a net that has two intermediate
units between the input and output, but no direct input/output links. You
will need to edit the and.cf file to add the nodes, making sure to
add/change the relevant connections. Specifically, both inputs should be
connected to both hidden nodes, and both hidden nodes should be connected to
the output node. The bias node should be connected to both hidden nodes and
the output node.
- Experiment with adjusting the same parameters.
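As a sketch (again assuming the standard .cf layout), a version of and.cf with two hidden units might look like the following, with nodes 1-2 as the hidden units and node 3 as the output:

```
NODES:
nodes = 3
inputs = 2
outputs = 1
output node is 3
CONNECTIONS:
groups = 0
1-2 from i1-i2
3 from 1-2
1-3 from 0
SPECIAL:
selected = 1-3
weight_limit = 1.0
```

Note that there is no `3 from i1-i2` line: the inputs reach the output only through the hidden layer, as the exercise requires.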
To hand in:
- Once again, turn in a record of the parameters that you
tried, a general account of what happened, and an example solution.
- How much does this new network help?
The second task is the logical SAME function: SAME(0,0)=SAME(1,1)=1;
SAME(0,1)=SAME(1,0)=0. This function is also shown in the table below:
Input 1 | Input 2 | Output
--------|---------|-------
   0    |    0    |   1
   1    |    0    |   0
   0    |    1    |   0
   1    |    1    |   1
- Using the AND project as an example, create the necessary training and
configuration files for the SAME project (or simply edit the .data and .teach
files to perform the SAME function -- the architecture will be the same as
for AND).
- Try to repeat the experiments involving (a) designed weights as in Part 1
and (b) learning as in Part 2, without and with two extra hidden units.
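If you edit the existing files rather than creating new ones, only the teacher values change: the .data file can stay identical to and.data, while the .teach file (assuming the same file format as before, with targets listed in the same pattern order as the .data file) would become:

```
distributed
4
1
0
0
1
```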
To hand in:
- Turn in a record of the parameters that you tried, a
general account of what happened, and an example solution.
- How do the results differ in this case? Explain why.