CompSci-61B Lab 7a
Hash Tables With Linear Probing

Instructor: Prof. Robert Burns

Hash tables are implementations of the key-value dictionary interface whose algorithms for "add", "remove", "getValue", and "contains" are all O(1). This is accomplished by (almost) eliminating iteration to locate an entry. Instead, each entry is assigned a unique location in an array, based on value of its key -- kind of like personal reserved spaces in a car parking lot. The procedure for converting a key value into an array index is called "hashing".

The practical problem of this approach is that there has to be lots of empty space in the array to accommodate the possible range of key values. But there are techniques such as "probing" and "chaining" that help to minimize this problem.

In this assignment you will use the compsci61b.MyDictionary interface to create 2 new classes: compsci61b.testing.MyHashTable and compsci61b.MyHashTableLP. The first is a simple hash table that does not account for "collisions" and is therefore not very useful -- but it allows us to be introduced to the concept of hashing without the complication of collision handling at the same time. The second is a full implementation of hashing, using linear probing to manage collisions. You will also do big oh analyses to determine how full (percentage-wise) a hash table has to be before its approximate O(1) behavior deteriorates appreciably.

Also in this assignment, you will be introduced to the protected keyword in Java, which is an alternative for private. The difference is that otherwise inaccessible class members become directly accessible to any sub-class. For example, is we had used this in MyArrayList, we could have accessed nValues directly in MySortedArrayList instead of using the "size" accessor.

One more Java construct that is introduced for the first time in our lab assignments is a constructor that has a parameter list.

This assignment is worth 40 points, and consists of 4 exercises. Zero points will be awarded for programs with misspelled filenames, or incorrect case. Zero points will be awarded for programs that do not compile or run on the department's UNIX server. Partial points will be awarded for programs that are run but do not conform to all specifications, at the discretion of the grader. Be sure to complete both exercises and post the lab7a.jar before midnight of the due date, because there is a 5-point-per-day lateness penalty for late work. Check the course outline for due dates for lab assignments.



EXERCISE 1: A "Perfect" Hash Table (10 points)
Write MyHashTable.java in package compsci61b.testing, using compsci61b.MyArrayDictionary as a model, with an array for data storage and an int to track size. Extend nothing and implement compsci61b.MyDictionary<K, V> and java.io.Serializable.

Make the otherwise private members protected, except for the ones associated with iterators. There is no need for sub-classes to access the iterators directly, so don't bother to make them protected. This will have no effect on this class, but it will be meaningful in the following exercise.

Make the inner class (Entry) and all of its members public, so that this inner class can be accessed directly by classes that extend its owner class, even though they might be in a diffent package. If you were to make these protected instead, they would be accessible only by extending classes in the same package. This is not a problem with the data, accessor, and mutator methods, so they can be protected.

Rename private static final int INITIAL_CAPACITY to something that refers to default size instead, because there will be no array expansion in this implementation. (But there will be in the next exercise’s implementation.) The array elements are to be references to "Entry" objects, encapsulating key/value pairs. You may have constructors, accessors, and mutators in your Entry class, or not -- it's your choice.

Note that the array size never changes, so it is possible for the data structure to become full. It will also be possible for there to be "holes" in the array. A hole is represented as a null reference to an Entry object -- the array values are all initially null.

Follow these specs for implementing the methods:

In the main test method, create a "compsci61b.MyDictionary" reference to a "compsci61b.testing.MyHashTable" object, and exercise all of its methods. If you use integer keys and string values, you can create some sample student records with numeric IDs as keys and names as values. Or you can use string keys and integer values in a phonebook-style application. Makeup a few records so store and retrieve, etc -- enough to convince yourself that the class works correctly. But note that collisions will not be handled, and that an attempt to add a value after the 1st value may fail. In your testing, predict collisions and see if you class behaves properly -- do not just add values and use you class to tell you if a collision happened. You test your code -- don't let it test you!

For public interface validation, use the same test.java as in lab 2a's exercise 2, but use these statements to create and validate the data structure object:

compsci61b.MyDictionary<Integer, String> a = new compsci61b.testing.MyHashTable<Integer, String>(100);
...
System.out.println("Should be false: " + (a instanceof compsci61b.MyList));
System.out.println("Should be true: " + (a instanceof compsci61b.MyDictionary));
System.out.println("Should be false: " + (a instanceof compsci61b.MyArrayBasedDataStructure));
System.out.println("Should be true: " + (a instanceof compsci61b.testing.MyHashTable));
System.out.println("Should be true: " + (a instanceof java.io.Serializable));
Compile, jar as lab7a.jar, and run, but do not submit -- the program will be further developed in the following exercises.

EXERCISE 2: A Hash Table With Linear Probing (10 points)
Write MyHashTableLP.java in package compsci61b, extending compsci61b.testing.MyHashTable. Change the methods to implement "linear probing". Here are the specs:

For public interface validation, use the same test.java as in lab 2a's exercise 2, but use these statements to create and validate the data structure object:

compsci61b.MyDictionary<Integer, String> a = new compsci61b.MyHashTableLP<Integer, String>(100);
...
System.out.println("Should be false: " + (a instanceof compsci61b.MyList));
System.out.println("Should be true: " + (a instanceof compsci61b.MyDictionary));
System.out.println("Should be false: " + (a instanceof compsci61b.MyArrayBasedDataStructure));
System.out.println("Should be true: " + (a instanceof compsci61b.testing.MyHashTable));
System.out.println("Should be true: " + (a instanceof java.io.Serializable));
Use the same "main" testing that you applied to the hash table from the previous exercise. Compile, jar as lab7a.jar, and run, but do not submit -- the program will be further developed in the following exercises.

EXERCISE 3: Solve The Game Of Life (10 points)
Write compsci61b.testing.GameOfLife.java, applying your compsci61b.MyHashTableLP class to implement a simulation of the classic "Game Of Life". Click here for the solution to the Game Of Life, using a hash table. It's incomplete, so you'll have some work to do here -- just follow the pseudocode and the instruction comments. Click here for an online simulation of the Game Of Life.

Here are some additional rules:

To test your Game Of Life simulation, try these input sequences:

1 1  2 2  3 3  3 4  3 5  -1 -1

1 1  2 2  3 3  4 4  3 4  4 3  5 2  1 4  -1 -1

(If you require the user to enter one value per typed line of input, tell him so!)

Also test your own patterns, comparing to the online simulation referred to above, to make sure your simulation gives the same results.

Compile, jar as lab7a.jar, and run, but do not submit -- the program will be further developed in the following exercise.



EXERCISE 4: Timing Studies For Linear Probing (10 points)
Write compsci61b.testing.LinearProbingTimingTests.java, similar to lab 4's compsci61b.testing.TimingTests.java. Prove that getValue is approximately O(1). You may fill the hash table with randomly-generated Double-Double pairs, because it is easiest to do that. Prompt the user to enter the number of entries for the first iteration (where norm = 1.00), as you have done in past timing tests. It will require repetitions that start and stop the timer and acculumate the elapsed time, because the operation is so fast. In each repetition, generate an different Double key to search for -- it will usually not be found, which is the worst case.

Perform timing tests on each to confirm your determination of the big oh. Use the spec from lab 4's exercises 5-8, with nicely-aligned columns and 2 decimal digits in the last column's values. There may be a slight rising trend in the normalized factor, due to the imperfectness of the big oh for probing.

unsuccessful getValue from a compsci61b.MyHashTableLP, approx O(1)
    n     big oh   elapsed time in millis   elapsed time / big oh  norm
-----  ---------   ----------------------   ---------------------  ----
10000          ?                        ?                       ?  1.00
20000          ?                        ?                       ?     ?
40000          ?                        ?                       ?     ?
80000          ?                        ?                       ?     ?
Compile and run. Create your final version of lab7a.jar, with files from this and all previous labs. Post your lab7a.jar file to your student UNIX account for grading and credit.

[ Home | Contact Prof. Burns ]