Lab 13: Hashing Part 2

A. Intro

Download the code for Lab 13 and create a new Eclipse project out of it.

Learning Goals

Today's exercises almost all focus on hashing of non-strings, to counter student misconceptions that hashing is used only with strings or even integers. In particular, there are exercises that involve memoization, a technique for avoiding repeated computation that, incidentally, will be important for the course final project.

B. Hashing Different Objects

Hashing Tic-tac-toe Boards

The code in TTThashTest.java stores all possible Tic-tac-toe boards into a java.util.HashSet object. Add methods to the TTTboard class that test two ways to hash Tic-tac-toe boards.

Toggle Solution

Hint about hashing a board as a nine-digit numeral:

This tic-tac-toe board ...

| X | X | O |
|---|---|---|
| X | O |   |
|---|---|---|
|   |   | O |

could be represented as ...

| 1 | 1 | 2 |
|---|---|---|
| 1 | 2 | 0 |
|---|---|---|
| 0 | 0 | 2 |

or as:

112120002 (base 3).

which is equal to:

1x3^8+ 1x3^7 + 2x3^6 + 1x3^5 + 2x3^4 + 0x3^3 + 0x3^2 + 0x3^1 + 2x3^0

which is equal to:

2x3^0 + 0x3^1 + 0x3^2 + 0x3^2 + 0x3^3 + 2x3^4 + 1x3^5  + 2x3^6 + 1x3^7 + 1x3^8

which is equal to:

2 + 3x(0 + 3x(0 + 3x(0 + 3x(2 + 3x(1 + 3x(2 + 3x(1 + 3x(1)))))))) 

Discussion: Two Hash Functions

Link to the discussion

Which hash function performed better? Suggest why.

Properties of Good Hash Codes (non-exhaustive list):

Discussion: Random Hash Code

Link to the discussion

If I made my hashCode function return a random int equally spread over all possible int s, the hash codes would be quick to compute and spread over the ints as perfectly as possible. Why is this not a good hash code?

Discussion: Linked List Hash Code

Link to the discussion

Imagine a simple LinkedList class that has ListNode objects inside. The LinkedList class has the following hashCode method:

public int hashCode() {
    if (myHead == null) {
        return 17;
    } else {
        return myHead.hashCode();
    }
}

This class essentially relies on the hash codes of its nodes. So, consider the following three options for the hashCode method to put inside ListNode . For each, do you think it is a good hashCode method? Evalutate it according to the criteria given above.

In the ListNode class:

1.

    public int hashCode() {
        int returnValue = myItem.hashCode();
        if (myNext != null) {
            returnValue += myNext.myItem.hashCode();
        }
        return returnValue;
     }

2.

    public int hashCode() {
        int returnValue = myItem.hashCode();
        if (myNext != null) {
            returnValue += myNext.hashCode();
        } 
        return returnValue;
    } 

3.

public int hashCode() {
    int returnValue = super.hashCode(); // the machine address of this ListNode
    if (myNext != null) {
        returnValue += myNext.hashCode();
    }      
    return returnValue;
}   

C. Hashing for Memoization

Implementing Memoization Using Hashing

In the next few steps you'll implement memoization for fibonacci and a puzzle where you try to measure water using inconveniently sized jugs. This method can also be used to solve games:

Take for example a game-playing program that makes moves. It will search the tree of moves to find winning configurations:

Search for a winning move.
    If the configuration has been seen before, 
        return its value.
    If it's an immediately losing configuration,
        store the configuration and "loss" in the table
        and return "loss".
    If it's an immediately winning configuration,
        store the configuration and "win" in the table
        and return "win".
    Otherwise, for each possible move, do the following:
        make the move;
        make a recursive call on Search;
        store the configuration and the value in the table;
        if the value is "win", return it;
        otherwise unmake the move and try another.
    If all moves have been analyzed and none are winners,
        return "loss".

Exercise: Implement Memoization for Fibonacci

Modify the Fibonacci.java code to use memoization. Memoization allows redundant recursive calls to be eliminated by saving the result of each new call and then looking up the result instead of recomputing it.

We recommend that you use a HashMap even though you could keep track of previously computed values with an array.

Exercise: The Jug Problem

First, switch which partner in the lab is coding, if you haven't recently.

Two CS 61B students have a jug filled with eight liters of water that they wish to divide evenly between them. They have two empty jugs with capacities of five and three liters respectively. These jugs are unmarked and no other measuring device is available. How can they accomplish the equal division?

jugs

The files JugSolver.java JugContents.java contains parts of a program to solve this problem. It recursively searches to find a sequence of steps (each step pours one jug into another) that produces the desired amount.

JugSolverTest.java contains some tests. The program works when zero or one pours are required, but the program has a flaw, namely that it doesn't keep track of configurations that it has seen before; infinite recursion results.

Without changing any of the existing code (except perhaps to change the value of the DEBUGGING variable), add statements that fix the program. In particular, you are to use a java.util.HashSet object to keep track of configurations already seen. Hint: Most of the time necessary on this problem will be reading and understanding the code.

--------

Suppose that 31 distinct integers are arranged in ascending order in an array named values. Suppose also that the following code is used in a method to locate an integer named key in the array.

int leftIndex = 0;
int rightIndex = values.length() - 1;
while (leftIndex <= rightIndex) {
    int middleIndex = (leftIndex+rightIndex)/2;
    if (values[middleIndex] == key) {
        return true;
    } else if (values[middleIndex] < key) {
        leftIndex = middleIndex+1;
    } else {
        rightIndex = middleIndex-1;
    }
}
return false;

The next two questions ask you to count the comparisons needed to find all 31 integers, and then to do the same thing with the values stored in a hash table.

Counting Binary Search Comparisons

The code in BinarySearch.java contains the code in the previous step, along with code to look up each element of the array. Modify the code to count the number of times a comparison is made between values[middleIndex] and key. (Correctness check: The total number of comparisons to find all the array elements is 227.)

Self-test: Counting hash table comparisons

Now suppose that the 31 integers are stored in a chained hash table with k chains, in which the hash function distributes the integers to chains as evenly as possible. We want to find every value in the hash table and count the number of comparisons necessary. What is the smallest value of k for which the total number of comparisons to find all 31 values in the hash table is less than the answer you computed in the previous part?

(Recall that 1 + 2 + ... + n = n(n+1)/2.)

Toggle Solution

Answer: 3

If you didn't get this right, try guess and check. Calculate how many elements are in each chain for 1 bucket (and the number of comparisons) then for 2 buckets (and the number of comparisons) etc.

D. Conclusion

Submission

Please turn in the following as lab13:

Readings

The reading over the next two labs is DSA chapter 7 (Tree Structures).