Lab 8: Generics and Iterators

A. Intro

Download the code for Lab 8 and create a new Eclipse project out of it.

Learning Goals

This is a two-part lab. The first part introduces you to generic classes, interfaces, and methods. These appear throughout the java.util class library. You will be using them for your own code as well. We intend that the lab activities demystify the concept of a generic class and method and reduce the likelihood of puzzling syntax errors.

Remaining activities address the use and implementation of iterators, objects that control iteration through the items in a collection via three methods: a constructor, hasNext(are there more items yet to be iterated?, and next (return the next item in the iteration). Iterators are provided for every collection class in java.util. More generally, they provide a mechanism for unifying the operations of processing an array in several formats, handling input, and traversing more complicated structures.

B. Generics

Intro to Generics

We noticed in recent lab activities that Java use both an item's static type and its dynamic type to determine how a given reference should be interpreted. The static type of a variable constrains types values it may contain, and is determined by what the compiler sees. The dynamic type is the type of the variable's value. Here's an example.

    ModNCounter ctr1 = new ModNCounter(4);
    Counter ctr2 = new ModCounter(8);

ctr1 has static type ModNCounter and ctr2 has static type Counter because that was the information given to the compiler. Both variables have dynamic type ModNCounter because that's the type of the contents of ctr1 and ctr2.

Suppose now that we wish to store a collection of counters, some with type Counter and some with type ModNCounter. Suppose further that we wish to store the counters in an ArrayList.

The ArrayList class can store a collection of any type of Java Object. Since any class inherits from Object, you might use it to store a collection of Strings, a collection of Integers, a collection of Dogs, or even a collection of other ArrayLists, and of course a collection of counters. Because you could potentially receive any type of Object when you access an ArrayList element, you have to cast the result of get if you don't want it to have static type Object.

ArrayList collection = new ArrayList();
collection.add(new ModNCounter(5));
ModNCounter ctr = (ModNCounter) collection.get(0); // Without the cast it's a compile time error!

This is rather messy.

A generic class lets us clean up this mess. The online Java Tutorial (http://docs.oracle.com/javase/tutorial/java/generics/why.html) says why:

"In a nutshell, generics enable types to be parameters when defining classes and methods. Much like the more familiar formal parameters used in method declarations, type parameters provide a way for you to re-use the same code with different inputs. The difference is that the inputs to formal parameters are values, while the inputs to type parameters are types."

Here's the rewritten code.

ArrayList<ModNCounter> collection = new ArrayList<ModNCounter();
collection.add(new ModNCounter(5));
ModNCounter ctr = collection.get(0);

Writing a Generic Class

Hopefully using a generic class is fairly intuitive, since you've already had some practice with ArrayList. Let's see how to define our own generic class. Suppose we have a class Box that can store one object of any type, using generics. It's a contrived example — the Box class isn't very useful — but hopefully it gives you an example of genrics syntax. Here's the way we imagine using the Box class:

Box<String> b1 = new Box<String>();
b1.putItem("abc");

Box<Point> b2 = new Box<Point>();
b2.putItem(new Point(1, 2));

String s = b1.getItem();
Point p = b2.getItem();

Notice that we can put and get String objects from a Box<String> object, or we could put and get Point objects from a Box<Point> object.

Now we'll show you how you could write this class:

public class Box <T> {
    // instance variable declaration
    private T myItem;

    // put an item in the box
    public void putItem(T item) {
        myItem = item ;
    {

    // get the item from the box
    public T getItem() {
        return myItem ;
    }
}

What is this T that we see appearing all over the class definition? It looks like its being used as a type. You can see it is the type of myItem, it is the return type of getItem, and it is the type of the input to putItem. But you've never heard of a Java class called T, right?

That's because T is a generic type. You can think of a generic type as a variable type. When you create a box, you can "pass in" other types to take the place of T. When we write Box b1 = new Box<String>();, String will take the place of T everywhere. So you can imagine that for b1, its myItem has type String, its getItem returns type String, and its putItem takes an input of type String. For Box b2 = new Box<Point>();, everything is in terms of the Point class intead.

T can be thought of as just a variable, so you can give it whatever name you want. You could have written E instead of T, or even something like ItemType. It is convention to name generic types with single uppercase letters, however.

To declare that a class is generic and that it will be using something like T as a generic type, all you do is include the <T> in the class header.

Self-test: Using Generics

Let's test that you understand how to use generics. For each of the following snippets of code, answer whether or not they will compile. Imagine that these lines of code exist in some file outside the Box class.

Box<String> b1 = new Box<Point>();
Yes
Incorrect.
No
Correct! Although the box class can take any type, each box object is committed to using only one.
Check Solution
Box<String> b2 = new Box<String>();
b2.putItem(new Point(1, 2));
Yes
Incorrect.
No
Correct! Since b2 is declared to take only String s, it cannot take a Point as input.
Check Solution
Box<T> b3 = new Box<String>();
Yes
Incorrect.
No
Correct! T is a variable type that exists only inside of the Box class. Since this code is not inside of the Box class, it makes no sense to reference T , because there is no way to tell what type T should be.
Check Solution
Box<Point> b4 = new Box<Point>();
b4.putItem(new TracedPoint(3, 4));
Yes
Correct! b4 is meant to take only Point objects. A TracedPoint is a Point , so it is allowed to be passed in.
No
Incorrect.
Check Solution
Box<Point> b5 = new Box<TracedPoint>();
Yes
Incorrect.
No
Correct! It's true that the code Point p = new TracedPoint(3, 4); does work, because Point is a superclass of TracedPoint . But, Box<Point> is not considered a superclass of Box<TracedPoint> . Think about why: If you could do this line, then the compiler would think you should be allowed to put any Point object into b5 , when really you can only put a TracedPoint .
Check Solution

Generic Interfaces

In addition to generic classes, we can also make generic interfaces. The syntax for writing a generic interface is exactly the same as that for writing a generic class, so if you understood the former, you should be able to transfer that knowledge to the latter.

Exercise: Make IntSequence Generic

Create a new class called Sequence that is just like IntSequence, but generic: it can now hold any kind of item, not just integers. Check that you can make a sequence of Strings or Integers and that the methods work in both cases.

One caveat. You can't create an array with a generic dynamic type. Only its static type can be generic. So myValues should have a generic static type, but a dynamic type of Object[]. This is just an odd special case with arrays; don't worry too much about it. Other classes don't work like that. Notice that as long as the static type is generic, you still get all the benefits of generics.

C. Intro to Iterators

Iterators Defined

In CS 61BL, we're going to encounter a variety of different data structures, or ways to organize data. You've already seen the array, which we used to represent a sequence, an ordered list of elements. You also saw how an array could be used to represent a set, an unordered collection of elements. Later, in addition to sets and sequences, we'll see more complicated data structures such as trees, maps, priority queues, and graphs.

A common operation on a data structure is to process every item in it. For the sequence, it was relatively straightforward how to do this. We wrote a for loop:

for (int k = 0; k < length of the array; k++) {
    process the kth array element...
}

However, if we use a different data structure, a for loop like the one above may not make sense. For example, what would it mean to access the kth item of a set, or of a tree? We need a more abstract notion of processing every item in a data structure, something that allows us to check every item regardless of how they're organized. To do that, we're going to use something called an iterator.

Here's an example of how we might want to use an iterator on some arbitrary data structure:

Iterator iter = initIterator();
while (iter.hasNext()){
    value = iter.next();
    process value in some way...
}

The idea is that we process an item, then ask for the next one, then repeat until there is no next item. next doesn't necessarily return items in a specific order, like our old for loop, because the notion of linear order may not even apply to our data structure. The only thing next guarantees is that it will return every item exactly once, somehow.

We have abstracted away a lot behind the next method — if our data structure is very complicated, next may have to do a lot of work to figure out what item should be returned next.

Iterator Methods

Let's look at the iterator methods we used in the last step in more detail.

In order for next to know what value to return next, there must be some place-keeping information that keeps track of how far along the iteration is. The iterator may need an extra instance variable or two for this.

Exercise: Using an Iterator

Let's practice by using iterators. All java collections have one. Check out the class PointUtils, and fill it out. Complete the methods using the given iterator — Don't use a for loop!

It's okay if you're unfamiliar with the details of the List class or the LinkedList class that appear in the code. The point of an iterator is that you don't have to understand how these classes work in order for you to use the iterator. You just have to be familiar with the iterator methods.

An Iterator for IntSequence

Now that you've had some practice using an iterator, let's see how we can write our own. Here's an example of inserting the iterator methods into the IntSequence class:

public class IntSequence {
    int nextIndexToReturn;
    int[] myValues; // values in the sequence
    int myCount; // number of values in the sequence

    public void initIterator() {
        nextIndexToReturn = 0;
    }

    public int next() {
        int valToReturn = myValues[nextIndexToReturn];
        nextIndexToReturn++;
        return valToReturn;
    }

    public boolean hasNext() {
        return nextIndexToReturn < myCount;
    }
        ...
}

We could build an IntSequence as follows.

public static void main(String [] args) {
    IntSequence seq = new IntSequence(10);
    seq.add(5);
    seq.add(3);
    // seq now contains the sequence 5, 3;
    // thus values[0] is 5, values[1] is 3,
    // and myCount is 2.
        ...
}

Then we can use an iterator.

seq.initIterator();
int m;
m = seq.next();
// m now contains 5 and nextIndexToReturn contains 1
m = seq.next();
// m now contains 3 and nextIndexToReturn contains 2

At this point, another call to next would be invalid since nextIndexToReturn is past the end of the sequence values.

The code maintains an important invariant: prior to any call to next, nextIndexToReturn contains the index of the next value in the sequence to return.

The hasNext Method

The hasNext method, as noted earlier, warns if the iteration runs out of elements. Most of the time we expect people to call hasNext() between calls to next(), but we can't count on it. The code on the previous page, which did not include a call to hasNext, is perfectly legal and works fine provided the sequence contains at least two elements.

An important feature of the code is that hasNext doesn't change any state. It only examines existing state (by comparing the progress of the iteration to the number of sequence elements). hasNext can then be called twice in a row and nothing should change, or it could be called not at all and the iteration should still work.

For example, if hasNext() returns true, we should be able to do this:

seq.initIterator();
seq.hasNext();
seq.hasNext();
seq.hasNext();
int m = seq.next();

The fact that we called hasNext() multiple times shouldn't change anything.

Self-test: Iterators

Consider the following code, intended to be included in the IntSequence class. Note: this code is intentionally gross.

private int iteratorIndex = 0;
private boolean done = false;

public void initIterator() {
}

public boolean hasNext() {
    if (done) {
        return false;
    }
    if (iteratorIndex == myCount-1) {
        done = true;
    }
    return true;
}

public int next() {
    int rtn = myValues[iteratorIndex];
    iteratorIndex++;
    return rtn;
}

Suppose also that the IntSequence main method begins as follows.

public static void main (String [ ] args) {
    IntSequence seq = new IntSequence(1);
    seq.add(5);
    _______ ;

This self-test involves identifying the program segments that, when substituted for the blank line in the main method, perform correctly. (Note that "correct" behavior may involve a crash.)

seq.initIterator();
boolean b;
int n;
b = seq.hasNext();
b = seq.hasNext();

Performs correctly?

Yes
Incorrect. Examine hasNext more carefully
No
Correct! hasNext changes the state. It will do something different the second time, which shouldn't happen
Check Solution
seq.initIterator();
if (seq.hasNext()) {
    System.out.println(seq.next());
}
if (seq.hasNext()) {
    System.out.println(seq.next());
}

Works correctly?

Yes
Correct! Only the first if statement evaluates to true, as it should
No.
Incorrect.
Check Solution
seq.initIterator();
System.out.println(seq.next());
System.out.println(seq.next());

Works correctly?

Yes
Correct! It crashes, as it should.
No
Incorrect.
Check Solution
seq.initIterator();
if (seq.hasNext()) {
    System.out.println(seq.next());
}
seq.initIterator();
if (seq.hasNext()) {
    System.out.println(seq.next());
}
Yes
Incorrect. Does initIterator really reset the iteration, as it should?
No
Correct! Notice initIterator does not reset the value of done
Check Solution
seq.initIterator();
boolean b;
int n;
n = seq.next();
b = seq.hasNext();

Works correctly?

Yes
Incorrect. Trace through the calls to next and hasNext more carefully
No
Correct! hasNext will return true when it shouldn't
Check Solution

Discussion: Iterator Invariants

Link to the discussion

Consider the following iterator methods, slightly different from those we just encountered.

public void initIterator() {
    index = -1;
}

public int next() {
    index++;
    return values[index];
}

public boolean hasNext() {
    return (index + 1) < myCount;
}
  1. What's the invariant relation that's true between calls to next?
  2. In general, most experienced programmers prefer the organization introduced first over this organization. What might explain this preference?

Exercise: An iterator for Set

First, if you haven't yet switched which partner is coding this lab, please switch now.

In a lab activity from an earlier lab, we worked with the Set class that represented a set of positive integers using a boolean array.

Write an iterator for Set by adding the following methods to the Set class:

public void initIterator()
public boolean hasNext()
public int next()

FAQ:

Q: What if someone calls next() when hasNext() returns false?

A: This violates the contract of your code and you can crash or do whatever you want. We won't be testing this.

Q: Will initIterator() always be called before calling hasNext() or next() for the first time?

A: Yes.

Q: Will hasNext() always be called before next()?

A: No.

D. Using Java's Iterator and Iterable

So far in this lab, you've been adding iterator methods directly to the classes you were iterating over. For example, you added hasNext and next directly as methods of the Set class. This is a simplistic approach, and we introduced it first just for pedagogical reasons. The more common approach is to use a separate iterator object that has those methods. But first, some motivation:

A More Abstract for Loop

You may have felt that writing while loops that repeatedly call next and hasNext is a bit of a hassle. After all, iterating over all items in a list in Python is comparatively much easier. Here's an example of printing every item in a list in Python:

l = ["a", "b", "c", "d"]
for item in l:
    print(item)

It turns out there is similar syntax in Java! Here's an example of it in use:

String[] l = {"a", "b", "c", "d"};
for (String item : l){
    System.out.println(item);
}

This is a different kind of for loop than you have seen before. The for loop here is doing the exact same thing as in Python. It iterates through every item in l, temporarily storing each one in a variable called item (with static type String) on each iteration.

You may remember that you can use Python's for loop to iterate over many different kinds of objects. For example, this code does the same as above,

for letter in "abcd"
    print(letter)

Somehow, Python knows what the right thing to do is whether it is iterating over a list or a string. It can iterate over many other things as well.

Java's for loop can also be used on data structures other than arrays. Here's an example on an ArrayList.

ArrayList<String> l = new ArrayList<String>();
l.add("a");
l.add("b");
l.add("c");
l.add("d");

for (String item: l){
    System.out.println(item);
}

Notice how this for loop abstracts away the underlying data structure. It works the same with arrays and ArrayLists! With a little set up, it turns out we'll be able to use this for loop on lots of other data structures too.

Iterable

We saw in the previous slide that Java has a for loop that can be used to process items in different kinds of data structures. But how does Java know how to iterate over different data structures, including complicated ones? It turns out that, in order to use Java's for loop to iterate over an object of a certain class, that class must implement the interface Iterable.

To implement Iterable, a class must define the following method:

public Iterator iterator();

This is the same method you saw being used in the PointUtils exercise. The iterator() method intializes an iteration by returning an object of type Iterator, which can then be used to iterate. What is Iterator? It turns out this is another interface!

(Note: Iterable and Iterator are generic interfaces. In the above example, our iteration returned items of static type String; this means the ArrayList<String> actually implemented Iterable<String> and defined the method public Iterator<String> iterator();).

Here are the methods required for an object to implement the interface Iterator<E>. Do they look familiar? These are just the methods you used in the PointUtils exercise.

public boolean hasNext();
public E next();

If you're using Java 7 rather than Java 8, you will also need the following method (it is optional in Java 8):

void remove(); //this one might not be familiar. More details below

When you use Java's special for loop, it makes use of these methods automatically behind the scenes, so you don't have to go through the trouble of calling them yourself.

(By the way, the method remove() deletes from the collection the last element returned by the iterator. You'll notice using the clean for loop syntax won't ever use the method remove(). The method is just there for you to call manually if you need it. When writing your own code, you may feel like you don't need the remove method either. That's okay: many Java classes just throw an exception in case someone calls the method, claiming that the class doesn't actually implement the method.)

Exercise: Make Sequence Implement Iterable

Modify the Sequence<T> class that you've been working with so that it implements Iterable<T>. Hint: You're going to have to define a new class that implements Iterator<T>. We recommend defining this as a nested private class, since it is only useful within Sequence. Nested classes can access the instance variables of the class they're nested in.

If you're using Java 8, feel free to ignore the remove method. If you're using Java 7, just have the remove method throw an UnsupportedOperationException.

E. Conclusion

Summary

In our experience, the biggest problem encountered by users of generics are syntax errors. We hope that lab activities alerted you to the problems.

Various aspects of iterators cause trouble, mainly with the maintenance of the internal state of the iteration. The constructor initializes the state, typically to the position of the first item to be enumerated. Subsequent processing maintains a data invariant that relates the position of the most recently returned element to the number of items in the collection being enumerated. And a detail: the hasNext method should never change the state of the iterator.

Submission

Submit the following files as lab08:

Readings

From now on, reading will be from the data structures textbook rather than Head First Java. This reading is optional. We recommend reading to prepare yourself for the next lab, if you feel like you have a difficult time digesting everything in lab.

The reading for the next lab is Data Structures and Algorithms chapter 4.