Lab 5: Collections, Iterators, and Meta-Iterators

Before We Begin

First, try your hand at this short quiz on polymorphism.

A. Introduction

In this lab we will be giving you to a small taste of the Java Standard Library. In particular, we will be looking at Collections, Iterators, and Iterables. Collections encompass many of the data structures that you will be working with this semester. Iterators are objects that control iteration through items in a collections via three methods: a constructor, hasNext(are there more items yet to be iterated over?), and next (return the next item in the iteration). Iterators are provided for every collection class in java.util. More generally, they provide a mechanism for unifying the operations of processing an array in several formats, handling input, and traversing more complicated structures.

By the end of this lab, you should be comfortable with searching through the Java Standard Library documentation and with using Iterators and Iterables and Collections.

B. Collections

Collection classes represent collections of data. Most of the data structures we will study the rest of this class are used to represent collections. At the most general level, pretty much anything you use to store muliple items at once is going to fufill the requirements to be a collection. The most commonly used collections are sets and lists, but there are many others. The hierarchy of all collection classes is organized using interfaces!

The Collection Interface

At the root of the organization is the Collection interface. This interface specifies methods like isEmpty(), add(Object o), and many others. All collection classes implement Collection in one way or another. One important type of collection is the List, which you have seen before. Implementations of Lists include the ArrayList and the LinkedList. Another important collection is the Set, which you will implement shortly.

The Set Interface

SetMarbles

A set is a group of items with no duplicates. The Set interface does not require ordering (e.g. HashSet), but sets can be ordered (TreeSet, ConcurrentSkipListSet). Sets support (at least) the following operations:

1. Initialization (with a constructor)
2. Adding an element to the set if it's not already there
3. Removing an element from the set
4. Checking whether a given item is in the set
5. Checking whether the set is empty

For example, a set of nonnegative integers may be represented as a boolean array with element k being true if k is in the set and false otherwise, but there are many other ways to implement a Set.

It's important to realize that a Set, although a sub-type of Collection, is itself an interface, and is defined like this:

    public interface Set extends Collection {
        ...
    }

Set is an interface because there are many types of sets that are implemented in a variety of ways. All of these Set subtypes fulfill the requirement of being a Set.

The List Interface

SequenceBunnies

Lists differ from sets in that elements have positions within in the List, duplicates are allowed, and all lists are ordered. Thus, they support the following operations:

1. Initialization (with a constructor)
2. Adding an element at a given position or at the end of the list
3. Removing an element at a given position or at the end of the list
4. Checking whether a given item is in the set
5. Identifying the value at a given position in the list
6. Checking whether the list is empty

Implementation of a list normally involves storing the elements of the list explicitly. One might use an array whose 0th element is the first list item, and so forth. You have implemented Lists through the IntList and IntDList. Similarly to a Set, a List is an interface that might be defined like this:

    public interface List extends Collection {
        ...
    }

Examples of Collections

Let's take a closer look at the declaration, instantiation, and utilization of Collections, which as we saw in lab3 is a bit syntactically strange. For example, to instantiate a list of ints, add the number 5, then get it back, we might write:

List<Integer> s;
s = new ArrayList<Integer>();
s.add(5);
int x = s.get(0); /* gets the five back */

There are three funny things about the syntax above:

C. Building a Set

Create a new file SetDemo.java. Do not write tests. It should:

  1. Declare a Set that holds Strings.
  2. Instantiate this set by using the constructor of a specific Set implementation. For a list, go to the official Java documentation and look for "All Known Implementing Classes:". You'll need to Google/Bing/Altavista your way to the docs.
  3. Add the strings "papa", "bear", "mama", "bear", "baby", "bear".
  4. Print the set. You should observe there are only four items.

You will need the Java documentation to find a concrete set implementation to use, as well as the name of the method for putting things into a set. Do not ask a lab TA unless you are really stuck. The point here is to self-sufficiently be able to find this sort of information. Don't forget that all objects have a toString() method (not necessarily a useful one, but everything has one at least).

D. Introduction to Iterators

As we saw in lab 3, an iterator is an object whose purpose is to traverse the values in a collection of objects (here we mean the abstract notion of a collection, not necessarily a capital C Collection as defined in Java), yielding them one at a time to the iterator's client. The standard Java library defines a standard interface that iterators can implement:

package java.util;

public interface Iterator<Value> {
    /** Returns true iff this iterator has another value to deliver. */
    boolean hasNext();

    /** Returns the next value from this iterator, and advances to
     *  the next.
     *  Precondition: hasNext() must be true.
     */
     Value next();

     /** Remove the last value delivered by next() from the
      *  underlying collection of values being iterated over.
      *  Optional method.  May be called at most once per call to
      *  to next().
      */
      default void remove();
}

While in principle a collection could itself implement the iterator interface, this would result in very strange code. Instead, collections that wish to support iteration typically provide an iterator() method that returns an appropriate Iterator.

For example, if L is a List<String>, you can write

for (Iterator<String> i = L.iterator(); i.hasNext();) {
    String value = i.next();
    System.out.print(value + " ");
}

This code would print out each item in the list, one at a time. An alternate way of writing this code is as follows:

Iterator<String> i = L.iterator();
while (i.hasNext()) {
    String value = i.next();
    System.out.print(value + " ");
}

In the code above, the List class provides an iterator method that returns an object of type Iterator<String>. But what is an Iterator exactly? Well, it's simply an instance of a class that implements the Iterator interface defined above, i.e. implements Iterator, and concomitantly provides a next(), hasNext(), and remove() method.

In fact, we can go look at the source code of the List interface, but of course since List is an interface, this method is abstract and we won't learn much.

Java provides an AbstractList class that provides default implementations for many List methods (much like we saw in Lecture 11 with the Reader interface). Looking at the source of AbstractList.iterator(), we see that this method just returns a new object of type Itr, where Itr is some nested non-static class (non-static nested classes are called inner classes; see Lecture 12).

If we look at the source code for the Itr inner class we can see its definition. I don't expect this code to make a great deal of sense to you, but you should observe that it implements the iterator interface and does have a next(), hasNext(), and remove() method as required by the interface.

D. The Iterable interface

Iterating through interfaces using next and hasNext would be tedious to have to write every time, and thus Java has introduced a special syntax for this Java programming idiom known as the Iterable interface. If L has a type that implements Iterable<String> (as List<String> does), then the loops above may be written

for (String value: L) {
   System.out.print(value + " ");
}

This is called the enhanced for loop. It is very similar to the for loops that you saw in Python

The Iterable interface is really simple. A class that implements it must simply provide an iterator method that returns an Iterator.

package java.lang;
public interface Iterable<Value> {
    /** Returns an iterator over my values. */
    Iterator<Value> iterator();
}

In Summary

An Iterator is an object that provides a next(), hasNext(), and remove() method. Iterators are intended to provide a way to step through some sort of data structure one item at a time in some fashion. If you have access to an Iterator to some collection, you can use the next() and hasNext() methods to go through it step by step.

Using next() and hasNext() is tedious, so if a class declares itself as implementing Iterable, you can use the : operator to iterate instead. To follow the Iterable contract, the class that implements Iterable must provides a method that provides an Iterator that allows iteration through that object.

Food for thought: Why doesn't the List class just implement Iterator itself?

E. Programming Task: EveryOtherWord.java

Fill in the everyOtherWord method of EveryOtherWord such that it performs the task described in its comment and passes the test in main(). Do not change anything in the file other than the everyOtherWord method, except that you may import the Iterator class (though this is not necessary). You shouldn't need any helper methods.

F. Programming Task: Meta-Iteration through Filters

As we saw in EveryOtherWord, we don't always want to iterate in exactly the way that the available iterator tells us. One approach is to write special purpose methods (e.g. everyOtherWord) that generate a new Iterable that contain all the items we require. This is fine as long as we have the memory and time to spend on building this iterable, which may potentially be quite large and slow to construct.

An alternate approach is to use an intermediary class that will feed us the items we want, one at a time. These are often referred to as filters.

Open up the Filter abstract class in the utils package. You'll see that it acts as a middleman between client code and some Iterator. Specifically, if you look at the hasNext() method, you'll see that it keeps calling next() on the underlying Iterator until the keep() method returns true.

Concrete implementations of the Filter class will implement a keep() method that results in the desired behavior. For example, a filter designed to only return every other item would return true for every other call to keep.

Start by completing the implementations of TrivialFilter and AlternatingFilter so that they pass the TestFilter test. You might find the FilterClient class to be useful for manually testing your implementations.

As an optional but highly recommended exercise, complete an implementation of MonotonicFilter that passes the TestFilter test. There is also an everyFourth method in FilterClient that you should feel free to implement.

G. Submission

You should have completed

Optionally you should have completed MonotonicFilter.java and FilterClient.java. You can submit as usual by committing, tagging, and pushing.

Remember that you have a midterm coming up! We suggest starting to take a look at past exams, which can be found under the resources tab of the course website.