Lab 5: Collections, Iterators, and Meta-Iterators

Before We Begin

First, try your hand at this short quiz on polymorphism.

A. Introduction

In this lab we will be giving you to a small taste of the Java Standard Library. In particular, we will be looking at Collections, Iterators, and Iterables. Collections encompass many of the data structures that you will be working with this semester. Iterators are objects that control iteration through items in a collection via three methods: a constructor, hasNext (are there more items yet to be iterated over?), and next (return the next item in the iteration). Iterators are provided for every collection class in java.util. More generally, they provide a mechanism for unifying the operations of processing an array in several formats, handling input, and traversing more complicated structures.

By the end of this lab, you should be comfortable with searching through the Java Standard Library documentation and with using Iterators and Iterables and Collections.

B. Collections

Collection classes represent collections of data. Most of the data structures we will study the rest of this class are used to represent collections. At the most general level, pretty much anything you use to store multiple items at once is going to fufill the requirements to be a collection. The most commonly used collections are sets and lists, but there are many others. The hierarchy of all collection classes is organized using interfaces! As you will see soon, what that means is that sets and lists in Java are represented as interfaces that extend the Collection interface.

The Collection Interface

At the root of the organization is the Collection interface. This interface specifies methods like isEmpty(), add(Object o), and many others. All collection classes implement Collection in one way or another. One important type of collection is the List, which you have seen before. Implementations of Lists include the ArrayList and the LinkedList. Another important collection is the Set, which you will implement shortly.

The Set Interface

SetMarbles

A set is a group of items with no duplicates. The Set interface does not require ordering (e.g. HashSet), but sets can be ordered (TreeSet, ConcurrentSkipListSet). Sets support (at least) the following operations:

1. Initialization (with a constructor)
2. Adding an element to the set if it's not already there
3. Removing an element from the set
4. Checking whether a given item is in the set
5. Checking whether the set is empty

While a class implementing the Set interface must be able to support the above operations, the actual implementation and manipulation of the data is up to the class.

Exercise: how could you represent a set of nonnegative integers? (hint: use a boolean array!)

It's important to realize that a Set, although a sub-type of Collection, is itself an interface, and is defined like this:

    public interface Set extends Collection {
        ...
    }

Set is an interface because there are many types of sets that are implemented in a variety of ways. All of these Set subtypes fulfill the requirement of being a Set.

The List Interface

SequenceBunnies

Lists differ from sets in that elements have positions within in the List, duplicates are allowed, and all lists are ordered. Thus, they support the following operations:

1. Initialization (with a constructor)
2. Adding an element at a given position or at the end of the list
3. Removing an element at a given position or at the end of the list
4. Checking whether a given item is in the list
5. Identifying the value at a given position in the list
6. Checking whether the list is empty

Implementation of a list normally involves storing the elements of the list explicitly. One might use an array whose 0th element is the first list item, and so forth. You have implemented Lists through the IntList and IntDList. Similarly to a Set, a List is an interface that might be defined like this:

    public interface List extends Collection {
        ...
    }

Examples of Collections

Let's take a closer look at the declaration, instantiation, and utilization of Collections, which as we saw in lab3 is a bit syntactically strange. For example, to instantiate a list of ints, add the number 5, then get it back, we might write:

List<Integer> s;
s = new ArrayList<Integer>();
s.add(5);
int x = s.get(0); /* gets the five back */

There are three funny things about the syntax above:

As a final observation, note that we declared s to be a variable that points to a List object, however the actual object we created was specifically an ArrayList. Recall that List refers to an interface, and so it is not actually possible to create a List object. Instead, we had to choose a concrete class that actually implemented List, in this example the Java ArrayList class. Recall that a concrete class is a non-abstract class that implements all the methods it potentially inherits from any abstract superclasses or interfaces.

C. Building a Set

Create a new file SetDemo.java. Do not write tests. It should:

  1. Declare a Set that holds Strings.
  2. Instantiate this set by using the constructor of a specific Set implementation. For a list, go to the official Java documentation and look for "All Known Implementing Classes:". You'll need to Google/Bing/Altavista your way to the docs.
  3. Add the strings "papa", "bear", "mama", "bear", "baby", "bear".
  4. Print the set. You should observe there are only four items.

You should do all this within a main method in SetDemo.java. You will need the Java documentation to find a concrete set implementation to use, as well as the name of the method for putting things into a set. Do not ask a lab TA unless you are really stuck. The point here is to self-sufficiently be able to find this sort of information. Don't forget that all objects have a toString() method (not necessarily a useful one, but everything has one at least).

D. Introduction to Iterators

As we saw in lab 3, an iterator is an object whose purpose is to traverse the values in a collection of objects (here we mean the abstract notion of a collection, not necessarily a capital C Collection as defined in Java), yielding them one at a time to the iterator's client. The standard Java library defines a standard interface that iterators can implement:

package java.util;

public interface Iterator<Value> {
    /** Returns true iff this iterator has another value to deliver. */
    boolean hasNext();

    /** Returns the next value from this iterator, and advances to
     *  the next.
     *  Precondition: hasNext() must be true.
     */
     Value next();

     /** Remove the last value delivered by next() from the
      *  underlying collection of values being iterated over.
      *  Optional method.  May be called at most once per call to
      *  to next().
      */
      default void remove();
}

While in principle a collection could itself implement the iterator interface, this would result in very strange code. This is because the collection itself is not the iterator! Instead, collections that wish to support iteration typically provide an iterator() method that returns an appropriate Iterator (this is a part of the Iterable interface that's described more in detail below). Consequently, the collection will now have an iterator that is accessible by calling iterator().

For example, if L is a List<String>, you can write

for (Iterator<String> i = L.iterator(); i.hasNext();) {
    String value = i.next();
    System.out.print(value + " ");
}

This code would print out each item in the list, one at a time. An alternate way of writing this code is as follows:

Iterator<String> i = L.iterator();
while (i.hasNext()) {
    String value = i.next();
    System.out.print(value + " ");
}

In the code above, the List class provides an iterator method that returns an object of type Iterator<String>. How do you define an Iterator exactly? In short, implement the Iterator interface defined above. For the object you wish to define as an iterator, declare implements Iterator in its class definition, and provide a next(), a hasNext(), and a remove() method. Let's take a look at the documentation for the Iterator interface here and note some key details about the three required methods.

Finally, let's see this in action. Java abstracts types such as LinkedList and ArrayList using the List interface. You can view the source code of the List interface here if interested!

Java then provides an AbstractList class that provides default implementations for many List methods (much like what we saw in Lecture 11 with the Reader interface). Looking at the source of AbstractList.iterator(), we see that the iterator() method here returns a new object of type Itr, where Itr is a nested non-static class (nested non-static classes are called inner classes; see Lecture 13).

Finally, take a look at the source code for the Itr inner class. This will show you exactly how the iterator for things like LinkedList and ArrayList works. Don't worry too much about the exact implementation details, do observe that it implements the Iterator interface and thus has a next(), hasNext(), and remove() method as required by the interface! If you want to dig even deeper you can consider how Itr keeps track of the current element that calling next() will return.

E. The Iterable interface

Iterating through interfaces using next and hasNext would be tedious to write every time, and thus Java has introduced a special syntax for this Java idiom known as the Iterable interface. If L has a type that implements Iterable<String> (as List<String> does), then the loops above may be written

for (String value: L) {
   System.out.print(value + " ");
}

This is called the enhanced for loop. It is very similar to the for loops that you saw in Python

The Iterable interface is really simple. A class that implements it must simply provide an iterator method that returns an Iterator.

package java.lang;
public interface Iterable<Value> {
    /** Returns an iterator over my values. */
    Iterator<Value> iterator();
}

An Iterator for FixedSizeList

Here's another example of how Iterable and Iterator are used in practice. Suppose we had a class FixedSizeList that represents a list that stores its values in an integer array. Remember that we ask the FixedSizeList for an Iterator instance through the iterator() method. This then returns a FixedSizeListIterator, which is an inner class we've defined to iterate through our FixedSizeList.

import java.util.Iterator;

public class FixedSizeList implements List<Integer>, Iterable {

    // instance variables
    protected int[] values;   // list elements
    int count;                // number of array cells used by list

    // private Iterator class
    private class FixedSizeListIterator implements Iterator<Integer> {

        int nextIndexToReturn;    // index of next item to return by iterator

        public Integer next() {
            int valToReturn = values[nextIndexToReturn];
            nextIndexToReturn++;
            return valToReturn;
        }

        public boolean hasNext() {
            return nextIndexToReturn < count;
        }
    }

    public Iterator<Integer> iterator() {
        return new FixedSizedListIterator();
    }

        ...
}

We could build an FixedSizeList as follows.

public static void main(String [] args) {
    FixedSizeList list = new FixedSizeList(10);
    list.add(5);
    list.add(3);
    // list now contains the integers 5, 3;
    // thus values[0] is 5, values[1] is 3,
    // and count is 2.
        ...
}

Then we can use an iterator.

Iterator<Integer> iter = list.iterator();
int m;
m = iter.next();
// m now contains 5 and nextIndexToReturn contains 1
m = iter.next();
// m now contains 3 and nextIndexToReturn contains 2

At this point, another call to next would be invalid since nextIndexToReturn is past the end of the list values.

The code maintains an important invariant: prior to any call to next, nextIndexToReturn contains the index of the next value in the list to return.

In Summary

An Iterator is an object that provides a next(), hasNext(), and remove() method. Iterators are intended to provide a way to step through some sort of data structure one item at a time in some fashion. If you have access to an Iterator to some collection, you can use the next() and hasNext() methods to go through it step by step.

Using next() and hasNext() is tedious, so if a class declares itself as implementing Iterable, you can use the : operator to iterate instead. To follow the Iterable contract, the class that implements Iterable must provide a method that provides an Iterator that allows iteration through that object.

Food for thought: Why doesn't the List class just implement Iterator itself?

F. Programming Task: EveryOtherWord.java

Fill in the everyOtherWord method of EveryOtherWord.java such that it performs the task described in its comment and passes the test in main. Do not change anything in the file other than the everyOtherWord method, except that you may import the Iterator class (though this is not necessary). You shouldn't need any helper methods.

G. Programming Task: Meta-Iteration through Filters

As we saw in EveryOtherWord, we don't always want to iterate in exactly the way that the available iterator tells us. One approach is to write special purpose methods (e.g. everyOtherWord) that generate a new Iterable that contain all the items we require. This is fine as long as we have the memory and time to spend on building this iterable, which may potentially be quite large and slow to construct.

An alternate approach is to use an intermediary class that will feed us the items we want, one at a time. These are often referred to as filters.

Open up the Filter abstract class in the utils package. As the class implements Iterator and Iterable, it has the next, hasNext, and iterator methods. Recall that implementing Iterable indicates that Filter should be able to provide an iterator object over its items, and since Filter implements Iterator, it is an iterator, and so just returns itself!

You'll see that Filter acts as a middleman between client code and some Iterator. Specifically, if you look at the hasNext() method, you'll see that it keeps calling next() on the underlying Iterator until the keep() method returns true.

Concrete implementations of the Filter class will implement a keep() method that results in the desired behavior. For example, a filter designed to only return every other item would return true for every other call to keep.

Start by completing the implementations of TrivialFilter and AlternatingFilter so that they pass the TestFilters test. You might find the FilterClient class to be useful for manually testing your implementations.

As two optional but highly recommended exercises, complete an implementation of MonotonicFilter and PredicateFilter that passes their TestFilters tests. There is also an everyFourth method in FilterClient that you should feel free to implement. Note that PredicateFilter is a kind of filter that takes in a Predicate object (as defined in the utils package) as the first argument in its constructor.

H. Submission

Make sure to have completed the following

You need not submit anything for the quiz. The rest of the files can be submitted via the usual commands for committing, tagging, and pushing. Remember that you have a midterm coming up on 9/27/2017! We suggest starting to take a look at past exams, which can be found under the resources tab of the course website.

I. Optional Assignments

We've included a few optional assignments that provides extra practice with iterators and iterating.

Check them out if you're interested but make sure to also balance studying for the midterm!