Due Date: Friday 2/21 11:59PM. Labs are always due Friday the week they are released unless otherwise stated.
In this lab we will be giving you a small taste of the Java Standard Library as a means for understanding Object Oriented Programming. In particular, we will be looking at the Collection
, Iterators
, and Iterable
interfaces and several classes which implement these. Collections encompass many of the data structures that you will be working with this semester (we will later talk about how they are implemented, but for now we will just use them). Iterators are objects that control iteration through items in a collection via two methods: hasNext
(are there more items yet to be iterated over?), and next
(return the next item in the iteration). Iterators are provided for every class that implements Collection
in java.util
. More generally, they provide a mechanism for unifying the operations of processing an array in several formats, handling input, and traversing more complicated structures.
We will first introduce these interfaces and some of their subclasses, and then you will work with them in a few exercises. By the end of this lab, you should be comfortable with searching through the Java Standard Library documentation and with using Collection, Iterators, and Iterables.
Table of Contents
A. Collections
The Collection Interface
The interface Collection
represents any collection of data. As above, most of the data structures we will study in the rest of this class are used to represent collections of data. At the most general level, pretty much anything you use to store multiple items at once is going to fufill the requirements to be a Collection
. The most commonly used collections are either a Set
or a List
, but there are many others. The hierarchy of all Collection
classes is organized using interfaces! As you will see soon, what that means is that sets and lists in Java are represented as interfaces that extend the Collection
interface. A representation of this hierarchy can be found below (image sourced from here).
At the root of the organization is the Collection
interface. This interface specifies methods like isEmpty()
, add(Object o)
, and many others. All collections must implement Collection
in one way or another (i.e. they could directly implement it or they might extend / implement another class which implements Collection
). For example, you have probably already used ArrayList
or LinkedList
which implement the List
interface which is itself extended from the Collection
interface.
These collections and more are found in the package java.util
. A package is a set of (supposedly related) classes and subpackages. As elsewhere in Java, the notation java.util.ArrayList
means "The class named ArrayList
in the (sub)package named util
in the package named java
". To make your life easier, you'll want to import them before you can use them as follows:
import java.util.ArrayList;
import java.util.LinkedList;
(otherwise, you will have to write java.util.ArrayList
everywhere instead of just ArrayList
). Luckily for you, IntelliJ will typically import these for you when you type them into the editor.
The Set Interface
A set is a group of items with no duplicates. The Set
interface does not require ordering (e.g. HashSet
), but sets can be ordered (TreeSet
, ConcurrentSkipListSet
). Set
s support (at least) the following operations:
add(E e)
: Adding an element to the set if it's not already there.remove(Object o)
: Removing an element from the set.contains(Object o)
: Checking whether a given item is in the set.isEmpty()
: Checking whether the set is empty.
While a class implementing the Set
interface must be able to support the above operations, the actual implementation and manipulation of the data is up to the class implementing the Set
interface. Regardless of how any subclass of Set
is defined you will be able to use it in the same way which is the beauty of Interfaces / Object Oriented Programming. All of these Set
subtypes fulfill the requirement of being a Set
.
Discuss with your lab partner how could you represent a set of nonnegative integers. (Hint: How could you use a boolean array?)
It's important to realize that a Set
, although a subtype of Collection
interface, is itself an interface. Yes, an interface can extend another interface. It is defined like this:
public interface Set extends Collection {
...
}
Note: Interfaces extend
other interfaces, while classes implement
interfaces.
The List Interface
A list is an ordered group of items (also known as a sequence). The List
interface therefore differs from Set
in that elements have positions within a List
and duplicates are allowed. List
s support (at least) the following operations:
add(E e)
: Adding an element at the end of the list.add(int index, E e)
: Adding an element at a given position.remove(Object o)
: Removing the first occurence of an element in a list.remove(int index)
: Removing the element at a given position.contains(Object o)
: Checking whether a given item is in the list.get(int index)
: Identifying the value at a given position in the list.isEmpty()
: Checking whether the list is empty.
The implementation of a list normally involves storing the elements of the list explicitly. For example one might use an array whose 0th element is the first list item, and so forth. You have worked with and implemented the IntList
and IntDList
classes. These themselves did not extend either List
or Collection
but give you an idea of what an implementation of a List
might look like.
Similar to Set
interface, List
is an interface that might be defined like this:
public interface List extends Collection {
...
}
Collections Example
Now that we have introduced these interfaces, let's take a closer look at the declaration, instantiation, and utilization of Collection
s. For example, to instantiate a List
of int
s, add
the 5 to the list, then get it back, we might write:
List<Integer> s;
s = new ArrayList<Integer>();
s.add(5);
int x = s.get(0); /* gets the five back */
There are a few funny things about the syntax above:
- Angle bracket notation is used to denote the type of the item that goes in the collection. These are refered to as generics, and allow for
Collection
s to contain any type of object. Discuss with your partner why this might be useful. (Hint: Think about ourIntList
from before. What would we have to do if we wanted to make a similar list that storesString
s). For now at least, you should always use angle brackets for both the declaration and the instantiation. We'll see later in this lab and in this course that angle bracket notation can be used in situations other than using a Collection. - If you're creating a
Collection
of primitives (int
s,double
s,boolean
s, etc.) you should use the wrapper type (e.g.Integer
,Double
,Boolean
) instead of the primitive type. - The retrieval method of our
List
returns anInteger
, but thanks to Java's autoboxing functionality, the value we want is automatically unwrapped into anint
. For example, the following is valid Java codeint x = new Integer(5)
.
Also note that we declared s
to be a variable that points to a List
object, however the actual object we created was specifically an ArrayList
. Recall that List
refers to an interface, and so it is not actually possible to create a List
object. Instead, we had to choose a concrete class that actually implemented List
, in this example the Java ArrayList
class. Recall that a concrete class is a non-abstract class that implements all the methods it potentially inherits from any abstract superclasses or interfaces.
Since the LinkedList
class is supposed to represent essentially the same abstraction (a numbered list of items), it has almost the same API (Application Programming Interface) as an ArrayList
. For our purposes today, that means it supports almost the same methods. This makes it easy to change back and forth between an ArrayList
and a LinkedList
. The almost qualifiers have been added because LinkedList
also implements the Deque
interface which means it has a few more methods available than just a List
.
For our toy example, since LinkedList also has an add and get method, we could easily change our code to read:
List<Integer> s;
s = new LinkedList<Integer>();
s.add(5);
int x = s.get(0); /* gets the five back */
That is, only the instantiation of s
changes. It is nice that the Java designers gave us many implementations of the same collection, since each of these implementations has its own strengths and weaknesses. As we continue to discuss data structures, you will begin to see reasons why we might choose one implementation over another.
B. Iterators and Iterables
The Iterator Interface
An Iterator
is an interface which defines the functionality to traverse the values in a collection of objects (here we mean the abstract notion of a collection, not necessarily Collection
as defined in Java). In other words you can have an Iterator
which iterates over objects stored in something that is not a collection (you will see this later in this lab). An Iterator
will yield the elements one at a time through the Iterator
API. This interface is rather succint and contains the following four methods. Note: In this lab and the rest of this class, we will almost exclusively use only the first two methods below.
boolean hasNext()
: Returns true if the iteration has more elements. (In other words, returns true if next() would return an element rather than throwing an exception.)E next()
: Returns the next element in the iteration. It's worth noting that it is also typically used with a call tohasNext()
beforehand, thus ensuring that theIterator
does indeed have a next value to return. If there are no more elements to remaining, it is common practice to throw aNoSuchElementException
.default void remove()
: Because this is declared as a default method in the interface, it need not be overriden and as such it is rarely implemented in this class and elsewhere. The default implementation throws anUnsupportedOperationException
and performs no other action, thus this is an optional operation. If implemented the method removes from the underlying collection the last element returned by this iterator. This method can be called only once per call to next(). The behavior of an iterator is unspecified if the underlying collection is modified while the iteration is in progress in any way other than by calling this method.default void forEachRemaining(Consumer<? super E> action)
: Also declared asdefault
and thus does not need to be implemented. This will never be overridden forIterator
s you will be asked to make. Performs the given action for each remaining element until all elements have been processed or the action throws an exception. Actions are performed in the order of iteration, if that order is specified. Exceptions thrown by the action are relayed to the caller.
While in principle a collection could itself implement the iterator interface, this might result in very strange code. We can see that this is true by considering the interfaces abstractly. A collection
itself is not an iterator, rather an Iterator
is some object that iterates over a collection! Collections that wish to support iteration typically provide an iterator()
method that returns an appropriate Iterator
object. This can be used to iterate over the collection (this is a part of the Iterable
interface that will be described in more detail below). Consequently, the collection will now have an iterator
that is accessible by calling iterator()
.
For example, if L
is a List<String>
, you could write
for (Iterator<String> i = L.iterator(); i.hasNext();) {
String value = i.next();
System.out.print(value + " ");
}
This code would print out each item in the list, one at a time. We could have instead of written the code using a while
loop as follows:
Iterator<String> i = L.iterator();
while (i.hasNext()) {
String value = i.next();
System.out.print(value + " ");
}
The Iterable Interface
Iterating through interfaces using next
and hasNext
would be tedious to write every time (as we have written the code in the above two loops), and thus Java has introduced a special syntax for this Java idiom known as the Iterable
interface. If L has a type that implements Iterable<String>
(as List<String>
does), then the loops above may be written using a loop called an enhanced for loop. This will be more similar to the loops that you might have seen before in Python. These loops are also known as for each loops. The example below could be read in English as "for each String value in L, print value."
for (String value : L) {
System.out.print(value + " ");
}
The Iterable
interface is really simple. A class that implements it must simply provide an iterator method that returns an Iterator
.
package java.lang;
public interface Iterable<Value> {
/** Returns an iterator over my values. */
Iterator<Value> iterator();
}
Iterators and Iterables Example
Here's an example of how Iterable
and Iterator
are used in practice. Suppose we had a class FixedSizeList
that represents a list that stores its values in an integer array. Remember that we ask the FixedSizeList for an Iterator
instance through the iterator()
method. This then returns a FixedSizeListIterator
, which is an inner class we've defined to iterate through our FixedSizeList
.
import java.util.Iterator;
public class FixedSizeList implements List<Integer>, Iterable<Integer> {
// instance variables
protected int[] values; // list elements
int count; // number of array cells used by list
// private Iterator class
private class FixedSizeListIterator implements Iterator<Integer> {
int nextIndexToReturn = 0; // index of next item to return by iterator
public Integer next() {
int valToReturn = values[nextIndexToReturn];
nextIndexToReturn++;
return valToReturn;
}
public boolean hasNext() {
return nextIndexToReturn < count;
}
}
public Iterator<Integer> iterator() {
return new FixedSizedListIterator();
}
...
}
We could build a FixedSizeList
as follows.
public static void main(String [] args) {
FixedSizeList list = new FixedSizeList(10);
list.add(5);
list.add(3);
// list now contains the integers 5, 3;
// thus values[0] is 5, values[1] is 3,
// and count is 2.
...
}
Then we can use an iterator.
Iterator<Integer> iter = list.iterator();
int m;
m = iter.next();
// m now contains 5 and nextIndexToReturn contains 1
m = iter.next();
// m now contains 3 and nextIndexToReturn contains 2
At this point, another call to next
would be invalid since nextIndexToReturn
is past the end of the list values. The code maintains an important invariant: prior to any call to next
, nextIndexToReturn
contains the index of the next value in the list to return.
Finally, the values in a FixedSizeList
can be iterated over using an enhanced for loop like so:
for (Integer i : list) {
...
}
C. Practice with Object Oriented Programming
Introduction to Table Exercise
For the remainder of the lab we will be dealing with a simple database system to give you practice with Object Oriented Programming and some of the above interfaces we discussed (mainly Iterator
). Before beginning to write any code, it will be helpful to understand the structure of the classes that we have provided for you, so that you will be able to work with it. You are not required to read or understand all of the implementations of the methods, but it may be helpful or interesting to look at some of them. Instead you should be able to rely on the descriptions of the classes here or the comments in the code to explain how other methods work.
Table
The main class that represents a database is Table.java
. A Table
represents a single database table, which can be read in from file in the format of a csv (comma separated values file). The operations are somewhat limited, but the functionality supported includes getting rows from the Table
by index, iterating through the rows of a table, filtering rows from a Table
using a TableFilter
, and creating Table
s from the cartesian product or cross join of two input Table
s. The methods and constructors contained within the Table
class are:
private Table()
: Initialize a Table without a header or any rows. For internal use only.public Table(String file)
: Initialize aTable
from a file.private void initColumnMap(String headerRow)
/private void initColumnMap(List<String> headerList)
: Initialize a mapping from column name to column index for aTable
.private void addRow(String dataRow)
/private void addRow(TableRow row)
: Add a row to aTable
. Errors if the data is not the correct size.private String headerRow()
: Returns a string representation of aTable
's header.public int colNameToIndex(String colName)
: Returns theint
corresponding to the column named colName.public int numColumns()
: Returns the number of columns.public List<String> headerList()
: Returns the list of columns in aTable
, in the correct order.public TableRow getRow(int i)
: Returns thei
th row of aTable
.public static Table join(Table t1, Table t2)
: Returns the result of doing a cross join on two tables.public static Table filter(TableFilter filter)
: Returns the result of doing a filtering a table using filter.
Table.TableRow
The inner class of TableRow
represents a single row of data. The implementation of Table
is essentially just a List
of TableRow
s. Similarly a TableRow
is essentially just a List
of stings. The following methods and constructors are contained within the TableRow
class:
public TableRow(List<String> data)
: Initialize aTableRow
from the given data.public String getValue(int i)
: Returns thei
th value in aTableRow
.public static TableRow joinRows(TableRow tr1, TableRow tr2)
: Returns theTableRow
which is the result of joining twoTableRows
.public int size()
: Return the size of aTableRow
.
TableFilter
A useful operation may be to filter out rows which do not match some criteria. Filtering can be done over multiple different criteria, but each of these filters can be implemented in the same way (besides the code that specifies which TableRows
should be kept). In order to achieve this shared functionality, we have created the TableFilter
class, which is an abstract class that allows for filtered iteration through a Table
. This abstract class implements Iterator<Table.TableRow>
which means that it specifies a hasNext
and a next
method that allow for the filtered iteration.
What differentiates one implementation of TableFilter
from another is the abstract method keep()
which returns true if and only if the value of _next
stored in the TableFilter
should be delivered by the next()
method. By implementing just this method (as well as any necessary instance variables and a constructor), you can create a new TableFilter
. This is much nicer design as it puts all of the control logic shared into an abstract class while the specific functionality to a particular filter is contained solely in that class.
For this lab, we have provided the completed TableFilter.java
and IdentityFilter
which is an implementation which does not filter out any TableRows
. Your task will be to complete the following four other TableFilters
.
ColumnMatchFilter
: In construction two column names are passed in. This filter filters out anyTableRow
s where the data for each of these two columns does not match exactly.EqualityFilter
: In construction a column name and a string to match is passed in. This filter filters out anyTableRow
s where the data for the given column does not exactly match the given string.GreaterThanFilter
: In construction a column name and a string to compare to is passed in. This filter filters out anyTableRow
s where the data for the given column is not greater than the given string.SubstringFilter
: In construction a column name and a substring is passed in. This filter filters out anyTableRow
s where the data for the given column does not contain the given substring. Hint: theString
contains
method) will be helpful for thisTableFilter
.
The code for each of these four implementations will involve adding the necessary instance variables to the class, setting up the instance variables in the constructor, and implementing the keep()
method. Although the code for each will differ the general structure should be the same.
Table.JoinIterator
Another useful operation which can be combined with the previous filters for more interesting queries is a cross join. From two starting Table
s t1
and t2
, a cross join will output another table which contains the combination of each row from t1
to each row from t2
. This is illustrated with the following example. Suppose t1
is the table:
item,color
---------------
sky,blue
grass,green
and suppose t2
is the table:
item,is_alive
---------------
grass,yes
cat,yes
sky,no
Then the result of calling join(t1, t2)
would be the following table:
t1.item,t1.color,t2.item,t2.is_alive
---------------
sky,blue,grass,yes
sky,blue,cat,yes
sky,blue,sky,no
grass,green,grass,yes
grass,green,cat,yes
grass,green,sky,no
You could imagine we could combine this with a ColumnMatchFilter
on the columns t1.item
and t2.item
which would result in the following table:
t1.item,t1.color,t2.item,t2.is_alive
---------------
sky,blue,sky,no
grass,green,grass,yes
Your final task for this lab will be to complete the functionality of the inner class Table.JoinIterator
to make the join(Table t1, Table t2)
function work. We have provided most of the implementation of this class, besides the hasNext
method. One common pattern is to have the hasNext()
method handle most of the logic of advancing the iterator. The next()
method can then call the hasNext()
and return the value set through hasNext()
. We have followed that pattern here. To complete this you should first read through the provided code in Table.JoinIterator
so you can familiarize yourself with the setup, then begin coding the hasNext()
method. You should not need to make any changes to the class besides this method, but as long as you pass the tests you can alter the design however you see fit.
Hint: You will need to use the iterators from each of the tables to output all of the combinations of the row.
Hint: Use the joinRows
function in the inner Table.TableRow
class to help with the combination of TableRows
.
TestTable
We have provided you the complete set of autograder tests to aid you in testing and debugging your code that you write. With the skeleton code we have provided you should be failing five of the fourteen tests provided. In order to get full credit for the lab, you need to pass all fourteen of the tests, that is you must add the functionality we have specified in the above two parts without altering the existing structure of the Table
. The task should be mostly independent of the existing behavior, so this likely is not something you should worry about, but be mindful of this.
Finally the tests rely on the data inside the sample_db
folder. If you want to pass the tests locally you should not alter these files. If you do, you can always use Git to check them out. The autograder will run the tests with staff versions of these files, so even if you alter them you can still pass the tests on Gradescope.
D. Submission
In order to complete this lab you should have filled out the following:
ColumnMatchFilter
: Constructor, instance variables, andkeep()
method.EqualityFilter
: Constructor, instance variables, andkeep()
method.GreaterThanFilter
: Constructor, instance variables, andkeep()
method.SubstringFilter
: Constructor, instance variables, andkeep()
method.Table
: You should only have to fill in oneFIXME
for thehasNext()
method in theTable.JoinIterator
inner class.
Make sure that all of the tests in TestTable
are passing before submitting your code. As usual add your files, commit, tag the commit, and then push your tags to the central repository to submit your work.