Lab 5: OOP with Collections, Iterators, and Iterables

Due Date: Tuesday 2/22 11:59PM. Note that this lab has a special due date due to the test and Presidents' Day.

In this lab we will give you a small taste of the Java Standard Library as a means of understanding Object Oriented Programming. In particular, we will look at the Collection, Iterator, and Iterable interfaces and several classes that implement them. Collections encompass many of the data structures that you will work with this semester (we will talk later about how they are implemented; for now we will just use them). Iterators are objects that control iteration through the items in a collection via two methods: hasNext (checks whether there are more items to iterate over) and next (returns the next item in the iteration). Every class in java.util that implements Collection provides an iterator. More generally, iterators provide a mechanism for unifying the operations of processing an array in several formats, handling input, and traversing more complicated structures.
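As a quick illustration of the two Iterator methods, here is a minimal sketch that walks a java.util list twice: once with an explicit Iterator and once with the enhanced for loop that any Iterable supports. The class name IteratorBasics and its helper methods are made up for this example and are not part of the lab skeleton.

```java
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

/** Hypothetical demo class; not part of the lab skeleton. */
class IteratorBasics {
    /** Joins the items with spaces by driving an Iterator by hand. */
    static String joinWithIterator(Iterable<String> items) {
        StringBuilder sb = new StringBuilder();
        Iterator<String> it = items.iterator();
        while (it.hasNext()) {        // are there more items?
            sb.append(it.next());     // fetch the next item and advance
            if (it.hasNext()) {
                sb.append(' ');
            }
        }
        return sb.toString();
    }

    /** Same traversal, written with the enhanced for loop. */
    static String joinWithForEach(Iterable<String> items) {
        StringBuilder sb = new StringBuilder();
        boolean first = true;
        for (String s : items) {
            if (!first) {
                sb.append(' ');
            }
            sb.append(s);
            first = false;
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        List<String> words = Arrays.asList("sky", "grass", "cat");
        System.out.println(joinWithIterator(words));
        System.out.println(joinWithForEach(words));
    }
}
```

Both methods accept any Iterable<String>, so the same code works for an ArrayList, a LinkedList, or a HashSet (with whatever order that collection's iterator produces).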

We will first review these interfaces and some of the classes that implement them, and then you will work with them in a few exercises. By the end of this lab, you should be comfortable with searching through the Java Standard Library documentation and with using Collection, Iterators, and Iterables.

A. Content Review

Please watch the following videos to review Collection, Iterators, and Iterable. Slides shown in the video can be found here.

B. Practice with Object Oriented Programming

Introduction to Table Exercise

For the remainder of the lab we will be dealing with a simple database system to give you practice with Object Oriented Programming and some of the interfaces discussed above (mainly Iterator). Before you begin writing any code, it will be helpful to understand the structure of the classes that we have provided, so that you will be able to work with them. You are not required to read or understand the implementations of all the methods, though it may be helpful or interesting to look at some of them. Instead, you should be able to rely on the descriptions of the classes here, or on the comments in the code, to explain how the other methods work.

We highly recommend that you check out the following video walkthrough to better understand the table exercise and the structure of the lab skeleton code:

Table

The main class that represents a database is Table.java. A Table represents a single database table, which can be read in from a file in CSV (comma-separated values) format. The operations are somewhat limited, but the supported functionality includes getting rows from the Table by index, iterating through the rows of a Table, filtering rows from a Table using a TableFilter, and creating Tables from the cartesian product, or cross join, of two input Tables. The methods and constructors contained within the Table class are:

• private Table(): Initialize a Table without a header or any rows. For internal use only.
• public Table(String file): Initialize a Table from a file.
• private void initColumnMap(String headerRow) / private void initColumnMap(List<String> headerList): Initialize a mapping from column name to column index for a Table.
• private void addRow(String dataRow) / private void addRow(TableRow row): Add a row to a Table. Errors if the data is not the correct size.
• private String headerRow(): Returns a string representation of a Table's header.
• public int colNameToIndex(String colName): Returns the int corresponding to the column named colName.
• public int numColumns(): Returns the number of columns.
• public List<String> headerList(): Returns the list of columns in a Table, in the correct order.
• public TableRow getRow(int i): Returns the ith row of a Table.
• public static Table join(Table t1, Table t2): Returns the result of doing a cross join on two tables.
• public static Table filter(TableFilter filter): Returns the result of filtering a table using filter.
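To picture what the column bookkeeping methods above do, here is a small stand-alone sketch of a header-to-index mapping. The class HeaderIndex and its fields are hypothetical; the method names mirror the ones listed above, but this is not the skeleton's implementation.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical stand-in for the column bookkeeping inside Table. */
class HeaderIndex {
    private final List<String> names = new ArrayList<>();
    private final Map<String, Integer> indexOf = new HashMap<>();

    /** Builds the mapping from a comma-separated header row such as "item,color". */
    HeaderIndex(String headerRow) {
        for (String name : headerRow.split(",")) {
            indexOf.put(name, names.size());  // column index = position in the header
            names.add(name);
        }
    }

    /** Returns the index of the column named colName. */
    int colNameToIndex(String colName) {
        Integer i = indexOf.get(colName);
        if (i == null) {
            throw new IllegalArgumentException("no column named " + colName);
        }
        return i;
    }

    /** Returns the number of columns. */
    int numColumns() {
        return names.size();
    }

    /** Returns a copy of the column names, in header order. */
    List<String> headerList() {
        return new ArrayList<>(names);
    }
}
```

With the header "item,color", colNameToIndex("color") returns 1 and numColumns() returns 2.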

Table.TableRow

The inner class TableRow represents a single row of data. The implementation of Table is essentially just a List of TableRows. Similarly, a TableRow is essentially just a List of Strings. The following methods and constructors are contained within the TableRow class:

• public TableRow(List<String> data): Initialize a TableRow from the given data.
• public String getValue(int i): Returns the ith value in a TableRow.
• public static TableRow joinRows(TableRow tr1, TableRow tr2): Returns the TableRow which is the result of joining two TableRows.
• public int size(): Return the size of a TableRow.

TableFilter

A useful operation may be to filter out rows which do not match some criteria. Filtering can be done over multiple different criteria, but each of these filters can be implemented in the same way (besides the code that specifies which TableRows should be kept). In order to achieve this shared functionality, we have created the TableFilter class, which is an abstract class that allows for filtered iteration through a Table. This abstract class implements Iterator<Table.TableRow> which means that it specifies a hasNext and a next method that allow for the filtered iteration.

What differentiates one implementation of TableFilter from another is the abstract method keep(), which returns true if and only if the value of _next stored in the TableFilter should be delivered by the next() method. By implementing just this method (along with any necessary instance variables and a constructor), you can create a new TableFilter. This is a much nicer design: all of the shared control logic lives in the abstract class, while the functionality specific to a particular filter is contained solely in that filter's class.
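The division of labor described above can be sketched on a simpler problem. The generic class FilteredIterator and the subclass LongerThanFilter below are hypothetical analogues written for illustration; they are not the provided TableFilter.java, and only the names keep() and _next are borrowed from it.

```java
import java.util.Iterator;
import java.util.NoSuchElementException;

/** Hypothetical generic analogue of TableFilter: shared control logic lives here. */
abstract class FilteredIterator<T> implements Iterator<T> {
    private final Iterator<T> input;
    private boolean valid;   // true when _next holds an item that passed keep()
    protected T _next;       // current candidate, visible to keep()

    FilteredIterator(Iterator<T> input) {
        this.input = input;
    }

    /** Returns true iff the candidate stored in _next should be delivered. */
    protected abstract boolean keep();

    @Override
    public boolean hasNext() {
        // Advance until a candidate passes keep() or the input runs out.
        while (!valid && input.hasNext()) {
            _next = input.next();
            valid = keep();
        }
        return valid;
    }

    @Override
    public T next() {
        if (!hasNext()) {
            throw new NoSuchElementException();
        }
        valid = false;       // the cached item is consumed by this call
        return _next;
    }
}

/** A concrete filter supplies only its state, a constructor, and keep(). */
class LongerThanFilter extends FilteredIterator<String> {
    private final int minLength;

    LongerThanFilter(Iterator<String> input, int minLength) {
        super(input);
        this.minLength = minLength;
    }

    @Override
    protected boolean keep() {
        return _next.length() > minLength;
    }
}
```

Iterating a LongerThanFilter with minLength 3 over "sky", "grass", "cat", "green" delivers "grass" and then "green"; adding another filter would only require another small subclass.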

For this lab, we have provided the completed TableFilter.java and IdentityFilter, an implementation that does not filter out any TableRows. Your task will be to complete the following four other TableFilters.

• ColumnMatchFilter: At construction, two column names are passed in. This filter filters out any TableRows where the data in these two columns does not match exactly.
• EqualityFilter: At construction, a column name and a string to match are passed in. This filter filters out any TableRows where the data for the given column does not exactly match the given string.
• GreaterThanFilter: At construction, a column name and a string to compare to are passed in. This filter filters out any TableRows where the data for the given column is not greater than the given string.
• SubstringFilter: At construction, a column name and a substring are passed in. This filter filters out any TableRows where the data for the given column does not contain the given substring. Hint: the String contains method will be helpful for this TableFilter.

The code for each of these four implementations will involve adding the necessary instance variables to the class, setting up the instance variables in the constructor, and implementing the keep() method. Although the code for each will differ, the general structure should be the same.
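The comparisons these filters rely on are ordinary String operations. The sketch below shows them on plain strings, outside the lab's classes. Note one assumption: this handout does not say how "greater than" is decided for strings, so the sketch takes it to mean lexicographic order via String.compareTo; check the skeleton's comments and the tests before relying on that reading. The class name StringChecks is hypothetical.

```java
/** Hypothetical helper class showing the String operations on their own. */
class StringChecks {
    /** Exact match, in the sense used by EqualityFilter and ColumnMatchFilter. */
    static boolean matches(String a, String b) {
        return a.equals(b);
    }

    /** ASSUMPTION: "greater than" means lexicographically greater (String.compareTo). */
    static boolean greaterThan(String a, String b) {
        return a.compareTo(b) > 0;
    }

    /** Substring test using String.contains. */
    static boolean hasSubstring(String value, String sub) {
        return value.contains(sub);
    }
}
```

Under the compareTo reading, "grass" is greater than "cat", but "10" is not greater than "9", because the comparison proceeds character by character rather than numerically.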

Table.JoinIterator

Another useful operation, which can be combined with the previous filters for more interesting queries, is a cross join. From two starting Tables t1 and t2, a cross join outputs another table that contains the combination of each row from t1 with each row from t2. This is illustrated with the following example. Suppose t1 is the table:

item,color
---------------
sky,blue
grass,green

and suppose t2 is the table:

item,is_alive
---------------
grass,yes
cat,yes
sky,no

Then the result of calling join(t1, t2) would be the following table:

t1.item,t1.color,t2.item,t2.is_alive
---------------
sky,blue,grass,yes
sky,blue,cat,yes
sky,blue,sky,no
grass,green,grass,yes
grass,green,cat,yes
grass,green,sky,no

We could then combine this with a ColumnMatchFilter on the columns t1.item and t2.item, which would result in the following table:

t1.item,t1.color,t2.item,t2.is_alive
---------------
sky,blue,sky,no
grass,green,grass,yes
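The example above can be reproduced with plain lists of strings. This is only a sketch of the cross join and column match operations themselves: the class CrossJoinSketch and its methods are hypothetical, it builds the whole result eagerly with nested loops, and it does not use the lab's Table, TableFilter, or JoinIterator.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Hypothetical list-based sketch; not the lab's Table implementation. */
class CrossJoinSketch {
    /** Pairs every row of t1 with every row of t2, with t1 rows varying slowest. */
    static List<List<String>> crossJoin(List<List<String>> t1, List<List<String>> t2) {
        List<List<String>> result = new ArrayList<>();
        for (List<String> r1 : t1) {
            for (List<String> r2 : t2) {
                List<String> joined = new ArrayList<>(r1);  // copy the left row
                joined.addAll(r2);                          // append the right row
                result.add(joined);
            }
        }
        return result;
    }

    /** Keeps only the rows whose values in columns i and j match exactly. */
    static List<List<String>> columnMatch(List<List<String>> rows, int i, int j) {
        List<List<String>> kept = new ArrayList<>();
        for (List<String> row : rows) {
            if (row.get(i).equals(row.get(j))) {
                kept.add(row);
            }
        }
        return kept;
    }

    public static void main(String[] args) {
        List<List<String>> t1 = Arrays.asList(
            Arrays.asList("sky", "blue"),
            Arrays.asList("grass", "green"));
        List<List<String>> t2 = Arrays.asList(
            Arrays.asList("grass", "yes"),
            Arrays.asList("cat", "yes"),
            Arrays.asList("sky", "no"));
        List<List<String>> joined = crossJoin(t1, t2);
        System.out.println(joined.size());              // prints 6
        System.out.println(columnMatch(joined, 0, 2));  // prints [[sky, blue, sky, no], [grass, green, grass, yes]]
    }
}
```

Columns 0 and 2 of the joined rows correspond to t1.item and t2.item in the tables above.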

Your final task for this lab will be to complete the inner class Table.JoinIterator so that the join(Table t1, Table t2) function works. We have provided most of the implementation of this class, apart from the hasNext() method. One common pattern is to have the hasNext() method handle most of the logic of advancing the iterator. The next() method can then call hasNext() and return the value that hasNext() set. We have followed that pattern here. To complete this task, first read through the provided code in Table.JoinIterator to familiarize yourself with the setup, then begin coding the hasNext() method. You should not need to make any changes to the class besides this method, but as long as you pass the tests you can alter the design however you see fit.

Hint: You will need to use the iterators from each of the tables to output all of the combinations of the rows.

Hint: Use the joinRows function in the inner Table.TableRow class to help with the combination of TableRows.
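To show the "hasNext() does the work" pattern on a different problem, here is a sketch of an iterator that flattens a list of lists. Like a join, it has to coordinate two iterators, but it is deliberately not the JoinIterator: the class FlattenIterator and its fields are hypothetical and are not taken from the skeleton.

```java
import java.util.Iterator;
import java.util.List;
import java.util.NoSuchElementException;

/** Hypothetical example: iterates over every element of a list of lists. */
class FlattenIterator<T> implements Iterator<T> {
    private final Iterator<List<T>> outer;
    private Iterator<T> inner;   // iterator over the current inner list, or null
    private T cached;            // value set by hasNext() for next() to return
    private boolean ready;       // true when cached holds an undelivered value

    FlattenIterator(List<List<T>> lists) {
        this.outer = lists.iterator();
    }

    @Override
    public boolean hasNext() {
        if (ready) {
            return true;         // repeated calls must not skip items
        }
        // Move to the next non-empty inner list, if there is one.
        while (inner == null || !inner.hasNext()) {
            if (!outer.hasNext()) {
                return false;
            }
            inner = outer.next().iterator();
        }
        cached = inner.next();
        ready = true;
        return true;
    }

    @Override
    public T next() {
        if (!hasNext()) {
            throw new NoSuchElementException();
        }
        ready = false;           // hand out the cached value exactly once
        return cached;
    }
}
```

The ready flag is what makes hasNext() safe to call several times in a row: the iterator only advances when the cached value has already been handed out by next().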

TestTable

We have provided you with the complete set of autograder tests to aid you in testing and debugging your code. With the skeleton code we have provided, you should be failing five of the fourteen tests. To get full credit for the lab, you need to pass all fourteen tests; that is, you must add the functionality specified in the above two parts without altering the existing structure of the Table. The task should be mostly independent of the existing behavior, so this is likely not something you need to worry about, but be mindful of it.

Finally, the tests rely on the data inside the sample_db folder. If you want to pass the tests locally, you should not alter these files. If you do, you can always use Git to check them out again. If you are unsure how to check out files to undo changes, feel free to refer to the Git Guide or ask any TA or AI for help. The autograder will run the tests with staff versions of these files, so even if you alter them you can still pass the tests on Gradescope.

C. Submission

In order to complete this lab you should have filled out the following:

• ColumnMatchFilter: Constructor, instance variables, and keep() method.
• EqualityFilter: Constructor, instance variables, and keep() method.
• GreaterThanFilter: Constructor, instance variables, and keep() method.
• SubstringFilter: Constructor, instance variables, and keep() method.
• Table: You should only have to fill in one FIXME for the hasNext() method in the Table.JoinIterator inner class.

Make sure that all of the tests in TestTable are passing before submitting your code. As usual, add your files, commit, tag the commit, and then push your commit and tags to the central repository to submit your work.