StringSet

Due Saturday, October 25th at 11:59 PM

Table of Contents

Introduction

For this assignment, you'll create a BST based implementation of the StringSet interface, which provides an abstraction for sets that contain Strings. To keep things simple, your set will only allow put and contains operations. You'll then compare the performance of your implementation to a reference implementation based on linked lists as well as the built-in Java BST set.

To implement the interface, you'll be working completely from scratch, with no Makefiles or skeleton code.

BSTStringSet

Create a class BSTStringSet that implements the StringSet interface using a BST as its core data structure. As references, you might find the following resources useful:

Clarification: Your implementation should support the ability to create an initially empty StringSet with a no-argument BSTStringSet constructor, e.g. StringSet s = new BSTStringSet(). While not required by the interface, you'll need this for the speed tester (in part 2) to work (really I should have had you extend an AbstractStringSet, but c'est la vie). Look closely at LinkedListStringSet to see how this class achieves the ability to be instantiated as an initially empty set.

While not required, it is recommended that you create a Makefile. If you're not sure where to start, the Makefile tutorial from lab 5 might be helpful, as well as the solutions to lab5.

Make sure to write tests to ensure that your class works correctly.

Optionally: Add a method printInOrder() that prints out your BSTStringSet in order.

Speed Testing

The InsertRandomSpeedTest class performs speed testing on LinkedListStringSet, your BSTStringSet, and Java's built-in TreeSet class. It works by asking the user for an input size N, then generates N strings of length 10 and inserts them into the set.

Try it out and see how your data structure scales with N compared to the naive and industrial strength implementations. Record your results in speedTest.txt. There is no standard format required for your results, and there is no required number of data points.

Create a new test InsertInOrderSpeedTest. Instead of inserting random strings, it should insert strings in increasing lexicographic order. For example, it might insert "cow", "cox", "coy", "coz", "cpa", "cpb", .... You'll want to use the NextString method provided in StringUtils. I'd recommend starting by just copying and pasting the code from InsertRandomSpeedTest and then making changes where necessary.

Now try running InsertInOrderSpeedTest. Again record your results in speedTests.txt. If you observed anthing interesting (hopefully you did), then you should explain this interesting thing.

Part 3: Generic Set (optional)

Create an implementation BSTGenericSet that implements the GenericSet interface.

Part 4: Dictionary (optional)

Create an implementation BSTDictionary that implements the SimpleDictionary interface.