This homework is intended to give you a chance to better understand sorting and balanced search trees. Here is the introduction video.
- A. Implementing Sorting Algorithms
- B. Sorting Problems
- C. Quicksort Mechanics (Optional)
- D. Submission
A. Implementing Sorting Algorithms
provides a template for implementing various sorting algorithms.
SortingAlgorithm.java to see the interface you will have to satisfy. Note that every child class's sort method takes in an argument k. The sorting algorithm should sort the array from index 0 to k.
Hint: This argument could be useful for some of your sorts.
Implement the algorithms below (the ones with an asterisk next to it
are required for this assignment). Note that an implementation of
Quicksort.java has been provided. Clicking on the links will take you to
an animation of the sort to refresh your memory.
Watching these videos is not mandatory, but they are linked to help your conceptual understanding for this assignment! If you don't like videos, these slides also have step-by-step breakdowns for each algorithm.
- *Selection sort: Easier
- *Insertion sort: Easier
- *Mergesort: Moderate
- Counting sort: Moderate
- Heapsort: Harder
- Quicksort: Harder
- *LSD radix sort: Harder
- MSD radix sort: Harder
- Quick review: Insertion Sort, Selection Sort, Mergesort, Quicksort, and Heapsort here
- Quick review: LSD, MSD Radix sort and more Quicksort here
- Insertion sort
- Selection sort
- LSD and Counting Sort
You can use
MySortingAlgorithmsTest.java to test your implementations.
Once you're done, here are some questions to ponder (you don't need to record nor submit your responses):
- How many items are sorted in the video for selection sort?
- Why does insertion sort take longer/more compares than selection sort in this example?
- At what time stamp does the first partition complete for Quicksort?
- Could the size of the input used by mergesort in the video be a power of 2?
- What do the colors mean for heapsort?
- How many characters are in the alphabet used for the LSD sort problem?
- How many digits are in the keys used for the LSD sort problem?
- The complete video ends with a rather odd sort that doesn't complete. Without looking at the captions, can you tell what it's doing?
RunBenchmarks class runs timing tests on our
various sorts. Currently, it is configured to perform two timing
- Compare (y)our implementation against Arrays.sort (which uses
- Time insertion sort for almost sorted arrays.
Run the code, and you should see results that are in line with things we've learned in class.
If the second test takes too long to complete, you can "Edit Configurations" in IntelliJ and type a number for program arguments to run the test on a smaller input size.
Other interesting tests you might try:
- How do LSD and MSD compare with Quicksort and Mergesort? How large must N be before they become faster?
- Find a case where LSD is faster than MSD.
- For what N is insertion sort faster than Quicksort?
- How do selection sort and insertion sort compare?
Feel free to post interesting tests and/or observations on Piazza. You do not need to submit anything related to this Benchmark test.
B. Sorting Problems
There are many problems for which sorting provides a fast solution, even though the problem isn't really about sorting. We encourage you to do all of these, but you only need to do any one of them for full credit. All of the questions below are really common coding interview questions, if that's a motivating factor for you!
Define an interval to be a pair of numbers $[x_i, x_j]$ such that $x_i < x_j$. An interval specifies a range of values in 1D space, where we can call $x_i$ the start point, and $x_j$ the end point.
Given a list of such intervals, we want to know the total length of the regions covered by one or more of the intervals. This is not simply the sum of their lengths, $\Sigma x_j - x_i$, since several may cover the same span.
For example, if we have intervals $[19, 30]$, $[6, 12]$, $[4, 5]$, $[8, 15]$, and $[3, 10]$, then the total length covered is 23: the last four intervals together totally cover the interval $[3, 15]$ of length 12, and the first covers a disjoint interval of length 11.
Intervals.java so that the
coveredLength method returns the
correct total length in $\Theta(N log N)$ time.
There is a clever trick to this problem that you'll need to figure out to get $\Theta(N log N)$ time. You do not need to use any nested loops for this problem, and in fact if you find yourself using them, you probably haven't found the right approach.
Note that you may want to use
Lists.sort(Comparator<? super E> c) or
Collections.sort(List<T> list, Comparator<? super T> c). for this problem!
If you find yourself using either of these, remember that you can define an
inner class that implements the Comparator interface.
Distribution Count for Large Numbers (SortInts.java)
[Goodrich & Tamassia] Given a sequence of n distinct integers, each one of which is in the range $[0, n^2 - 1]$, develop an $O(n)$ algorithm for sorting them. See the skeleton file SortInts.java. You can't use ordinary distribution sort for this, because that would require initializing and traversing arrays of size $n^2$, which would take too long
Inversion Counting (Inversions.java)
Find an algorithm that runs in $O(n \lg n)$ time for computing the number of inversions in a list of n items. Array elements that are "out of order" can be corrected by swapping two adjacent elements at a time, each of which counts as a single inversion. See the skeleton file Inversions.java. To test that your code is actually $O(n \lg n)$, provide it a very large list (say hundreds of thousands).
Two Sum (Sum.java)
[Goodrich & Tamassia] Given two sequences of integers, $A$
and $B$, find an algorithm that runs in $O(n \lg n)$
time (where n is the total number of integers in A and B) that
determines, for a given parameter $m$, whether there is an
integer $a$ in $A$ and an integer $b$ in
$B$ such that $m = a + b$. See the skeleton file
Sum.java. To test that your algorithm runs in $O(n \lg n)$
time, provide sequences of hundreds of thousands of integers. Feel
free to use any of the methods in java.util.Arrays.
C. Quicksort Mechanics (Optional)
Interestingly enough, quicksorting an array is equivalent to inserting all of its items into a BST. In this problem, we'll see why.
First, we need a specific implementation for partitioning.
Consider the array
[5, 3, 2, 1, 7, 8, 4, 6], and suppose that
we pick the leftmost item 5 as our pivot. One approach to partitioning
is to perform a "stable" partitioning where all items that are less
than 5 appear in the same order as they did before partitioning, and
likewise for the greater items. For the given array, we'd get
2, 1, 4, 5, 7, 8, 6].
One inefficient but simple way to implement this stable partitioning algorithm is to perform the following steps:
- Create three empty Lists for storing integers smaller, equal to, and larger than the pivot, respectively.
- Go through each item of the list, comparing it to the pivot, and adding items to the respective list based on the comparison.
- Concatenting the three Lists into a single concatenated List.
- Copying the concatenated List back into the array.
So for example if we partition
[5, 3, 2, 1, 7, 8, 4, 6] from index 0
to index 7, we'd get:
Smaller items: [3, 2, 1, 4] Equal items :  Larger items : [7, 8, 6]
The concatenation of these lists is just
[3, 2, 1, 4, 5, 7, 8, 6].
Along the way, we compared the following pairs of numbers:
5-1, 5-4, 5-7, 5-8, and
If we are using partition to sort, we then repeat this process for the left half and right half sides as discussed in class.
- Try to write out the rest of the list of comparisons used by
Quicksort. You might find running the provided
Quicksort.javato be useful. The order of the pair does not matter (5-3 is equivalent to 3-5). You may also want to draw the BST that results when you insert
[5, 3, 2, 1, 7, 8, 4, 6](in that order) into an initially empty BST.
- List all comparisons performed when inserting [5, 3, 2, 1, 7, 8, 4, 6] into a BST. Give your comparisons in the order they occur, writing the pivot first. Put only a comma and a space between pairs. The first few have been provided for you, so you should not include them in your answer: 5-3, 5-2, 3-2, ...
- What do you observe about the list of comparisons done by Quicksort vs BST insertion?
You can highlight (drag your mouse over) the following whited-out text to see the answers:
- In addition to the comparisons with 5 that we saw earlier, we also do all of 3-2, 3-1, 3-4, 2-1, 7-8, 7-6.
- 5-1, 3-1, 2-1, 5-7, 5-8, 7-8, 5-4, 3-4, 5-6, 7-6
- They have the same comparisons, though they're done in a different order!
You will be required to submit:
MySortingAlgorithms.javawith Selection Sort, Insertion Sort, Mergesort, and LSD Radix Sort implemented.
- At least one of
Sum.java. (We suggest you attempt all of them, however.)
Don't forget to push both your commits and tags for your final submission. As a reminder, you can push your tags by running:
$ git push --tags