Lab 9: Heaps and Priority Queues

A. Intro

Today, we'll take a look at the priority queue and how it can be implemented using a heap. We'll then implement our own priority queue (called ArrayHeap), and then discuss some applications of heaps.

B. Implementations of queues and priority queues

We’ve learned about a few abstract data types already, including the stack and queue. The stack is a last-in-first-out (LIFO) abstract data type where, much like a physical stack, one can only access the top of the stack and must pop off recently added elements to retrieve previously added elements. The queue is a first-in-first-out (FIFO) abstract data type. When we process items in a queue, we process the most recently added elements last.

But what if we want to model an emergency room, where people waiting with the most urgent conditions are served first? We would need to serve patients regardless of when they arrive, since those arriving first or most recently will not necessarily be the ones most in pain.

Sometimes, processing items LIFO or FIFO is not what we want. We may instead want to process items in order of importance or a priority value.

The priority queue is an abstract data type that contains the following methods:

It is similar to a Queue though the insert method will insert an item with a corresponding “priority value” and the poll method in the priority queue will remove the element with the highest priority, rather than the oldest element in the queue.

Note: The element with the highest priority may not always have the highest priority value. Priorities are a function of priority values: if we have a maximum priority queue, the elements with higher priority will be the ones with larger priority values; on the other hand, if we have a minimum priority queue, the elements with higher priority will be the ones with smaller priority values.

This seems like a cool thing to have, but how do we actually implement something like this? Java’s priority queue is actually implemented with a data structure called a binary min heap. For the remainder of this lab, we will study the heap data structure and create our own implementation of a priority queue using a binary min heap.

Exercise: Implementing your own min heap

The class ArrayHeap implements a binary min heap using an ArrayList, just like we discussed at the beginning of this lab. Open it up and read the provided methods.

Notice in the constructor we call contents.add(null). Think about how this affects indexing and why we choose to add a null element.

Fill in the incomplete methods in ArrayHeap.java (marked with TODOs) You should not edit the methods without TODOs but you can use them in the code you write. As John DeNero wisely says, code that doesn't respect abstraction barriers should BURN. Respect abstraction barriers! (You should able to finish the lab without directly accessing the contents ArrayList.)

We have provided a very basic test located in ArrayHeapTest.java. This test is by no means comprehensive, so we highly recommend adding your own tests.

C. Heapsort

Now, let’s move onto an application of the heap data structure. Suppose you have an array of $N$ numbers that you want to sort smallest-to-largest. One algorithm for doing this is as follows:

  1. Put all of the numbers in a min heap.
  2. Repeatedly remove the min element from the heap, and store them in an array in that order.

This is called heapsort.

Now, what is the runtime of this sort? Since each insertion takes proportional to $\log N$ comparisons once the heap gets large enough and each removal also takes proportional to $\log N$ comparisons, the whole process takes proportional to $N \log N$ comparisons.

It turns out we can actually make step 1 of heapsort run faster—proportional to $N$ comparisons—using a process called heapify. (Unfortunately, we can’t make step 2 run any faster than $N \log N$, so the overall heapsort must take $N \log N$ time.)

D. Heapify

The algorithm for taking an arbitrary array and making it into a min (or max) heap in time proportional to $N$ is called heapify. Pseudocode for this algorithm is below:

def heapify(array):
    index = N / 2
    while index > 0:
        bubble down item at index
        index -= 1

Conceptually, you can think of this as building a heap from the bottom up. To get a visualization of this algorithm working, click on the BuildHeap button on USFCA interactive animation of a min heap. This loads a pre-set array and then runs heapify on it.

Try to describe the approach in your own words. Why does the index start at the middle of the array rather than the beginning, 0, or the end, N? How does each bubble down operation maintain heap invariants?

It is probably not immediately clear to you why this heapify runs in $O(N)$. For those who are curious, you can check out an explanation on StackOverflow or Wikipedia.

C. Submission

Please submit ArrayHeap.java. The grader for this assignment will be the same as the test that we have provided for you. However if you pass this test you might still have a buggy implementation. To verify the correctness of your ArrayHeap code you should write more tests to fully test the behavior of your implementation. This also has the benefit of reinforcing your knowledge of how priortiy queue / array heap operations work as you will have to determine what the expected behavior is. If you want extra help thinking of tests feel free to reach out to a TA or academic intern for assistance!