Lab 9 | CS 61B Fall 2018

Lab 9: Heaps and Priority Queues

A. Intro

Today, we'll take a look at the priority queue and how it can be implemented using a heap. We'll then implement our own priority queue (called ArrayHeap), and then discuss some applications of heaps.

B. Implementations of queues and priority queues

We’ve learned about a few abstract data types already, including the stack and queue. The stack is a last-in-first-out (LIFO) abstract data type where, much like a physical stack, one can only access the top of the stack and must pop off recently added elements to retrieve previously added elements. The queue is a first-in-first-out (FIFO) abstract data type. When we process items in a queue, we process the most recently added elements last.

But what if we want to model an emergency room, where people waiting with the most urgent conditions are served first? We would need to serve patients regardless of when they arrive, since those arriving first or most recently will not necessarily be the ones most in pain.

Sometimes, processing items LIFO or FIFO is not what we want. We may instead want to process items in order of importance or a priority value.

The priority queue is an abstract data type that contains the following methods:

insert(item, priority): Inserts item into the priority queue with priority value priority.
poll(): Removes and returns the highest priority item in the priority queue.
peek(): Returns the highest priority item.

It is similar to a Queue though the insert method will insert an item with a corresponding “priority value” and the poll method in the priority queue will remove the element with the highest priority, rather than the oldest element in the queue.

Note: The element with the highest priority may not always have the highest priority value. Priorities are a function of priority values: if we have a maximum priority queue, the elements with higher priority will be the ones with larger priority values; on the other hand, if we have a minimum priority queue, the elements with higher priority will be the ones with smaller priority values.

This seems like a cool thing to have, but how do we actually implement something like this? Java’s priority queue is actually implemented with a data structure called a binary min heap. For the remainder of this lab, we will study the heap data structure and create our own implementation of a priority queue using a binary min heap.

Exercise: Implementing your own min heap

The class ArrayHeap implements a binary min heap using an ArrayList, just like we discussed at the beginning of this lab. Open it up and read the provided methods.

Notice in the constructor we call contents.add(null). Think about how this affects indexing and why we choose to add a null element.

Fill in the missing methods in ArrayHeap.java. As John DeNero wisely says, code that doesn't respect abstraction barriers should BURN. Respect abstraction barriers! (You should able to finish the lab without directly accessing the contents ArrayList.)

First implement getLeftOf(int i), getRightOf(int i), getParentOf(int i).
Next, using the previously implemented methods, implement setLeft(int index, Node n) and setRight(int index, Node n)
Next, implement min(int index1, int index2), peek(), bubbleUp(int index), bubbleDown(int index)
Lastly, implement insert(T item, double priority), removeMin(), changePriority(T item, double priority). Make sure you use the bubbleUp and bubbleDown methods when implementing these functions.

We have provided a very basic test located in ArrayHeapTest.java. This test is by no means comprehensive, so we highly recommend adding your own tests.

C. Heapsort

Now, let’s move onto an application of the heap data structure. Suppose you have an array of $N$ numbers that you want to sort smallest-to-largest. One algorithm for doing this is as follows:

Put all of the numbers in a min heap.
Repeatedly remove the min element from the heap, and store them in an array in that order.

This is called heapsort.

Now, what is the runtime of this sort? Since each insertion takes proportional to $\log N$ comparisons once the heap gets large enough and each removal also takes proportional to $\log N$ comparisons, the whole process takes proportional to $N \log N$ comparisons.

It turns out we can actually make step 1 of heapsort run faster—proportional to $N$ comparisons—using a process called heapify. (Unfortunately, we can’t make step 2 run any faster than $N \log N$, so the overall heapsort must take $N \log N$ time.)

D. Heapify

The algorithm for taking an arbitrary array and making it into a min (or max) heap in time proportional to $N$ is called heapify. Pseudocode for this algorithm is below:

def heapify(array):
    index = N / 2
    while index > 0:
        bubble down item at index
        index -= 1

Conceptually, you can think of this as building a heap from the bottom up. To get a visualization of this algorithm working, click on the BuildHeap button on USFCA interactive animation of a min heap. This loads a pre-set array and then runs heapify on it.

Try to describe the approach in your own words. Why does the index start at the middle of the array rather than the beginning, 0, or the end, N? How does each bubble down operation maintain heap invariants?

It is probably not immediately clear to you why this heapify runs in $O(N)$. For those who are curious, you can check out an explanation on Wikipedia.

C. Submission