Today, we'll take a look at the priority queue and how it can be implemented
using a heap. We'll then implement our own priority queue (called
and then discuss some applications of heaps.
B. Implementations of queues and priority queues
We’ve learned about a few abstract data types already, including the stack and queue. The stack is a last-in-first-out (LIFO) abstract data type where, much like a physical stack, one can only access the top of the stack and must pop off recently added elements to retrieve previously added elements. The queue is a first-in-first-out (FIFO) abstract data type. When we process items in a queue, we process the most recently added elements last.
But what if we want to model an emergency room, where people waiting with the most urgent conditions are served first? We would need to serve patients regardless of when they arrive, since those arriving first or most recently will not necessarily be the ones most in pain.
Sometimes, processing items LIFO or FIFO is not what we want. We may instead want to process items in order of importance or a priority value.
The priority queue is an abstract data type that contains the following methods:
insert(item, priority): Inserts item into the priority queue with priority value priority.
poll(): Removes and returns the highest priority item in the priority queue.
peek(): Returns the highest priority item.
It is similar to a
Queue though the insert method will insert an item with a
corresponding “priority value” and the poll method in the priority queue will
remove the element with the highest priority, rather than the oldest element in
Note: The element with the highest priority may not always have the highest priority value. Priorities are a function of priority values: if we have a maximum priority queue, the elements with higher priority will be the ones with larger priority values; on the other hand, if we have a minimum priority queue, the elements with higher priority will be the ones with smaller priority values.
This seems like a cool thing to have, but how do we actually implement something like this? Java’s priority queue is actually implemented with a data structure called a binary min heap. For the remainder of this lab, we will study the heap data structure and create our own implementation of a priority queue using a binary min heap.
Exercise: Implementing your own min heap
ArrayHeap implements a binary min heap using an
like we discussed at the beginning of this lab. Open it up and read the provided
Notice in the constructor we call
contents.add(null). Think about how this
affects indexing and why we choose to add a
Fill in the missing methods in
ArrayHeap.java. As John DeNero wisely says,
code that doesn't respect abstraction barriers should BURN. Respect abstraction
barriers! (You should able to finish the lab without directly accessing the
- First implement
- Next, using the previously implemented methods, implement
setLeft(int index, Node n)and
setRight(int index, Node n)
- Next, implement
min(int index1, int index2),
- Lastly, implement
insert(T item, double priority),
changePriority(T item, double priority). Make sure you use the
bubbleDownmethods when implementing these functions.
We have provided a very basic test located in
ArrayHeapTest.java. This test is
by no means comprehensive, so we highly recommend adding your own tests.
Now, let’s move onto an application of the heap data structure. Suppose you have an array of $N$ numbers that you want to sort smallest-to-largest. One algorithm for doing this is as follows:
- Put all of the numbers in a min heap.
- Repeatedly remove the min element from the heap, and store them in an array in that order.
This is called heapsort.
Now, what is the runtime of this sort? Since each insertion takes proportional to $\log N$ comparisons once the heap gets large enough and each removal also takes proportional to $\log N$ comparisons, the whole process takes proportional to $N \log N$ comparisons.
It turns out we can actually make step 1 of heapsort run faster—proportional to $N$ comparisons—using a process called heapify. (Unfortunately, we can’t make step 2 run any faster than $N \log N$, so the overall heapsort must take $N \log N$ time.)
The algorithm for taking an arbitrary array and making it into a min (or max) heap in time proportional to $N$ is called heapify. Pseudocode for this algorithm is below:
def heapify(array): index = N / 2 while index > 0: bubble down item at index index -= 1
Conceptually, you can think of this as building a heap from the bottom up. To get a visualization of this algorithm working, click on the BuildHeap button on USFCA interactive animation of a min heap. This loads a pre-set array and then runs heapify on it.
Try to describe the approach in your own words. Why does the index start at the middle of the array rather than the beginning, 0, or the end, N? How does each bubble down operation maintain heap invariants?
It is probably not immediately clear to you why this heapify runs in $O(N)$. For those who are curious, you can check out an explanation on Wikipedia.