HeapSort

Zhao Cong

Heapsort. Like merge sort, but unlike insertion sort, heapsort’s running time is . Like insertion sort, but unlike merge sort, heapsort sorts in place: only a constant number of array elements are stored outside the input array at any time.

Not only is the heap data structure useful for heapsort, but it also makes an efficient priority queue

Heaps

  • The (binary) heap data structure is an array object that we can view as a nearly complete binary tree
  • An array A that represents a heap is an object with two attributes:
    • A:length, which (as usual) gives the number of elements in the array, and
    • A:heap-size, which represents how many elements in the heap are stored within array A.
  • The root of the tree is , and given the index of a node, we can easily compute the indices of its parent, left child, and right child:
    • PARENT(i):
    • LEFT(i):
    • RIGHT(i):
  • Good implementations of heapsort often implement these procedures as “macros” or “inline” procedures.≠
  • There are two kinds of binary heaps: max-heaps and min-heaps
  • max-heap, the max-heap property:
  • min-heap, the min-heap property:
  • For the heapsort algorithm, we use max-heaps. Min-heaps commonly implement priority queues
  • Viewing a heap as a tree, we define the height of a node in a heap to be the number of edges on the longest simple downward path from the node to a leaf, and we define the height of the heap to be the height of its root.
  • Since a heap of elements is based on a complete binary tree, its height is

Maintaining the heap property

  • MAX-HEAPIFY lets the value at float down” in the max-heap so that the subtree rooted at index i obeys the max-heap property.
    1
    2
    3
    4
    5
    6
    7
    8
    9
    10
    11
    MAX-HEAPIFY (A, i)
    1 l = LEFT(i)
    2 r = RIGHT(i)
    3 if l ≤ A.heap-size and A[l] > A[i]
    4 largest = l
    5 else largest = i
    6 if r ≤ A.heap-size and A[r] > A[largest]
    7 largest =r
    8 if largest ≠ i
    9 exchange A[i] with A[largest]
    10 MAX-HEAPIFY (A, Largest)
  • Consequently, we call MAX-HEAPIFY recursively on that subtree.
  • The solution to this recurrence, by case 2 of the master theorem , is Alternatively, we can characterize the running time of MAX-HEAPIFY on a node of height h as

Building a heap

1
2
3
4
BUILD-MAX-HEAP(A)
1 A.heap-size = A.length
2 for i = [A.length/2] downto 1
3 MAX-HEAPIFY(A,i)

The time required by MAX-HEAPIFY when called on a node of height h is , and so we can express the total cost of BUILD-MAX-HEAP as being bounded from above by We evaluate the last summation by substituting in the formula (A.8), yielding Thus, we can bound the running time of BUILD-MAX-HEAP as Hence, we can build a max-heap from an unordered array in linear time.

The heapsort algorithm

1
2
3
4
5
6
HEAPSORT(A) 
1 BUILD-MAX-HEAP(A)
2 for i = A.length downto 2
3 exchange A[1] with A[i]
4 A.heap-size = A.heap-size - 1
5 MAX-HEAPIFY (A, 1)

The HEAPSORT procedure takes time

Priority queues

  • As with heaps, priority queues come in two forms: max-priority queues and min-priority queues
  • A priority queue is a data structure for maintaining a set S of elements, each with an associated value called a key.
  • a max-priority queue supports the operations INSERT, MAXIMUM, EXTRACT-MAX, and INCREASE-KEY
  • a min-priority queue supports the operations INSERT, MINIMUM, EXTRACT-MIN, and DECREASE-KEY
1
2
HEAP-MAXIMUM (A) 
return A[1]
1
2
3
4
5
6
7
8
HEAP-EXTRACT-MAX (A) 
1 if A.heap-size < 1
2 error "heap underflow"
3 max = A[1]
4 A[1] = A[A.heap-size]
5 A.heap-size = A.heap-size -1
6 MAX-HEAPIFY (A,1)
7 return max

1
2
3
4
5
6
7
HEAP-INCREASE-KEY (A, i,key) 
1 if key < A[i]
2 error “new key is smaller than current key”
3 A[i]= key
4 while i > 1 and A[PARENT(i)]< A[i]
5 exchange A[i] with A[PARENT(i)]
6 i = PARENT(i)
1
2
3
4
MAX-HEAP-INSERT(A, key) 
1 A.heap-size = A.heap-size + 1
2 A[A.heap-size] = -∞
3 HEAP-INCREASE-KEY (A, A.heap-size,key)