What is a Heap?
A heap is a tree-based data structure in which every node satisfies a heap property. A binary heap (from now on referred to simply as a heap) is a complete binary tree: a tree in which every level is completely filled except possibly the last, and the nodes in the last level are as far to the left as possible. Each node of a heap contains a key, which determines the relative position of that node inside the heap.
Two Types of Heaps and their Properties:
There are two types of heaps as follows:
- Max-Heap: Every node in this heap satisfies the property that the key of a parent node is greater than or equal to the keys of its child nodes. This article discusses max-heaps.
- Min-Heap: Every node in this heap satisfies the property that the key of a parent node is less than or equal to the keys of its child nodes.
One significant structural property of a heap (applicable to both min-heaps and max-heaps) is that its height is ⌊lg(n)⌋, where n is the number of nodes in the heap, lg is the base-2 logarithm, and height is counted as the number of edges on the longest path from the root to a leaf. This property follows from the fact that a heap is a complete binary tree, and it is used often when analyzing the operations performed on a heap and their complexity.
Proof that height of a heap (or a complete binary tree) is given by ⌊lg(n)⌋:
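The derivation itself is missing here, so what follows is a sketch of the standard counting argument rather than the author's original proof. It assumes h denotes the height (in edges) of a complete binary tree with n nodes, so that levels 0 through h - 1 are full and level h holds between 1 and 2^h nodes.

```latex
\begin{align*}
n &\ge \Bigl(\sum_{k=0}^{h-1} 2^{k}\Bigr) + 1 = (2^{h} - 1) + 1 = 2^{h}, \\
n &\le \sum_{k=0}^{h} 2^{k} = 2^{h+1} - 1 < 2^{h+1}, \\
2^{h} \le n < 2^{h+1}
  &\;\Longrightarrow\; h \le \lg n < h + 1
  \;\Longrightarrow\; h = \lfloor \lg n \rfloor .
\end{align*}
```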
[Insert image of min-heap here]
Implementation of Heaps using Arrays:
Generally, trees are implemented using pointers, and so heaps can also be designed using pointers. However, we know that heaps are complete binary trees and by leveraging this property, we can represent a heap using an array. For the sake of this article, we will be indexing our arrays from 1 rather than from 0.
Assume an integer array A is used to represent the heap (shown in figure 1), with each element of the array corresponding to a node of the heap. This array has two attributes, A.size and A.length, which represent the number of elements currently in the heap and the maximum number of elements that can be stored in the heap, respectively.
An array containing the elements shown in the heap in figure 1 will look like:
Thus, A[1] represents the root of the heap. This array representation lets us reach the left child, right child, and parent of any node with simple arithmetic on its index. For a node at index i in the array A, the index of its parent is ⌊i/2⌋, and the indices of its left child and right child are (2*i) and (2*i + 1), respectively. In most programming languages, these operations can be implemented efficiently with bit shifts (i >> 1, i << 1, and (i << 1) | 1). The array representation is also space-efficient, as we don’t need to store three extra pointers (parent, left child, right child) per node.
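The index arithmetic described above can be sketched in a few lines of Python. The function names parent, left, and right are chosen here for illustration, and the code assumes the 1-based indexing used in this article.

```python
def parent(i: int) -> int:
    """Index of the parent of node i (floor(i / 2)); returns 0 for the root."""
    return i >> 1


def left(i: int) -> int:
    """Index of the left child of node i (2 * i)."""
    return i << 1


def right(i: int) -> int:
    """Index of the right child of node i (2 * i + 1)."""
    return (i << 1) | 1
```

For example, the node at index 5 has its parent at index 2 and its children at indices 10 and 11. A child index is only meaningful when it does not exceed A.size, and the root (index 1) has no parent.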
Thus, for a max-heap, A[i] ≥ A[2*i] and A[i] ≥ A[2*i + 1] for every index i, with each inequality applying only when the child index is ≤ A.size. Because the key of each node in a max-heap is greater than or equal to the keys of its children, the maximum key in the heap is stored at the root, that is, at A[1].
Similarly, a min-heap satisfies the property that for every index i, A[i] ≤ A[2*i] and A[i] ≤ A[2*i + 1], again only for child indices ≤ A.size. Thus, for a min-heap, the minimum element is at the root of the heap, that is, at A[1].
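As a rough illustration of these two inequalities, the following Python sketch checks whether an array satisfies the max-heap or min-heap property. The function names and the sample array are hypothetical (the sample is not the heap from figure 1), and index 0 is left unused so that indices start at 1 as in this article.

```python
def is_max_heap(a, size):
    """True if a[1..size] satisfies a[parent] >= a[child] for every child."""
    # Checking each non-root node against its parent covers both inequalities.
    return all(a[i // 2] >= a[i] for i in range(2, size + 1))


def is_min_heap(a, size):
    """True if a[1..size] satisfies a[parent] <= a[child] for every child."""
    return all(a[i // 2] <= a[i] for i in range(2, size + 1))


# Hypothetical sample data; a[0] is padding so the root sits at a[1].
sample = [None, 16, 14, 10, 8, 7, 9, 3, 2, 4, 1]
```

With this sample, is_max_heap(sample, 10) holds and the largest key, 16, sits at sample[1], while is_min_heap(sample, 10) fails at the very first parent-child pair.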
Operations Supported by Max-Heap:
The operations supported by a max-heap are described below at a high level, and will be covered in more detail in subsequent articles:
- FindMax(Array A): This operation returns the maximum value in the heap and its time complexity is O(1) as it just needs to return A[1].
- ExtractMax(Array A): This operation removes the maximum element from the heap and returns it. Its time complexity is O(lg n): we replace A[1] with A[A.size], the last element of the heap, shrink the heap by one, and then move the new root down, swapping it with its larger child until the max-heap property holds again. The number of swaps is bounded by the height of the heap.
- IncreaseKey(Array A, Index i, Integer v): This operation increases the value at index i in the array to v. It is only valid if A[i] ≤ v, that is, if the new value is greater than or equal to the existing value at index i. This ensures that the subtree rooted at index i is still a max-heap. Its complexity is O(lg n): after the key at index i is increased, the max-heap property may be violated at Parent(i), and we may need to swap the element upward with its ancestors, possibly all the way to the root, to restore it.
- InsertKey(Array A, Integer v): This operation inserts the element v into the heap, and its complexity is O(lg n). To implement it, we increment A.size, place the new element at the end of the heap (at the new A[A.size]), and then move it upward until the heap property is restored.
- DeleteKey(Array A, Index i): This operation deletes the element at index i, and its complexity is O(lg n). To delete an element, we replace it with the last element of the heap, shrink the heap by one, and then move the replacement up or down as needed to restore the heap property in case it is violated.
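The operations listed above can be sketched in Python as follows. This is a minimal illustration rather than a definitive implementation: the class name MaxHeap, the method names, and the helpers _sift_up and _sift_down are assumptions made here. A growable Python list stands in for the fixed-capacity array, so A.length is not modeled, and index 0 is left unused to keep the article's 1-based indexing.

```python
class MaxHeap:
    def __init__(self):
        self.a = [None]  # a[0] is unused; the heap occupies a[1..size]

    @property
    def size(self):
        return len(self.a) - 1

    def _sift_up(self, i):
        # Swap a[i] with its parent while it is larger than the parent.
        while i > 1 and self.a[i // 2] < self.a[i]:
            self.a[i // 2], self.a[i] = self.a[i], self.a[i // 2]
            i //= 2

    def _sift_down(self, i):
        # Swap a[i] with its larger child while a child is larger than it.
        while True:
            largest = i
            for child in (2 * i, 2 * i + 1):
                if child <= self.size and self.a[child] > self.a[largest]:
                    largest = child
            if largest == i:
                return
            self.a[i], self.a[largest] = self.a[largest], self.a[i]
            i = largest

    def find_max(self):
        if self.size == 0:
            raise IndexError("heap is empty")
        return self.a[1]

    def extract_max(self):
        top = self.find_max()
        last = self.a.pop()          # remove the last element
        if self.size > 0:
            self.a[1] = last         # move it to the root, then repair
            self._sift_down(1)
        return top

    def increase_key(self, i, v):
        if not 1 <= i <= self.size:
            raise IndexError("index out of range")
        if v < self.a[i]:
            raise ValueError("new key is smaller than the current key")
        self.a[i] = v
        self._sift_up(i)

    def insert_key(self, v):
        self.a.append(v)             # new element becomes the last node
        self._sift_up(self.size)

    def delete_key(self, i):
        if not 1 <= i <= self.size:
            raise IndexError("index out of range")
        last = self.a.pop()
        if i <= self.size:           # i was not the last slot
            self.a[i] = last
            self._sift_up(i)         # at most one of these two calls
            self._sift_down(i)       # actually moves the element
```

Each public method except find_max follows at most one root-to-leaf or leaf-to-root path, which is where the O(lg n) bounds come from. For instance, after inserting 3, 1, and 4 into an empty MaxHeap, find_max() returns 4.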
Applications of Heap:
- Heaps are used to efficiently implement priority queues, an important data structure in computer science. One application of priority queues is process scheduling in operating systems.
- Heaps are used by the Heapsort algorithm, a comparison sort that runs in O(n * lg n) time in the worst case and sorts the array in place.
- Heaps are also used in efficient implementations of algorithms such as Dijkstra’s shortest-path algorithm, which repeatedly picks the unvisited node closest to the source node. If the tentative distances to the nodes are stored in a min-heap, the closest node can be extracted efficiently.
- Heaps provide an efficient way to get the kth smallest or largest element in an array.
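Two of the applications above, heap-based sorting and selecting the kth smallest element, can be tried out with Python's standard heapq module, which provides a min-heap on top of a list. The function names below are illustrative, and the sorting function is a simplified sketch: unlike the in-place Heapsort described in CLRS, it copies the input and builds a new output list.

```python
import heapq


def heap_sorted(values):
    """Return the values in ascending order by repeatedly popping a min-heap."""
    heap = list(values)
    heapq.heapify(heap)  # builds the heap in O(n)
    return [heapq.heappop(heap) for _ in range(len(heap))]


def kth_smallest(values, k):
    """Return the kth smallest value (k = 1 gives the minimum)."""
    if not 1 <= k <= len(values):
        raise ValueError("k must be between 1 and len(values)")
    return heapq.nsmallest(k, values)[-1]
```

For example, kth_smallest([7, 10, 4, 3, 20, 15], 3) returns 7, since the sorted order is 3, 4, 7, 10, 15, 20. The kth largest element can be found the same way with heapq.nlargest.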
References: Introduction to Algorithms (CLRS)