# Heaps

## Introduction

A *heap* is a tree data structure where a *heap property* is fulfilled:

- For a
*max-heap*, every node’s value is greater than or equal to each of their immediate children. - For a
*min-heap*, every node’s value is lesser than or equal to their immediate children.

A *binary heap* is a heap with the form of a binary tree that also fulfills the *shape property*, requiring the tree to also form a *complete binary tree*, meaning:

- all levels except the last are completely filled, and
- the last level is filled from left to right.

Heaps are commonly used to implement the __ priority queue__ ADT.

Binary heaps are commonly efficiently implemented using an array. Notably, each row appears in the array in order from left to right. Since the binary heap is a complete binary tree, the array is a compact representation with no missing nodes before the last node.

Example of a *binary max-heap* data structure (colours are used only to help show structure):

Some common operations:

Operation | Runtime | Basic Summary |
---|---|---|

$\text{Push}\parens{\textit{item}}$ | $O\parens{\log n}$ | Add a single item. |

$\text{Pop}\parens{} \to \textit{item}$ | $O\parens{\log n}$ | Extract and remove the root of the heap (i.e. the biggest item of a max-heap, or the smallest item of a min-heap). |

$\text{Push-Pop}\parens{\textit{item}} \to \textit{item}$ | $O\parens{\log n}$ | Equivalent to a push and then a pop, in that order. Typically more efficient than using the individual operations. |

$\text{Pop-Push}\parens{\textit{item}} \to \textit{item}$ | $O\parens{\log n}$ | Equivalent to a pop and then a push, in that order. Typically more efficient than using the individual operations. |

$\text{Heapify}\parens{}$ | $O\parens{n}$ | Convert an arbitrary complete binary tree into a valid heap. |

$\text{Get-Size}\parens{} \to \textit{size}$ | $O\parens{1}$ | Get the number of items in the heap. |

(internal) $\text{Sift-Up}\parens{\textit{node}}$ | $O\parens{\log n}$ | Moves a node up the tree until it is in a valid position. |

(internal) $\text{Sift-Down}\parens{\textit{node}}$ | $O\parens{\log n}$ | Moves a node down the tree until it is in a valid position. |

## Array Indexing

For __0-indexed arrays__, given an index $i$, we can find the indices of its left child $c_1$, right child $c_2$, and parent $p$ with:

To quickly derive these equations, let’s try using a 1-indexed array:

With 1-indexing, the pattern is much more obvious. Using $i'$, $c'_1$, $c'_2$, and $p'$ as our 1-indexed array indices, we observe that the following are true:

To convert back to 0-indexing, we apply $i' = i + 1$ and such:

## Push and Sift-Up

The *push* operation inserts a new item into the heap. The algorithm:

- Add the new item to the next free space while still forming a complete binary tree.
- Run
*sift-up*starting from the new node to restore the heap property.

The *sift-up* operation moves a node up the tree until it is in the correct position. The algorithm:

- Compare the current node with its parent. If they are in the correct order, then we are done.
- Otherwise, swap them, move to the new position of our node, and repeat
*sift-up*.

Both algorithms have $O\parens{\log n}$ runtime.

Sample implementation:

```
def hpush(heap, item):
heap.append(item)
sift_up(heap, len(heap) - 1)
def sift_up(heap, i):
parent = (i - 1) >> 1
if (i == 0) or (heap[parent] > heap[i]):
return
(heap[parent], heap[i]) = (heap[i], heap[parent])
sift_up(heap, parent)
```

Sample driver code:

```
# Start with a valid heap
heap = [94, 87, 81, 57, 68, 74, 5, 35, 36, 29, 41, 18]
# We push 92 to the heap
hpush(heap, 92)
print(heap) # [94, 87, 92, 57, 68, 81, 5,
# 35, 36, 29, 41, 18, 74]
```

TODO: Provide a visual explainer of what’s happening?
## Pop and Sift-Down

The *pop* operation extracts and removes the root of the heap. For a max-heap, the extracted value is the largest item, or for a min-heap, the extracted value is the smallest item. The algorithm:

- Store the current root item. This will be returned later.
- Move the bottom-leftmost item into the original place of root.
- Run
*sift-down*starting from the new root to restore the heap property.

The *sift-down* operation moves a node down the tree until it is in the correct position. The algorithm:

- If our current node is in the correct order with its children, then we are done.
- Otherwise, swap the current node and one of the children it’s out-of-order with. For a max-heap, we must swap with the larger child, and for a min-heap, we must swap with the smaller child, otherwise these nodes will still be out of order.
- Move to our node’s new location, and repeat
*sift-down*.

Both algorithms have $O\parens{\log n}$ runtime.

Sample implementation:

```
# PRECONDITION: len(heap) > 0
def hpop(heap):
to_return = heap[0]
heap[0] = heap.pop()
sift_down(heap, 0)
return to_return
def sift_down(heap, i):
left = (i << 1) + 1
if left >= len(heap):
return
right = left + 1
if (right >= len(heap)) or (heap[left] > heap[right]):
next_child = left
else:
next_child = right
if heap[next_child] <= heap[i]:
return
(heap[next_child], heap[i]) = (heap[i], heap[next_child])
sift_down(heap, next_child)
```

Sample driver code:

```
# Start with a valid heap
heap = [94, 87, 81, 57, 68, 74, 5, 35, 36, 29, 41, 18]
result = hpop(heap)
print(result) # 94
print(heap) # [87, 68, 81, 57, 41, 74, 5, 35, 36, 29, 18]
```

## Push-Pop and Pop-Push

The *push-pop* operation is equivalent to a *push* and then a *pop*, in that order. The algorithm:

- If our new item is greater than or equal to the current root (for a max-heap) or less than or equal to the current root (for a min-heap), then we are done. No need to do any further operations on the heap. The new item is also the popped item.
- Store the current root. This will be returned later.
- Replace the root node with the new item.
- Perform
*sift-down*starting from the root node.

The *pop-push* operation is the other way around (a *pop* followed by a *push*). The algorithm:

- Store the current root. This will be returned later.
- Replace the root node with the new item.
- Perform
*sift-down*starting from the root node.

*Push-pop* and *pop-push* are typically more efficient than using the individual *push* and *pop* operations together. Both operations have $O\parens{\log n}$ runtime.

Sample implementation and driver code for *push-pop*:

```
# PRECONDITION: len(heap) > 0
def hpushpop(heap, item):
if item < heap[0]:
(item, heap[0]) = (heap[0], item)
sift_down(heap, 0)
return item
# sift_down is defined elsewhere.
# A valid heap
template = [94, 87, 81, 57, 68, 74, 5, 35, 36, 29, 41, 18]
heap = template.copy()
result = hpushpop(heap, 37)
print(result) # 94
print(heap) # [87, 68, 81, 57, 41, 74, 5, 35, 36, 29, 37, 18]
heap = template.copy()
result = hpushpop(heap, 95)
print(result) # 95
print(heap) # [94, 87, 81, 57, 68, 74, 5, 35, 36, 29, 41, 18]
```

Sample implementation and driver code for *pop-push*:

```
# PRECONDITION: len(heap) > 0
def hpoppush(heap, item):
(item, heap[0]) = (heap[0], item)
sift_down(heap, 0)
return item
# sift_down is defined elsewhere.
# The same valid heap as the push-pop example
template = [94, 87, 81, 57, 68, 74, 5, 35, 36, 29, 41, 18]
heap = template.copy()
result = hpoppush(heap, 37)
print(result) # 94
print(heap) # Same as for hpushpop(heap, 37)
heap = template.copy()
result = hpoppush(heap, 95)
print(result) # 94
print(heap) # [95, 87, 81, 57, 68, 74, 5, 35, 36, 29, 41, 18]
```

TODO: Provide a visual explainer of what’s happening?
## Heapify

The *heapify* operation converts an arbitrary *complete binary tree* into a valid heap. The algorithm simply runs *sift-down* on array nodes $\brackets{\floor{n / 2}, \dots, 1, 0}$, in that order.

This algorithm runs in $O\parens{n}$.

Sample implementation:

```
def heapify(heap):
for i in range(len(heap) >> 1, 0, -1):
sift_down(heap, i - 1)
```

Sample driver code:

```
# Start with an arbitrary complete binary tree
heap = [18, 35, 74, 36, 29, 81, 5, 57, 87, 68, 41, 94]
heapify(heap)
print(heap) # [94, 87, 81, 57, 68, 74, 5, 35, 36, 29, 41, 18]
```

TODO: Provide a visual explainer of what’s happening?

TODO: Explain the naive algorithm of repeated push operations, and another alternative algorithm that uses sift-up. Explain why sift-down is the best algorithm.

## Further Reading

TODO: Reference heap sort

TODO: Standard implementations?

## References

**Wikipedia: Heap (data structure)**: The worded definition was taken from here.**Wikipedia: Binary Heap**: Algorithms were taken from here.