Heap

Intuition

A heap answers one question fast: what's the smallest (or largest) value in this collection? Not in O(n) by scanning, not in O(log n) by sorting — in O(1). The trick is to organize the data so that the answer is always sitting at a known spot: the root.

A binary heap is a complete binary tree with one rule, the heap property: every parent's value is its children's (min-heap) or its children's (max-heap). The root is therefore the global extremum — the smallest in a min-heap, the largest in a max-heap.

What makes heaps practical is that the tree fits inside an array. No node objects, no pointers — just index arithmetic. The node at index i has children at 2i + 1 and 2i + 2, and its parent is at (i − 1) // 2. The tree is implicit; the array is the data.

How It Works

The displayed code is a min-heap by default; a kind flag swaps the comparator to make it a max-heap.

Three operations:

peek() — return data[0]. The root. O(1).

insert(val) — append the value at the end of the array, then sift up: while the new value is smaller than its parent, swap with the parent and move up one level. Stop when you hit the root or when the parent is already smaller. O(log n).

extract_top() — read data[0] (the value to return), move the last element of the array into slot 0, shrink the array by one, then sift down: compare the root with its two children, swap with the smaller (for a min-heap), and recurse into that subtree. Stop when neither child is smaller. O(log n).

The "move last to root, then sift down" pattern is the elegant move. Removing the root would leave a hole; instead, we fill the hole with the last leaf (which is structurally easy to remove from a complete tree) and then push it down to where the heap property holds again.

Step By Step

Starting with an empty min-heap, insert(5, 3, 8, 1, 9, 2):

  1. insert(5) → array [5]. Single node, heap property trivially holds.
  2. insert(3) → append → [5, 3]. Sift up: child 3 < parent 5 → swap → [3, 5].
  3. insert(8) → append → [3, 5, 8]. Sift up: 8 > parent 3, stop. Heap is [3, 5, 8].
  4. insert(1) → append → [3, 5, 8, 1]. Sift up: 1 < parent 5 → swap → [3, 1, 8, 5]. 1 < parent 3 → swap → [1, 3, 8, 5].
  5. insert(9) → append → [1, 3, 8, 5, 9]. Sift up: 9 > parent 3, stop.
  6. insert(2) → append → [1, 3, 8, 5, 9, 2]. Sift up: 2 < parent 8 → swap → [1, 3, 2, 5, 9, 8]. 2 > parent 1, stop. Final heap:
             1
            / \
           3   2
          / \  /
         5  9 8
    
  7. extract_top() → return 1. Move last 8 to root → [8, 3, 2, 5, 9]. Sift down: children 3 and 2; smaller is 2 → swap → [2, 3, 8, 5, 9]. New children of 8 (at index 2) are out of bounds — stop. Result: 1 extracted, heap is now [2, 3, 8, 5, 9].

In the visualization, the heap renders as both an array row and a tree above it. Sift-up walks the active node upward along the parent edge, swapping each level. Sift-down walks downward, always choosing the smaller child.

Complexity

OperationTime
peekO(1)
insertO(log n)
extract_topO(log n)
Build heap from n items (heapify)O(n)

Space: O(n) for the array. No pointer overhead, no node allocation — the tree structure is implicit in array indexing.

The O(n) heapify is the surprising one. Building a heap by repeated insert is O(n log n). But there's a clever bottom-up build — sift-down on each internal node from the last to the first — that runs in O(n) overall, because the deep nodes (where most of the work could happen) only have shallow subtrees to sift through.

Heaps are the engine behind:

  • Priority queues — schedule the next event with the earliest timestamp.
  • Heap sort — repeatedly extract the root into the back of the array. O(n log n), in place, not stable.
  • Top-K queries — maintain a max-heap of size K; for each new element, push and pop. O(n log K).
  • Dijkstra's algorithm — pull the next node by shortest tentative distance.

Edge Cases

  • Empty heappeek returns None. extract_top raises underflow.
  • Single-element heappeek returns that element. extract_top returns it and leaves the heap empty; the "move last to root" branch is skipped because there's nothing to move.
  • Duplicates — heaps allow duplicates. Equal values can sit parent-to-child in either order — the heap property is , not <.
  • Building from an already-sorted array — already a valid heap (every parent precedes its children in sorted order, which implies the heap property for a min-heap). Heapify on sorted input does zero work.
  • Insertion at index 0 — the visualization handles this via sift-up, which terminates immediately when i = 0 (parent index would be negative).

Common Mistakes

  • Treating a heap as sorted. It isn't. A heap guarantees the root is the extremum and the heap property holds, but siblings and cousins can be in any order. Iterating data[0..n − 1] does not give sorted output — extracting one at a time does.
  • Forgetting to sift after a modification. Changing the value of a node mid-tree violates the heap property in both directions potentially. You need a "sift in whichever direction restores the property" routine (decrease-key for min-heaps).
  • Picking the wrong child on sift down. For a min-heap you must swap with the smaller child, not "either child". Swapping with the larger one re-violates the heap property and the sift loop won't terminate correctly.
  • Mismatched comparator direction. Min-heap and max-heap differ only in the comparator. Flipping < to > (or vice versa) silently turns a min-heap into a max-heap and breaks every caller expecting the other ordering.
  • Implementing the tree with pointers. It works, but it gives up the array's cache locality and adds O(n) pointer overhead. The array-indexed form is the canonical heap for a reason — array access is fast, contiguous, and free of allocation.
  • Reaching for a heap when a BST is what you need. Heaps don't support efficient search-by-value or in-order traversal. If your use case is "find this specific key, or list everything in order", you want a balanced BST. Heaps win only when the access pattern is purely "give me the extremum".