Heap Sort
Intuition
Heap sort rests on one observation: any array is already a complete binary
tree. Slot 0 is the root, and the children of slot i live at 2i + 1
and 2i + 2 — no pointers, no extra memory, just index arithmetic. If we
rearrange the array so every parent is at least as large as its children (a
max-heap), the largest value is always sitting at slot 0. Sorting then
becomes a loop of cheap moves: swap the root with the last heap slot, declare
that slot finished, and let the new root sink back to its proper depth. The
array eats itself from the right — the heap region shrinks while the sorted
tail grows — and no auxiliary array ever appears.
How It Works
The algorithm has two phases that share one workhorse, sift_down: given a
subtree whose children are already valid heaps, it compares the root with its
larger child, swaps downward if that child is bigger, and repeats until the
heap property holds. A node can fall at most the height of the tree, so one
sift costs at most ⌊log₂ n⌋ swaps.
Phase 1 — build. Walk the parents from the last parent
(⌊n/2⌋ − 1) back to the root, sifting each one down. Going bottom-up means
every subtree below the current node is already a heap, so a single pass
suffices. Counterintuitively this whole phase costs only O(n): most nodes sit
near the bottom and have almost no room to fall.
Phase 2 — sort. For end = n − 1 down to 1: swap the root (the
maximum of the remaining heap) with arr[end], shrink the heap by one so that
slot is final, and sift the new root down within the smaller heap. Each of the
n − 1 extractions is one swap plus at most a log-depth sift — O(n log n)
total, in every case.
The animation shows both views of the same array at once: circles in a tree
(the view of meaning) and cells in a row (the view of storage). Every swap
moves in both views simultaneously, and the heap | sorted divider slides
left one slot per extraction.
Step By Step
With the default input [38, 12, 56, 7, 41, 23, 5] (n = 7):
- View as tree. The seven cells rise into a tree: 38 at the root, 12 and 56 below it, then 7, 41, 23, 5. Nothing moves in the array — the tree IS the array.
- Build,
i = 2. Last parent is⌊7/2⌋ − 1 = 2(value 56). Children 23 and 5 are smaller — already settled. - Build,
i = 1. 12 vs children 7 and 41: 41 is larger, swap. 12 lands on a leaf — settled. Array:[38, 41, 56, 7, 12, 23, 5]. - Build,
i = 0. 38 vs children 41 and 56: 56 wins, swap. 38's new children 23 and 5 are smaller — settled. The heap is built:[56, 41, 38, 7, 12, 23, 5], maximum at the root. - Extract #1. Swap root 56 with
arr[6] = 5; slot 6 is final and the divider appears. Now 5 must sift from the root: 41 is the larger child — swap; then 12 beats 5 — swap again. Two levels of sinking — this is the ⌊log₂ n⌋ cost made visible. Heap:[41, 12, 38, 7, 5, 23]. - Extracts #2–#6 repeat the same rhythm — one swap to the boundary, one
shrink, a short sift — finalizing 41, 38, 23, 12, then 7. The remaining
5 is trivially sorted, leaving
[5, 7, 12, 23, 38, 41, 56].
Watch the divider: every extraction moves it exactly one slot left, and the
green tail never reorders again. And try the contrast input
56, 41, 38, 23, 19, 12, 7 — a reverse-sorted array is already a max-heap,
so the entire build phase performs zero swaps.
Complexity
| Case | Comparisons | Space |
|---|---|---|
| Best | O(n log n) | O(1) |
| Average | O(n log n) | O(1) |
| Worst | O(n log n) | O(1) |
Build-heap alone is O(n). Unlike quicksort there is no bad input — the heap's height bounds every sift — and unlike merge sort there is no auxiliary array. The trade-off: heap sort is not stable (equal values can pass each other during long-distance swaps), and its scattered index jumps are cache-unfriendly in practice.
Edge Cases
- Empty or single-element array — both loops run zero times; the array is already sorted.
- Reverse-sorted (descending) input — already a max-heap; build does no swaps, but the sort phase still does its full n log n work. Heap sort is not adaptive.
- Already-sorted (ascending) input — close to the worst case for build: small values sit near the root and must sink far.
- Duplicates — ties choose the parent (strict
>comparison), so equal values cause no extra swaps; their relative order may still change (instability).
Common Mistakes
- Building top-down instead of bottom-up. Sifting parents from the root
downward doesn't work — a child subtree that isn't a heap yet can push a
wrong value up. Start at the last parent
⌊n/2⌋ − 1and walk backwards. - Forgetting to shrink the heap. Calling
sift_down(arr, n, 0)instead ofsift_down(arr, end, 0)after the swap pulls already-sorted values back into the heap. - Comparing with only one child. The root must swap with the larger of its two children; picking the left child unconditionally breaks the heap property on the other side.
- Off-by-one in the child indices. With 0-based slots the children are
2i + 1and2i + 2; using2isilently builds a 1-based heap on a 0-based array. - Stopping the sort loop at
end = 0. The last remaining element is necessarily the minimum — looping down to1is enough.
These lessons are still being refined and may contain mistakes.