Binary Search Tree
Intuition
A binary search tree (BST) arranges values as a tree of nodes with one strict rule: every value in a node's left subtree is smaller than the node, every value in its right subtree is larger. That rule — the BST invariant — turns "find a value" into "walk one branch at a time": go left when too big, go right when too small, stop when equal.
If the tree is reasonably balanced, that walk is O(log n), just like
binary search on a sorted array. The difference is that a BST also
supports O(log n) inserts and deletes — the array's "stay sorted"
cost is O(n), the BST's is one pointer rewiring.
How It Works
Three operations, all built on the same "walk left if smaller, right if larger" loop.
insert(value) — walk from the root following the BST rule. When
the next slot you'd step into is None, attach a new node there.
Duplicates: the displayed code returns silently without inserting.
search(value) — same walk. Return the node when values match,
return None when you walk off the end.
delete(value) — find the node, then handle three cases:
- Leaf (no children) — just remove it. The parent's pointer
becomes
None. - One child — splice the node out by promoting its only child into its position. The parent's pointer now bypasses the deleted node and lands on its child.
- Two children — replace the deleted node's value with its
in-order successor: the smallest value in its right subtree.
That successor lives at the leftmost node reachable by going
rightonce, thenleftrepeatedly. Once copied, delete the successor from the right subtree — which is guaranteed to be a leaf or one-child case, since the leftmost node has no left child by construction.
The two-children case is the subtle one. Why the in-order successor? Because it's the smallest value still larger than everything in the left subtree and largest still smaller than everything else in the right subtree — exactly the BST invariant the deleted node was maintaining. (Using the predecessor — largest of left subtree — works equally well, and the implementation is symmetric.)
The displayed code uses recursion for delete (_delete returns the
rewired subtree root). That's the cleanest form, but the same logic
can be written iteratively with parent pointers.
Step By Step
Starting empty, insert 50, 30, 70, 20, 40, 60, 80:
insert(50)→ root isNone, so50becomes the root.insert(30)→ walk from root50.30 < 50, go left. Left isNone→ attach as left child of50.insert(70)→ walk from50.70 > 50, go right. Right isNone→ attach.insert(20)→20 < 50left →20 < 30left →None→ attach as left child of30.- After all seven inserts, the tree is:
50 / \ 30 70 / \ / \ 20 40 60 80 search(40)→40 < 50left →40 > 30right → match.delete(20)→ leaf case.30.left = None.delete(30)→ one-child case (only40remains as30's child). Splice:50.left = 40.delete(50)(root, two children) → find in-order successor: right once to70, then left repeatedly — but70.left = 60,60.left = None. Successor is60. Copy60's value into the root, then recursively delete60from the right subtree (which is now the leaf case under70). Final tree:60 / \ 40 70 \ 80
In the visualization, the active walk path lights up as the cursor descends. Inserts highlight the "would-be" ghost slot before the attach. Deletes pulse the doomed node, then either prune (leaf), splice (one child), or copy-and-splice (two children) with explicit "find successor" walk into the right subtree.
Complexity
| Case | Time |
|---|---|
| Insert, search, delete — balanced | O(log n) |
| Insert, search, delete — worst case (degenerate) | O(n) |
Space: O(n) for the nodes themselves, plus O(h) recursion stack
for the recursive delete (where h is tree height — log n
balanced, n degenerate).
The catch: a vanilla BST does not self-balance. If you insert values
in already-sorted order — 1, 2, 3, 4, 5 — every node attaches to
the right of the previous one, producing a chain. The tree degenerates
into a linked list and every operation collapses to O(n).
This is what motivates self-balancing BSTs: AVL trees, red-black
trees, treaps, splay trees, B-trees. They all add bookkeeping on
top of the basic BST operations to guarantee O(log n) even
adversarial inputs. Most production "tree map" / "tree set" libraries
are red-black trees under the hood (std::map in C++,
TreeMap in Java).
Edge Cases
- Empty tree —
searchreturnsNoneimmediately.insertmakes the new node the root.deleteis a no-op. - Single-node tree, delete root — leaf case. Root becomes
None. - Insert duplicate — the displayed code silently skips. Other
conventions: count occurrences in the node, or store duplicates
in the right subtree (treat
>=as right). Pick one and stick with it — mixed conventions break search. - Delete the root — works the same as any other node; the recursive form just returns the rewired subtree as the new root.
- Delete a value not in the tree — search walks off the end,
recursive
_deletereturnsNoneup the chain, the parent's pointer is rewritten to its current value (a no-op). Idempotent.
Common Mistakes
- Forgetting the two-children case. Inserts are obvious; searches are obvious; the two-children delete is the one that catches everyone. You can't just "delete" the node — its children would be orphaned. You must promote a successor (or predecessor) value into its slot.
- Using a left-subtree value as the successor. The in-order successor lives in the right subtree (smallest of values larger than the deleted node). Picking from the left subtree violates the BST invariant.
- Picking a wrong-side leaf as the successor. The successor must be the leftmost of the right subtree, not any leaf. Picking an arbitrary leaf can put a value smaller than the deleted node's left subtree into the deleted node's slot.
- Inserting sorted data and wondering why it's slow. Sorted input produces a degenerate tree. If your insert order isn't randomized, you need a self-balancing variant.
- Confusing BST with heap. Both are binary trees, but their
invariants are different. BSTs are ordered left-to-right; heaps
are ordered top-to-bottom. A BST supports ordered traversal and
range queries; a heap supports
find-min/find-maxinO(1)but not ordered iteration.