Binary Search Tree

Intuition

A binary search tree (BST) arranges values as a tree of nodes with one strict rule: every value in a node's left subtree is smaller than the node, every value in its right subtree is larger. That rule — the BST invariant — turns "find a value" into "walk one branch at a time": go left when too big, go right when too small, stop when equal.

If the tree is reasonably balanced, that walk is O(log n), just like binary search on a sorted array. The difference is that a BST also supports O(log n) inserts and deletes — the array's "stay sorted" cost is O(n), the BST's is one pointer rewiring.

How It Works

Three operations, all built on the same "walk left if smaller, right if larger" loop.

insert(value) — walk from the root following the BST rule. When the next slot you'd step into is None, attach a new node there. Duplicates: the displayed code returns silently without inserting.

search(value) — same walk. Return the node when values match, return None when you walk off the end.

delete(value) — find the node, then handle three cases:

  1. Leaf (no children) — just remove it. The parent's pointer becomes None.
  2. One child — splice the node out by promoting its only child into its position. The parent's pointer now bypasses the deleted node and lands on its child.
  3. Two children — replace the deleted node's value with its in-order successor: the smallest value in its right subtree. That successor lives at the leftmost node reachable by going right once, then left repeatedly. Once copied, delete the successor from the right subtree — which is guaranteed to be a leaf or one-child case, since the leftmost node has no left child by construction.

The two-children case is the subtle one. Why the in-order successor? Because it's the smallest value still larger than everything in the left subtree and largest still smaller than everything else in the right subtree — exactly the BST invariant the deleted node was maintaining. (Using the predecessor — largest of left subtree — works equally well, and the implementation is symmetric.)

The displayed code uses recursion for delete (_delete returns the rewired subtree root). That's the cleanest form, but the same logic can be written iteratively with parent pointers.

Step By Step

Starting empty, insert 50, 30, 70, 20, 40, 60, 80:

  1. insert(50) → root is None, so 50 becomes the root.
  2. insert(30) → walk from root 50. 30 < 50, go left. Left is None → attach as left child of 50.
  3. insert(70) → walk from 50. 70 > 50, go right. Right is None → attach.
  4. insert(20)20 < 50 left → 20 < 30 left → None → attach as left child of 30.
  5. After all seven inserts, the tree is:
             50
            /  \
          30    70
         /  \   /  \
        20  40 60  80
    
  6. search(40)40 < 50 left → 40 > 30 right → match.
  7. delete(20) → leaf case. 30.left = None.
  8. delete(30) → one-child case (only 40 remains as 30's child). Splice: 50.left = 40.
  9. delete(50) (root, two children) → find in-order successor: right once to 70, then left repeatedly — but 70.left = 60, 60.left = None. Successor is 60. Copy 60's value into the root, then recursively delete 60 from the right subtree (which is now the leaf case under 70). Final tree:
           60
          /  \
         40   70
               \
               80
    

In the visualization, the active walk path lights up as the cursor descends. Inserts highlight the "would-be" ghost slot before the attach. Deletes pulse the doomed node, then either prune (leaf), splice (one child), or copy-and-splice (two children) with explicit "find successor" walk into the right subtree.

Complexity

CaseTime
Insert, search, delete — balancedO(log n)
Insert, search, delete — worst case (degenerate)O(n)

Space: O(n) for the nodes themselves, plus O(h) recursion stack for the recursive delete (where h is tree height — log n balanced, n degenerate).

The catch: a vanilla BST does not self-balance. If you insert values in already-sorted order — 1, 2, 3, 4, 5 — every node attaches to the right of the previous one, producing a chain. The tree degenerates into a linked list and every operation collapses to O(n).

This is what motivates self-balancing BSTs: AVL trees, red-black trees, treaps, splay trees, B-trees. They all add bookkeeping on top of the basic BST operations to guarantee O(log n) even adversarial inputs. Most production "tree map" / "tree set" libraries are red-black trees under the hood (std::map in C++, TreeMap in Java).

Edge Cases

  • Empty treesearch returns None immediately. insert makes the new node the root. delete is a no-op.
  • Single-node tree, delete root — leaf case. Root becomes None.
  • Insert duplicate — the displayed code silently skips. Other conventions: count occurrences in the node, or store duplicates in the right subtree (treat >= as right). Pick one and stick with it — mixed conventions break search.
  • Delete the root — works the same as any other node; the recursive form just returns the rewired subtree as the new root.
  • Delete a value not in the tree — search walks off the end, recursive _delete returns None up the chain, the parent's pointer is rewritten to its current value (a no-op). Idempotent.

Common Mistakes

  • Forgetting the two-children case. Inserts are obvious; searches are obvious; the two-children delete is the one that catches everyone. You can't just "delete" the node — its children would be orphaned. You must promote a successor (or predecessor) value into its slot.
  • Using a left-subtree value as the successor. The in-order successor lives in the right subtree (smallest of values larger than the deleted node). Picking from the left subtree violates the BST invariant.
  • Picking a wrong-side leaf as the successor. The successor must be the leftmost of the right subtree, not any leaf. Picking an arbitrary leaf can put a value smaller than the deleted node's left subtree into the deleted node's slot.
  • Inserting sorted data and wondering why it's slow. Sorted input produces a degenerate tree. If your insert order isn't randomized, you need a self-balancing variant.
  • Confusing BST with heap. Both are binary trees, but their invariants are different. BSTs are ordered left-to-right; heaps are ordered top-to-bottom. A BST supports ordered traversal and range queries; a heap supports find-min / find-max in O(1) but not ordered iteration.