Binary Search

Intuition

Binary search finds a target in a sorted array by repeatedly cutting the search region in half. Look at the middle element. If it matches, done. If the target is larger, the answer must be in the right half — the left half can be discarded. If it's smaller, discard the right half. Each step throws away half of what remains, so even an array of a million elements is answered in twenty comparisons.

This only works because the array is sorted. The ordering is what lets "the middle element is X" tell you something about the half you didn't look at.

How It Works

Maintain a window [lo, hi] over the array — the slice still capable of containing the target. Initially lo = 0, hi = n − 1 (inclusive on both sides).

Each iteration:

  1. Compute the midpoint mid = (lo + hi) // 2.
  2. Compare nums[mid] against target.
    • Equal → return mid. Found.
    • nums[mid] < target → the target, if it exists, sits to the right. Set lo = mid + 1. The +1 is crucial: mid itself has been ruled out.
    • nums[mid] > target → set hi = mid - 1.
  3. Stop when lo > hi. The window has shrunk to nothing without finding the target → return −1.

The key invariant: at the top of every iteration, if the target is in the array, it lives in nums[lo..hi]. The two mid ± 1 updates preserve that invariant precisely because mid itself has just been inspected.

Step By Step

For the default input nums = [2, 5, 8, 12, 16, 23, 38, 56, 72, 91], target = 23:

  1. lo = 0, hi = 9. mid = 4. nums[4] = 16. 16 < 23 → go right. lo = 5.
  2. lo = 5, hi = 9. mid = 7. nums[7] = 56. 56 > 23 → go left. hi = 6.
  3. lo = 5, hi = 6. mid = 5. nums[5] = 23. Match → return 5.

In the visualization, watch the window's left and right walls collapse inward. The mid pointer drops onto the centre of the active window. Cells outside [lo, hi] fade out — they've been proven not to contain the target. The matched cell turns green.

For a target that doesn't exist, the walls eventually cross (lo > hi), the window becomes empty, and the algorithm returns −1.

Complexity

CaseTime
Best (target at first probed mid)O(1)
AverageO(log n)
Worst (target absent, or at extreme)O(log n)

Space: O(1) — the iterative form uses two integer pointers and needs no recursion stack.

For n = 1,000,000, that's at most about 20 iterations. For n = 1,000,000,000, about 30. This logarithmic scaling is what makes binary search the workhorse it is — it's the difference between a microsecond and a minute.

Edge Cases

  • Empty arrayhi = −1, so lo <= hi is false on entry. Loop body never runs, return −1.
  • Single elementlo = hi = 0, one comparison decides.
  • Target equals first element — first mid lands in the middle. Algorithm shrinks the window leftward until lo = hi = 0, then matches.
  • Target equals last element — symmetric: shrinks rightward.
  • Target between two adjacent elements (not present) — the window collapses to one cell, that cell isn't the target, walls cross, return −1.
  • Duplicates — vanilla binary search returns the index of some matching element, not the first or last. If you need lower/upper bound, that's a different variant (bisect_left / bisect_right) with a slightly different loop condition and return.

Common Mistakes

  • Forgetting the +1 / -1 when moving the window. Writing lo = mid instead of lo = mid + 1 can cause an infinite loop when lo and hi are adjacent — mid rounds down to lo, the comparison sends control back into the same half, nothing changes.
  • lo < hi instead of lo <= hi. The inclusive <= is needed because lo and hi are inclusive bounds. Using < exits one iteration too early and misses targets in single-cell windows.
  • (lo + hi) // 2 overflow in fixed-width languages. In Java / C++, lo + hi can overflow a 32-bit int when both are large. The defensive form is lo + (hi - lo) // 2. Python integers are arbitrary precision, so the displayed code is safe.
  • Running binary search on unsorted data. The algorithm will happily return an index, but it'll be meaningless. Sortedness is a precondition, not a hint.
  • Using it when you'd be better off with a hash set. Binary search is O(log n); a hash lookup is O(1) average. If you can afford the O(n) space for a set and you only need membership, prefer the set. Binary search wins when you need positional info (insertion point, neighbor lookup) that a hash can't give.