Binary Search
Intuition
Binary search finds a target in a sorted array by repeatedly cutting the search region in half. Look at the middle element. If it matches, done. If the target is larger, the answer must be in the right half — the left half can be discarded. If it's smaller, discard the right half. Each step throws away half of what remains, so even an array of a million elements is answered in twenty comparisons.
This only works because the array is sorted. The ordering is what lets
"the middle element is X" tell you something about the half you
didn't look at.
How It Works
Maintain a window [lo, hi] over the array — the slice still capable
of containing the target. Initially lo = 0, hi = n − 1 (inclusive
on both sides).
Each iteration:
- Compute the midpoint
mid = (lo + hi) // 2. - Compare
nums[mid]againsttarget.- Equal → return
mid. Found. nums[mid] < target→ the target, if it exists, sits to the right. Setlo = mid + 1. The+1is crucial:miditself has been ruled out.nums[mid] > target→ sethi = mid - 1.
- Equal → return
- Stop when
lo > hi. The window has shrunk to nothing without finding the target → return−1.
The key invariant: at the top of every iteration, if the target is
in the array, it lives in nums[lo..hi]. The two mid ± 1 updates
preserve that invariant precisely because mid itself has just been
inspected.
Step By Step
For the default input nums = [2, 5, 8, 12, 16, 23, 38, 56, 72, 91],
target = 23:
lo = 0,hi = 9.mid = 4.nums[4] = 16.16 < 23→ go right.lo = 5.lo = 5,hi = 9.mid = 7.nums[7] = 56.56 > 23→ go left.hi = 6.lo = 5,hi = 6.mid = 5.nums[5] = 23. Match → return5.
In the visualization, watch the window's left and right walls collapse
inward. The mid pointer drops onto the centre of the active window.
Cells outside [lo, hi] fade out — they've been proven not to contain
the target. The matched cell turns green.
For a target that doesn't exist, the walls eventually cross
(lo > hi), the window becomes empty, and the algorithm returns
−1.
Complexity
| Case | Time |
|---|---|
Best (target at first probed mid) | O(1) |
| Average | O(log n) |
| Worst (target absent, or at extreme) | O(log n) |
Space: O(1) — the iterative form uses two integer pointers and
needs no recursion stack.
For n = 1,000,000, that's at most about 20 iterations. For
n = 1,000,000,000, about 30. This logarithmic scaling is what makes
binary search the workhorse it is — it's the difference between a
microsecond and a minute.
Edge Cases
- Empty array —
hi = −1, solo <= hiis false on entry. Loop body never runs, return−1. - Single element —
lo = hi = 0, one comparison decides. - Target equals first element — first
midlands in the middle. Algorithm shrinks the window leftward untillo = hi = 0, then matches. - Target equals last element — symmetric: shrinks rightward.
- Target between two adjacent elements (not present) — the window
collapses to one cell, that cell isn't the target, walls cross,
return
−1. - Duplicates — vanilla binary search returns the index of some
matching element, not the first or last. If you need lower/upper
bound, that's a different variant (
bisect_left/bisect_right) with a slightly different loop condition and return.
Common Mistakes
- Forgetting the
+1/-1when moving the window. Writinglo = midinstead oflo = mid + 1can cause an infinite loop whenloandhiare adjacent —midrounds down tolo, the comparison sends control back into the same half, nothing changes. lo < hiinstead oflo <= hi. The inclusive<=is needed becauseloandhiare inclusive bounds. Using<exits one iteration too early and misses targets in single-cell windows.(lo + hi) // 2overflow in fixed-width languages. In Java / C++,lo + hican overflow a 32-bit int when both are large. The defensive form islo + (hi - lo) // 2. Python integers are arbitrary precision, so the displayed code is safe.- Running binary search on unsorted data. The algorithm will happily return an index, but it'll be meaningless. Sortedness is a precondition, not a hint.
- Using it when you'd be better off with a hash set. Binary search
is
O(log n); a hash lookup isO(1)average. If you can afford theO(n)space for a set and you only need membership, prefer the set. Binary search wins when you need positional info (insertion point, neighbor lookup) that a hash can't give.