algorithms · level 2

Binary Search

The halving walk — find 1 in a million with 20 probes.

150 XP

Binary Search

Sorted data hides a superpower. If your array is in order, you never have to scan it — you can teleport to the middle, check if your target is smaller or larger, and throw away half the array in one step. Keep halving and you've reduced a million-element search to twenty comparisons.

Analogy

Imagine flipping open a phonebook looking for "Parsons". You don't start at page 1 and read every name. You drop your thumb in the middle — say you land on "Mitchell". Parsons comes after Mitchell alphabetically, so the whole first half of the book is useless. Drop your thumb in the middle of what's left — "Roberts". Parsons comes before Roberts, so the back half is useless. Two probes in, you've already cut the book to a quarter. Three more and you're on the right page.

That's binary search. The only prerequisite is that the data is already sorted.

The halving argument

On each step, binary search cuts the candidate window in half. So after k steps, the window has size n / 2^k. The search terminates when the window shrinks to 1 (or to 0 for a miss). Solving n / 2^k = 1 for k gives k = log₂(n) — hence O(log n).

A log-factor doesn't sound dramatic until you plug in numbers:

`n`	Linear scan (worst)	Binary search (worst)
100	100	7
10,000	10,000	14
1,000,000	1,000,000	20
1,000,000,000	one billion	30

Every time you multiply n by 1000, binary search only needs about ten more probes. That's the entire value proposition of "sorted + binary-searchable" as a data-structure shape — it's why databases build B-tree indexes, why std::map is a tree, why sorted arrays outperform hash tables for range queries.

Tracing binary search by hand

Take the sorted array [1, 3, 5, 7, 9, 11, 13, 15, 17, 19, 21, 23, 25, 27, 29, 31] (n = 16) and search for 19:

Step	`lo`	`hi`	`mid`	`arr[mid]`	Decision	Next window
1	0	15	7	15	target > 15 → go right	`[8, 15]`
2	8	15	11	23	target < 23 → go left	`[8, 10]`
3	8	10	9	19	match!	return 9

Three probes to find one element in 16. Log₂(16) = 4, so we had room for one more. The pattern holds: no matter where 19 lives in a size-16 array, at most 4 probes find it.

The canonical loop (and its traps)

let lo = 0;
let hi = arr.length - 1;
while (lo <= hi) {
  const mid = lo + ((hi - lo) >> 1);
  if (arr[mid] === target) return mid;
  if (arr[mid] < target) lo = mid + 1;
  else hi = mid - 1;
}
return -1; // not found

Three things in that snippet that look trivial but are actually the entire lesson:

lo <= hi vs lo < hi. The inclusive form is the one you want when hi = length - 1; the strict form is the one you want when hi = length (a half-open convention some languages prefer). Mixing the two is the classic off-by-one that produces either an infinite loop or a missed last element.

lo + ((hi - lo) >> 1) instead of (lo + hi) / 2. The naive form can overflow when lo and hi are both large 32-bit ints. In 2006, Joshua Bloch pointed out this bug lived for nine years in the official Java Arrays.binarySearch. Write the overflow-safe form by reflex; it's no harder to read once you know what it's for.

lo = mid + 1 / hi = mid - 1, not lo = mid / hi = mid. When you've already compared arr[mid] and it wasn't the target, mid can't be the answer — so exclude it from the next window. Forgetting the ± 1 is the other classic off-by-one: the window stops shrinking and the loop runs forever.

Variants you'll hit in practice

"Does the value exist" is the textbook version. The versions you actually use in production are these:

Variant	Returns	Use case
Exact match	index of target, or −1	Dictionary lookup, "is this value present"
Lower bound	first index where `arr[i] ≥ target`	Insert point; "first score that qualifies"
Upper bound	first index where `arr[i] > target`	Insert-after point; half-open range end
Rotated array	index after an unknown pivot rotation	Classic interview question, real in log-shifted sorted buffers
First-bad-version	first `i` where `pred(i) === true`	Git bisect, version rollouts, release probes
Answer search	smallest parameter satisfying a predicate	"What's the smallest capacity that fits?" / "What's the minimum speed?" — the predicate is monotone

The last two are the most general: binary-search-over-a-predicate. You're not searching an array at all; you're searching a function that returns true after some point and false before. git bisect is literally this — bisecting commit history over "does this commit have the bug".

What standard libraries ship

Most languages don't ship naive binary search either — they ship variants that return insertion points, because those are strictly more useful:

Language / lib	API	What it returns
Python `bisect`	`bisect_left`, `bisect_right`	Lower/upper bound insertion point
C++ `<algorithm>`	`std::lower_bound`, `std::upper_bound`, `std::binary_search`	Insertion iterator; `binary_search` returns `bool` only
Java	`Arrays.binarySearch(arr, key)`	Negative value encodes insertion point when miss (`-(ip) - 1`)
Go	`sort.Search(n, pred)`	Smallest `i` where `pred(i)` is true (predicate form)
Rust	`slice::binary_search`	`Result<usize, usize>` — index on hit, insertion point on miss
JavaScript	— (no stdlib)	Write your own; Lodash has `sortedIndex`

JavaScript's omission is notable — the stdlib has sort but no binary search. That's why you'll see hand-rolled binary searches throughout production JS codebases.

When binary search breaks

Binary search has a single assumption: the array is sorted according to the comparison function you're using. When that breaks, everything breaks:

Unsorted input. Binary search on unsorted data returns nonsense — no errors, no warnings, just a wrong answer. Always confirm your data is actually sorted, especially when it's almost sorted (a single out-of-place element is enough).
Duplicates + "any match is fine" vs "leftmost match" vs "rightmost match". The simple loop above returns some index with the matching value, but not necessarily the first or last occurrence. If you need a specific duplicate, you need lower_bound / upper_bound.
Approximate / "closest" matches. The basic loop only finds exact hits. To find "the closest value ≤ target" you need to track the best candidate as the window shrinks; it's still O(log n), but the code is no longer a one-liner.
Non-comparable keys. Binary search needs a total order. Objects with multiple fields need an explicit comparator that imposes one; dicts/objects with unordered keys aren't searchable at all.

Common gotchas

Integer overflow in mid. If you're ever working in a language with fixed-width ints (Go, Rust, C, C++, Java int), use lo + (hi - lo) / 2. JS numbers don't have this problem for arrays but you should write it the safe way out of habit.
Search on a wrong comparator. Sorting by (a, b) => a - b and then searching with === works. Sorting by any non-numeric comparator and searching with === may or may not work depending on whether === agrees with the total order. Use the same comparator for both phases.
Binary searching a Set or Map. Hash-backed collections have no order. Binary search needs an array (or tree) view. Don't be the person who calls binarySearch(new Set([...])) and wonders why it's broken.
The mid index in an empty window. When hi = lo - 1 (empty window), the loop condition lo <= hi is false, so no probe fires. That's the correct behavior — miss returns -1 — but it's worth checking once by hand when implementing a variant.

Playground

Pick a target with the slider or the buttons, press Play, and watch the window close in. The [lo, hi] range is shaded; the current mid is painted in var(--color-accent). Every probe cuts the shaded region in half. On a hit, the array flashes var(--color-success); on a miss, var(--color-warning). The step counter at the bottom is the one the challenge asks about.

Visualizer

The Halving Window panel plots "how many probes to find one element in n" for both linear scan (n) and binary search (log₂ n). Toggle the scale to crank n up to one million: the linear curve runs off the top of the chart; the binary curve is still a modest slope at 20. That picture — the exponential gap between linear and log — is the intuition to carry out of this level.