Debugging Mindset

Print, debugger, bisect, and the discipline of the smallest failing test case.

Debugging is the activity engineers spend the most time on and learn the least about. Most of us pick up a few tricks in the first two years and then stop improving. This lesson is the missing chapter.

Analogy

Debugging is detective work. You arrive after the crime — the program has already crashed, the data is already corrupted, the user has already screamed. Your job is to reconstruct what happened from the evidence. The stack trace is the body. The logs are the witness statements. The git history is the alibi sheet. Bad detectives jump to the first suspect; good ones rule out half the city before naming anyone.

The 4 techniques every engineer needs

| Technique | When it shines | When it fails |
| --- | --- | --- |
| Print debugging | Bug only reproduces in CI/prod, debuggers can't attach | Hot loops where prints flood the log |
| Interactive debugger | Local repro available, you want to step through state | Heisenbugs that vanish under the debugger |
| Binary search (git bisect) | The bug is a regression: it used to work | Code that has always been broken |
| Smallest failing test case | Always: this is the meta-technique | When the bug is fundamentally non-local (e.g. a race condition) |

You'll use all four in your career. Most engineers default to one and never reach for the others.

Print debugging

Print is not stupid. It's the universal debugger — works in any language, any environment, any deployment. The trick is to make prints structured so you can actually read them back.

Bad:

print("here")
print("here2")
print(x)

Good:

print(f"[order={order.id}] entering validate, items={len(order.items)} total={order.total}")
print(f"[order={order.id}] stage=validated")

The grep test: would you be able to find the right output line in a 10,000-line log? If not, the print is too vague.

A real logger (pino for Node.js, Python's logging, zap or slog for Go) gives you that structure for free, plus log levels you can turn off in prod without removing the lines. Cultivate the habit of log.debug({ ... }) instead of console.log.
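As a minimal sketch of that habit, here is the same order-validation logging done with Python's standard logging module. The logger name "orders", the helper names, and the key=value message layout are assumptions made for illustration, not something the lesson prescribes.

```python
import io
import logging


def make_logger(stream, level):
    """Build a logger that writes 'LEVEL name message' lines to a stream."""
    logger = logging.getLogger("orders")  # hypothetical logger name
    logger.setLevel(level)
    logger.handlers.clear()   # avoid duplicate handlers when reconfigured
    logger.propagate = False  # keep output out of the root logger
    handler = logging.StreamHandler(stream)
    handler.setFormatter(logging.Formatter("%(levelname)s %(name)s %(message)s"))
    logger.addHandler(handler)
    return logger


def validate(logger, order_id, items, total):
    # Every line carries the order id, so one grep finds the whole story.
    logger.debug("order=%s stage=validate items=%d total=%.2f",
                 order_id, len(items), total)
    if total < 0:
        logger.error("order=%s stage=validate error=negative_total", order_id)
        return False
    logger.info("order=%s stage=validated", order_id)
    return True


# Development: DEBUG level shows everything.
dev_out = io.StringIO()
validate(make_logger(dev_out, logging.DEBUG), "A17", ["pen", "ink"], 12.5)

# Production: INFO level drops the debug line without touching the code.
prod_out = io.StringIO()
validate(make_logger(prod_out, logging.INFO), "A17", ["pen", "ink"], 12.5)

print(dev_out.getvalue(), end="")
print(prod_out.getvalue(), end="")
```

The debug call stays in the source in both runs; only the configured level decides whether it is emitted.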

Interactive debuggers

Every modern language has one:

| Language | Debugger | Modern UI |
| --- | --- | --- |
| JavaScript / TS | Chrome DevTools | VS Code attach |
| Python | pdb, pudb | VS Code, PyCharm |
| Go | dlv | VS Code, GoLand |
| Rust | gdb, lldb, rust-gdb | VS Code (rust-analyzer plus CodeLLDB) |
| Java | jdb | IntelliJ |

The bar to clear: set a breakpoint, step in, step over, inspect a variable, evaluate an expression. Those five operations cover most debugger use. Spend a Saturday afternoon learning the keybindings for them in your daily-driver language. You will earn that time back ten times over.
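For Python, the five operations can be practised with the built-in pdb. The sketch below is only an illustration: the function and the PAUSE_FOR_DEBUGGER flag are hypothetical, and the flag exists so the file runs normally until you flip it.

```python
# Flip to True locally to pause inside line_total; keep it False in commits.
PAUSE_FOR_DEBUGGER = False  # hypothetical flag, for illustration only


def line_total(unit_price, quantity, discount_rate):
    subtotal = unit_price * quantity
    if PAUSE_FOR_DEBUGGER:
        breakpoint()  # built-in since Python 3.7; opens pdb at this line
    return round(subtotal * (1 - discount_rate), 2)


# At the (Pdb) prompt, the five operations are:
#   b line_total        set a breakpoint on a function
#   s                   step into the next call
#   n                   step over the current line
#   p subtotal          inspect a variable
#   p subtotal * 0.75   evaluate an expression
#   c                   continue running
print(line_total(20.0, 3, 0.25))
```

You can also start a whole script under the debugger with python -m pdb followed by the file name, which needs no code changes at all.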

Binary search — git bisect

When you have a regression — "it used to work, now it doesn't" — and the history between known-good and known-bad has more than ~10 commits, git bisect is dramatically faster than reading every commit:

git bisect start
git bisect bad                    # the current commit is broken
git bisect good v1.4.0            # this tag was working
# git checks out a midpoint commit
./scripts/repro.sh
# returned non-zero → the bug is here or earlier
git bisect bad
# git checks out another midpoint
# repeat

After roughly log₂(N) probes, git tells you the first bad commit, the one that introduced the regression. For a thousand commits, that's about ten probes. With a deterministic repro script, you can let it run unattended:

git bisect run ./scripts/repro.sh

Exit codes the script must use: 0 = good, non-zero (1–124, 126–127) = bad, 125 = skip (e.g. compile failure on this commit). Any other exit code aborts the bisect.
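A repro script does not have to be shell. Here is a minimal sketch in Python that maps a build step and a test step onto those exit codes. The build and test commands, the file name repro.py, and the --bisect argument are all placeholders chosen for this example; substitute your project's own.

```python
import subprocess
import sys

GOOD, BAD, SKIP = 0, 1, 125  # the exit codes git bisect run interprets

# Hypothetical commands: replace with your project's real build and test.
BUILD_CMD = ["make", "build"]
TEST_CMD = [sys.executable, "-m", "pytest", "tests/test_checkout.py", "-q"]


def verdict(build_returncode, test_returncode):
    """Map build and test results onto a bisect exit code."""
    if build_returncode != 0:
        return SKIP  # a commit that does not build cannot be judged
    # Collapse every test failure to 1 so a crash or signal (codes outside
    # 1-127) cannot abort the bisect by accident.
    return GOOD if test_returncode == 0 else BAD


def main():
    build = subprocess.run(BUILD_CMD)
    if build.returncode != 0:
        return verdict(build.returncode, None)
    test = subprocess.run(TEST_CMD)
    return verdict(0, test.returncode)


# Only act when asked to, so importing or running the file is harmless.
if __name__ == "__main__" and sys.argv[1:] == ["--bisect"]:
    sys.exit(main())
```

Saved as repro.py, it would be driven with: git bisect run python repro.py --bisect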

Reading a stack trace

A stack trace is a snapshot of the call stack at the moment the program crashed. Read it in this order:

  1. The exception type and message. Java and JavaScript print it at the top of the trace; Python prints it at the bottom. This tells you what went wrong (NullPointerException, KeyError: 'user', SyntaxError).
  2. The innermost frame, the most recent call, which sits next to the message. This tells you where the program tried to do the impossible.
  3. The innermost frame in your code (skipping framework / library frames). This tells you what your code asked for that led to the impossible thing.

Common pitfalls:

  • Frame order differs by language. Python prints the most recent call last, so you read upward from the bottom; Java and JavaScript print it first. Check yours.
  • The error message is sometimes the answer. "Cannot read property 'name' of undefined" means you have a thing that's undefined, and you tried to read .name from it. The bug is upstream: what produced the undefined?
  • Async stack traces are often misleading. The frame where the promise was created may be more useful than where it rejected. Node has had async stack traces on by default since version 12 (the V8 flag is --async-stack-traces); check for the equivalent in your runtime.
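The reading order above can be tried on a live traceback. The sketch below uses Python's standard traceback module on a made-up three-function call chain; the function names are invented for the example.

```python
import traceback


def load_user(record):
    return record["user"]  # raises KeyError when the key is missing


def greet(record):
    return "hello " + load_user(record)


def handle_request(record):
    return greet(record)


try:
    handle_request({"id": 7})
except KeyError as exc:
    frames = traceback.extract_tb(exc.__traceback__)
    error_type = type(exc).__name__          # step 1: what went wrong
    error_message = str(exc)
    innermost = frames[-1]                   # step 2: most recent call is last
    call_chain = [frame.name for frame in frames]

    print(f"what:  {error_type}: {error_message}")
    print(f"where: {innermost.name}() at line {innermost.lineno}")
    print("chain: " + " -> ".join(call_chain))
```

In a real service the chain would include framework frames; filtering on each frame's filename attribute is one way to find the innermost frame that belongs to your code.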

The smallest failing test case

This is the technique that separates seniors from everyone else. When you have a bug, the first move is not to start fixing — it's to reduce the conditions that cause it to the smallest possible repro:

  1. Start with the failing scenario.
  2. Remove inputs, code paths, dependencies one at a time. Does the bug still happen?
  3. Keep going until removing any more makes the bug vanish.

You end up with a 5-line file that reproduces the bug deterministically. Now:

  • Reading the relevant code is trivially scoped.
  • Attaching a debugger is trivially scoped.
  • Filing a bug upstream takes 30 seconds.
  • Asking for help is much more likely to get a useful answer.

Small repros also become regression tests almost for free.
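The reduction loop can be partly automated. What follows is a simplified greedy sketch (a stripped-down cousin of delta debugging, not a full implementation): it drops one input element at a time and keeps the smaller input whenever the failure still reproduces. The order data and the buggy total_quantity function are invented for the example.

```python
def shrink(failing_input, still_fails):
    """Greedily drop elements while the failure keeps reproducing."""
    current = list(failing_input)
    changed = True
    while changed:
        changed = False
        for i in range(len(current)):
            candidate = current[:i] + current[i + 1:]
            if still_fails(candidate):
                current = candidate  # smaller input, same bug: keep it
                changed = True
                break
    return current


# Hypothetical buggy code under investigation.
def total_quantity(items):
    return sum(item["qty"] for item in items)  # breaks when qty is None


def crashes(items):
    try:
        total_quantity(items)
    except TypeError:
        return True
    return False


order = [
    {"sku": "A", "qty": 2},
    {"sku": "B", "qty": None},
    {"sku": "C", "qty": 5},
    {"sku": "D", "qty": 1},
]

minimal = shrink(order, crashes)
print(minimal)
```

The shrunken input is exactly what you would paste into a regression test once the fix is in.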

The 5 whys

Once you've found the bug, ask "why?" five times in a row. Each "why" should drive you past the surface cause:

  1. Why did the order fail to charge? — The payment intent had a null customer field.
  2. Why was customer null? — The customer-create call returned 200 with an empty body.
  3. Why did we treat that as success? The HTTP wrapper returns the parsed body, and it maps an empty body to null.
  4. Why doesn't the wrapper distinguish "empty body" from "explicit null"? JSON.parse throws on an empty string, and the wrapper swallows that error and returns null, the same value an explicit null produces.
  5. Why did the API return an empty body for a successful create? — The API panicked on a downstream service timeout but the gateway converted the panic to 200.

Now you have five fixes (validate non-null, distinguish empty bodies, fix the gateway, add a test, update the runbook) instead of one ("add a null check").

The five whys technique comes from the Toyota Production System. It's most powerful at incident postmortems but works anywhere.
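Two of those fixes, "distinguish empty bodies" and "validate non-null", might look like the following in Python. This is a sketch under assumptions: the function names, the EmptyBodyError class, and the shape of the customer response (an object with an "id" field) are all hypothetical.

```python
import json


class EmptyBodyError(ValueError):
    """Raised when a response that should carry JSON has no body at all."""


def parse_json_body(raw):
    """Parse a response body, keeping 'no body' distinct from a JSON null."""
    if raw is None or raw.strip() == "":
        raise EmptyBodyError("response body is empty")
    return json.loads(raw)  # the text "null" still parses to None


def customer_id_from(raw):
    """Validate the parsed body instead of passing None downstream."""
    body = parse_json_body(raw)
    if not isinstance(body, dict) or "id" not in body:
        raise ValueError("customer-create response has no customer id")
    return body["id"]


print(customer_id_from('{"id": "cus_1"}'))
```

With this in place, the empty-body 200 from the fifth "why" fails loudly at the wrapper instead of surfacing later as a failed charge.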

Heisenbugs

Some bugs vanish when you observe them — the debugger changes timing enough to mask the race. For those:

  • Use print debugging instead of breakpoints. A print usually perturbs timing far less than a paused thread, though it can still shift a tight race.
  • Use a record-and-replay tool like rr (Linux) or Time Travel Debugging (Windows). It records the execution; you replay deterministically afterward.
  • Increase contention deliberately. Sleeps, more threads, and slower machines often surface the race.
  • Read the code paths in question carefully. Concurrency bugs almost always have a logical explanation; the rarity is in the timing, not the cause.
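To make "increase contention deliberately" concrete, here is a small sketch with a classic lost-update race. The counter, thread counts, and sleep length are arbitrary choices for the example; the sleep between the read and the write widens the race window so that lost updates become very likely instead of rare.

```python
import threading
import time


def run_counter(num_threads, increments, use_lock):
    """Increment a shared counter from several threads; return the result."""
    counter = {"value": 0}
    lock = threading.Lock()

    def read_sleep_write():
        current = counter["value"]
        time.sleep(0.0005)  # deliberately widen the read-to-write window
        counter["value"] = current + 1

    def worker():
        for _ in range(increments):
            if use_lock:
                with lock:  # the fix: make read and write one atomic step
                    read_sleep_write()
            else:
                read_sleep_write()

    threads = [threading.Thread(target=worker) for _ in range(num_threads)]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return counter["value"]


expected = 4 * 25
racy = run_counter(4, 25, use_lock=False)
locked = run_counter(4, 25, use_lock=True)
print(f"expected={expected} racy={racy} locked={locked}")
```

The racy total varies from run to run, which is the point: once the race is easy to trigger, you can read the code path with a concrete failure in hand and confirm that the lock removes it.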

The debugging hierarchy

Roughly the order of techniques to reach for:

  1. Read the error message. Sometimes the answer is right there.
  2. Make a smallest failing repro. This unblocks everything else.
  3. Print or step through. The repro should be small enough that either is fast.
  4. Bisect if it's a regression.
  5. Re-read the code with the bug in mind. New eyes often spot it instantly.
  6. Ask for help — but bring the smallest repro. "Here's a 5-line file that prints the wrong thing" is much better than "the deploy pipeline broke."

The pattern: invest in fast feedback (small repro) before any other technique.

What to internalise

  • Debugging is investigative, not heroic. Patience beats cleverness.
  • The smallest failing test case is the meta-technique. Reduce, then attack.
  • Print is fine. So is the debugger. So is bisect. Use them all.
  • "I have no idea what's happening" is often a sign you skipped step 1 — read the error message again, slowly.
  • Every bug you understand deeply earns you a regression test.

Tools in the wild

  • git bisect (CLI, free tier): Binary search across commits to find the one that introduced a regression.
  • rr (CLI, free tier): Record-and-replay debugger for Linux. Determinism for non-deterministic bugs.
  • Sentry (service, free tier): Error tracking with stack traces, breadcrumbs, and release tagging.
  • Built-in debugger for any web frontend (service): breakpoints, network, performance.