← Bank
Code Review

Slow SQL Query

Code ReviewMid~10m
sqloptimizationindexing

Prompt

This query is in a PR for the dashboard's "recent active users and their posts" panel. It renders on every dashboard load — call it a few hundred requests a minute at peak — and the author says it "runs fine locally." Review it. Would you approve, and how much does the hot-path framing change your answer?

SELECT u.id, u.name, p.id, p.title, p.created_at
FROM users u
JOIN posts p ON u.id = p.user_id
WHERE u.created_at > '2024-01-01';

How this round runs

Rank what you find by severity, factor in that this runs hundreds of times a minute, and give an approve / approve-with-follow-up / block call. Say what you'd ask for before trusting it in production.

Model answer

Approve only with a required follow-up before it ships to prod — the query is correct, but its performance profile is unverified on a hot path.

1 (highest — full table scan on the filter). There is almost certainly no index on users.created_at, so the planner scans the entire users table to find rows after the cutoff. "Runs fine locally" is the tell: a dev DB has a few thousand rows where a seq scan is instant; production has millions where the same scan is O(n) per request. At a few hundred req/min that scan runs hundreds of times a minute, so this is a real latency-and-load finding, not a micro-opt.

2 (join key). If posts.user_id is unindexed, the join degrades too — every qualifying user triggers a scan or an expensive hash build on posts. Both indexes matter; the filter one matters most.

Root cause is missing indexes on the columns the query filters and joins on — not the query text itself, which is fine. The diagnosis is the part I'd insist on before approving: don't add indexes on a hunch, run EXPLAIN ANALYZE on prod-like data and confirm a Seq Scan on users. Then:

CREATE INDEX idx_users_created_at ON users (created_at);
CREATE INDEX idx_posts_user_id ON posts (user_id);

A covering index users (created_at, id) lets the filter and the join-key read come from the index alone, avoiding heap lookups — worth it given the call frequency. Two caveats I'd state, not hide: every index slows writes and costs storage, and the indexes should be created CONCURRENTLY (Postgres) so building them doesn't lock users on a live table.

So: approve conditional on an EXPLAIN ANALYZE showing the scan, the two indexes added concurrently, and a before/after latency number. Without the hot-path framing I'd be more relaxed — for a once-a-day report a brief scan is tolerable; at hundreds of req/min it isn't, and that's what flips this from a nice-to-have to a merge blocker on the follow-up.

Signals — what a strong answer shows
  • Identified the seq scan on the WHERE column as the top finding and ranked it above the join-key index
  • Used the call frequency to set severity — same query, different verdict at hundreds/min vs once/day
  • Insisted on EXPLAIN ANALYZE to confirm the plan before adding indexes, rather than indexing on a hunch
  • Named the write/storage cost and CONCURRENTLY locking caveat — didn't present indexes as free
  • Concrete verdict: approve-with-required-follow-up, with the follow-up spelled out
Follow-ups — where it goes next
  • Would you approve as-is? → Not on a hot path; conditional on EXPLAIN-confirmed indexes and a before/after number.
  • Why a composite index over two single-column ones? → users(created_at, id) covers both the filter and the join read, avoiding heap lookups.
  • What would change your mind on severity? → If this ran rarely, or on a small/bounded table, the scan is tolerable and I'd not block.