Design a URL Shortener
Prompt
A growth team at a marketing company wants to stop pasting raw campaign links into emails and billboards. They've asked you to build them a link service: "you give it a URL, it gives you a short link, and the short link sends people to the right place." That's the whole brief they handed you. Build it.
How this round runs
I've handed you a vague one-liner on purpose — there's a lot it doesn't say. You drive: pin down what this thing actually has to do and at what scale before you draw a single box, then design it, and I'll push you to go deep on one piece and to commit to a trade-off and own its cost.
Model answer
1. Requirements I'd surface first. The brief hides almost everything that shapes the design, so I'd ask:
- Read:write ratio? For a campaign-link product this is wildly read-heavy — say 10000:1. That single number tells me the redirect path is the hot path and writes are rare; the whole design bends toward fast, cacheable reads.
- Vanity / custom aliases? "Give it a URL, get a short link" is silent on
whether the growth team wants
go.co/spring-sale. If yes, the key space is no longer something I control — I need a uniqueness check on a user-supplied key. - Analytics? Marketing almost certainly wants click counts. That's a separate, write-heavy concern I'd keep off the redirect hot path.
- Custom expiry? Campaign links may need to die after the campaign. That adds a TTL/expiry field and a sweep.
- Scale + retention: how many links/day, kept for how long? Suppose 100M links/month, kept 5 years → ~6B rows. That sizes the key length and the store.
Non-functional: redirect latency p99 under ~50ms, redirects should survive a cache or DB hiccup, and a short code must never collide onto two different URLs.
2. High-level design (and why each piece).
client --> CDN/edge --> stateless API servers --> cache (Redis) --> durable KV store
| ^
key generator -------- write path -------------/
|
analytics queue --> async click counter (off hot path)
- Stateless API tier so I can scale redirects horizontally behind a load balancer — no per-node state to lose.
- Cache in front of the store because reads dominate 10000:1; the working set of "hot" links is tiny and lives happily in memory.
- Durable KV store (the code→URL map is a pure key lookup; I don't need joins), indexed on the short code.
- Key generator as its own concern — see the deep-dive.
- Analytics as an async queue, deliberately decoupled so a click never blocks or fails a redirect.
3. Deep-dive: key generation + storage. This is the hard part, so I'd go deep here. Two honest options:
- Hash the long URL (e.g. base62 of a truncated hash). Idempotent — the same URL maps to the same code, which dedups for free. Cost: collisions are real once the table is large, so every write needs a "does this code already point at a different URL?" check and a rehash-on-collision loop.
- Counter + base62 encode. A monotonic ID source (or a pre-allocated range per node so nodes don't fight over a global counter) encoded to base62. No collisions by construction. Cost: codes are sequential and guessable, which leaks volume and lets people enumerate links — a privacy problem for a marketing product.
For 6B links, base62 needs ceil(log62(6e9)) ≈ 6 characters (62^6 ≈ 56B), so a 7-char code gives comfortable headroom. Storage: ~6B rows × ~500 bytes ≈ 3TB before replication — beyond one node, so the store has to shard, by the short code itself (it's the natural partition key and the only lookup key on the hot path).
4. A committed trade-off and its cost. I'd commit to the counter +
per-node range, with the code generated by hashing the counter through a keyed
permutation so codes are non-sequential without collision-checking. The cost I'm
accepting, and I'll say it out loud: I give up the free deduplication that pure
URL-hashing gave me — the same long URL submitted twice produces two different
short codes and two rows. For a campaign-link tool that's fine (teams often want
distinct links per channel anyway), and I'd rather pay duplicate rows than pay a
collision-retry loop on every write. If dedup later becomes a hard requirement,
I'd add a secondary url_hash → code index and look it up before minting.
5. Operational concerns. Failure mode: the cache goes cold (restart or eviction storm) and every redirect falls through to the store, which buckles under 10000× its normal load. I'd detect it on a cache-hit-ratio alert and a store-QPS spike, and defend against it with request coalescing (collapse a thundering herd of misses for the same code into one store read) plus serving slightly-stale cached entries during a store brown-out. Rollback for a bad deploy: the API tier is stateless, so I roll back the image; the data is append-mostly so there's nothing to migrate back.
- Surfaced read:write ratio, vanity URLs, analytics, and expiry as requirements before drawing anything
- Justified each component (cache because reads dominate, async queue to keep clicks off the redirect path)
- Went deep on key generation, comparing hash vs counter with the actual cost of each
- Committed to a specific approach and named what it gives up (free dedup)
- Raised a cold-cache failure mode with detection and a defense, unprompted
- 'What breaks at 10x links/sec on writes?' → the global counter contends; pre-allocate ID ranges per node so writes don't coordinate
- 'Why a KV store and not Postgres?' → it's a pure key lookup with no joins; commit to KV, but note Postgres is fine until you outgrow one node
- 'A whole region's cache dies — now what?' → coalesce misses, serve stale, and rate-limit the store so a cold cache degrades instead of cascading
- 'Custom vanity aliases?' → user-supplied key means a uniqueness check on write and a reserved-words list