Intermediateany~90 min

🚦 Rate Limiter

A library-shaped project: write a small rate-limiter in the language of your choice (Python / TS / Go / Rust). Implement three algorithms — fixed window, sliding window, token bucket — and write tests that show how they differ under burst traffic.

You're going to build a small rate-limiting library from scratch. Pick the language you code in most — Python, TypeScript, Go, Rust, or another. We show TypeScript + Vitest as the worked path, but the three algorithms are language-agnostic.

A rate limiter is a guard that says "yes, you can make that request" or "no, wait, you've already made too many." They live in APIs, authentication systems, and real-time services. By the end you'll have implemented three fundamentally different strategies and understand when to reach for each.

We're not racing. Each step is one idea. If a hint helps, take it — there's no penalty.

Before you start

The walkthrough

Create the project file and first test

In your terminal, inside your project folder, create a file for the rate-limiter code.

For TypeScript: limiter.ts
For Python: limiter.py
For Go: limiter.go
For Rust: edit src/lib.rs

Then create a test file:

For TypeScript: limiter.test.ts
For Python: test_limiter.py
For Go: limiter_test.go in the same folder
For Rust: add a test module at the bottom of src/lib.rs

In your test file, write one placeholder test that passes — something like expect(1 + 1).toBe(2) in TypeScript or assert 1 + 1 == 2 in Python. Run it to confirm the toolchain is wired up: npx vitest run (TS), pytest (Python), go test (Go), or cargo test (Rust).

After this step you should see…
1 passed

Define the Limiter interface

A rate limiter answers one question: "can I do this action now?" It needs to remember some state per key (e.g., per IP address or user ID) and check whether allowing the action would exceed a limit.

Define a function or type signature for the limiter's core method:

allow(key: string, now: number) -> boolean

It returns true if the action is allowed, false if it's been rate-limited.

Why now as a parameter? Because we'll write tests that fake time — passing in now lets us test "what happens 30 seconds later" without actually waiting.

After this step your file should look like… (limiter.ts)

Store window boundaries and request count

In a fixed-window rate limiter, we divide time into buckets. If the limit is 10 requests per 60 seconds, then seconds 0–59 is one bucket, 60–119 is the next, and so on.

When a request comes in at time now:

  1. Calculate which window it belongs to: windowNumber = floor(now / windowMs)
  2. Check the store: have we seen this key + window before? If not, it's a fresh window.
  3. If it's a fresh window, reset the counter. If it's the same window, increment.

Implement the logic to store and retrieve windowStart and count for each key. Don't implement the allow/deny decision yet — just make sure you can compute which window a request falls into and remember the count.

After this step you should see…
0

Implement the allow decision logic

Now add the decision: return true if the count is still under the limit, false if it would exceed it.

The simplest version: increment first, then check if you've gone over. If you have, return false; otherwise true.

After this step you should see…
true

Write tests for fixed-window — under limit, at limit, after window

Write three test cases:

  1. Under limit: Call allow three times with the same key within the same window. All should return true.
  2. At limit: With a limit of 3, the fourth call should return false.
  3. After window: Call allow for a key, then call it again at a time past the window boundary. The counter should reset and return true again.
After this step you should see…
3 passed

With a 60-second window and limit 5, after making 5 requests at second 0, what happens if you call allow() again at second 30?

Implement sliding window — keep a queue of request timestamps

Fixed window has a problem: if the limit is 10 per 60 seconds and someone makes 10 requests at second 59, then 10 more at second 60, they've made 20 in 2 seconds — well over the intended rate.

Sliding window fixes this: instead of resetting at fixed boundaries, remember the exact time of each request for the past windowMs milliseconds. When a new request arrives, throw away the old ones and count what's left in the window.

Implement a sliding-window limiter:

  1. For each key, keep a list of request timestamps.
  2. When allow is called with now, remove all timestamps older than now - windowMs.
  3. Check if the remaining count is under the limit. If so, add now to the list and return true.

Test it with the same three cases: under limit, at limit, and after window. Sliding window should pass all three.

After this step you should see…
3 passed

Implement token bucket — refill tokens at a rate

Token bucket is different from both: imagine a bucket with tokens in it. Each request consumes one token. Tokens refill at a steady rate (e.g., 10 per second). If the bucket fills, it overflows — extra tokens are wasted.

Implement token bucket:

  1. For each key, store how many tokens are in the bucket and when it was last refilled.
  2. When allow is called, calculate how many tokens have accrued since the last refill: time_since_refill * refill_rate.
  3. Add those to the bucket (capped at the bucket capacity).
  4. If there's at least 1 token, consume it and return true. Otherwise return false.

Use a refill rate of limit_per_window / window_ms tokens per millisecond. For example, a limit of 10 per 1000ms = 0.01 tokens per millisecond.

Test with the same three cases. For the "after window" case, call allow at a time far enough in the future that the bucket has refilled.

After this step you should see…
3 passed

Add a comparison table showing when each algorithm wins

You now have three algorithms. Add a comment or docstring in your code (or in a README) that documents when to use each:

Algorithm Pros Cons Best for
Fixed Window Simple, low memory, fast Allows burst at window edges Simple services, known predictable patterns
Sliding Window Perfectly fair, no burst gaps Higher memory (keeps all timestamps) Critical APIs, strict rate enforcement
Token Bucket Handles bursts, smooth rate Slightly complex, memory per key Bursty traffic, backpressure systems

Add a comment or docstring explaining this in your limiter file. You don't need to write tests for this — it's documentation, not code.

After this step you should see…
0

Test edge cases: boundary times, key cardinality, clock skew

Rate limiters live in the real world, where clocks are weird. Write three more test cases:

  1. Exact boundary: With a 1000ms window, make requests at exactly times 0 and 1000. The counter should reset between them (they're in different windows).
  2. High key cardinality: Create 1000 different keys and call allow on each once. All should return true (each has its own limit).
  3. Clock backward (gotcha): Call allow(key, 1000), then call it again with allow(key, 500) — a time in the past. Your limiter should handle this gracefully (either ignore, or reset, but don't crash or leak memory).
After this step you should see…
3 passed

Create a clean public API — factory function returning the interface

You have three limiter implementations. Create a unified public API so users can pick one without knowing the internals:

createLimiter(algorithm, limitPerWindow, windowMs) -> {allow(key, now): boolean}

The algorithm parameter is a string: "fixed-window", "sliding-window", or "token-bucket". The factory returns a single object with an allow method.

Test that all three algorithms are accessible via the factory.

After this step you should see…
3 passed

You did it

Run your full test suite one more time: npx vitest run, pytest, go test, or cargo test. You should see 12+ tests passing across fixed-window, sliding-window, token-bucket, and edge cases.

You built:

  • A data structure to track per-key rate-limit state
  • Three fundamentally different algorithms (fixed window, sliding window, token bucket)
  • An understanding of the tradeoffs between them
  • Tests that confirm behavior under normal and edge cases
  • A unified public API that hides implementation details

Every one of those is a pattern you'll see again in distributed systems, API design, and production services.

Stretch goals