Concurrency Models
Threads, event loops, async/await, actors, CSP — and when to pick each.
Concurrency is overlapping multiple tasks in time. Parallelism is actually running them simultaneously on separate cores. Most programs need concurrency; some also need parallelism. The model you pick determines cost per task, scheduling guarantees, and the kind of bugs you'll encounter.
Analogy
Think of a barista running a morning rush. Concurrency is one barista juggling seven drinks — grind, steam, pour, grind, steam — none of the drinks get their full attention but all of them progress. Parallelism is hiring a second barista with a second machine so two drinks can actually be pulling espresso at the exact same second. A single barista can be wildly concurrent without ever being parallel, and a two-person team can still deadlock if they both reach for the last jug of oat milk.
OS threads
The kernel schedules threads onto cores. With N cores and N CPU-bound threads you get real parallelism. Each thread owns a stack (commonly 1–8 MB reserved by default, committed lazily as it's used), and synchronisation goes through locks, mutexes, and condition variables.
#include <pthread.h>

pthread_t t;
pthread_create(&t, NULL, worker, arg);  /* start worker(arg) on a new kernel thread */
pthread_join(t, NULL);                  /* block until it returns */
Strengths: real parallelism, straightforward mental model. Weaknesses: threads are expensive (thousands is borderline, millions is impossible), context switches cost microseconds, and lock-heavy code is hard to reason about.
When to use: CPU-bound workloads with a small number of long-running tasks — video encoding, scientific compute, a database engine's worker pool.
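The same create/join pattern, sketched in Python's threading module with a lock guarding shared state (note that CPython's GIL keeps these threads from running Python bytecode in parallel, so this illustrates the synchronisation model rather than the speedup):

```python
import threading

counter = 0
lock = threading.Lock()

def worker(n):
    global counter
    for _ in range(n):
        with lock:           # serialise access to the shared counter
            counter += 1

threads = [threading.Thread(target=worker, args=(100_000,)) for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()                 # same shape as pthread_join

print(counter)  # 400000 — without the lock, updates can be lost
```

Remove the `with lock:` line and the final count becomes nondeterministic — the classic lock-heavy-code bug class mentioned above.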
Event loops
One thread. A big loop that asks the kernel (epoll on Linux, kqueue on BSD/macOS, IOCP on Windows) "which of my file descriptors is ready?" and runs the handler. Node.js, nginx, redis-server are all event-loop-based.
loop:
    events = epoll_wait(fds)
    for event in events:
        callback(event)
Each connection is just a socket + a callback. Ten thousand idle websockets cost nearly nothing. Strengths: scales to enormous numbers of idle connections; no locks because there's one thread. Weaknesses: any blocking call on that thread freezes the server; you can't use more than one core without offloading (worker threads, cluster).
When to use: I/O-bound services with huge fanout — chat servers, proxies, real-time dashboards, front-door gateways.
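A minimal runnable version of that loop, using Python's selectors module (which wraps epoll/kqueue) — an echo server where each connection is exactly a socket plus a registered callback. The bounded driver loop at the end is only there to push one request through; a real server would loop forever:

```python
import selectors
import socket

sel = selectors.DefaultSelector()          # epoll on Linux, kqueue on BSD/macOS

def accept(server):
    conn, _ = server.accept()
    conn.setblocking(False)
    sel.register(conn, selectors.EVENT_READ, echo)

def echo(conn):
    data = conn.recv(4096)
    if data:
        conn.sendall(data)                 # echo the bytes back
    else:
        sel.unregister(conn)
        conn.close()

server = socket.socket()
server.bind(("127.0.0.1", 0))
server.listen()
server.setblocking(False)
sel.register(server, selectors.EVENT_READ, accept)

# Drive one request end to end so the loop's shape is visible.
client = socket.create_connection(server.getsockname())
client.sendall(b"hello")

for _ in range(2):                         # two turns: one accepts, one echoes
    for key, _ in sel.select(timeout=1):   # "which of my fds is ready?"
        key.data(key.fileobj)              # run the callback registered for this fd

reply = client.recv(4096)
print(reply)
```

Note how the server never blocks on any one connection: it only ever touches sockets the kernel has reported ready.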
async / await
Syntactic sugar for "suspend this function until I/O is ready, resume on the same or another thread." Under the hood, the compiler rewrites the function into a state machine. Rust, Python, C#, JavaScript, Swift, Kotlin all have it.
async function handle(req: Request) {
  const user = await db.getUser(req.userId);
  const posts = await db.getPostsByUser(user.id);
  return { user, posts };
}
The code reads sequentially; the runtime interleaves many such calls on a small OS-thread pool. Strengths: ergonomics of synchronous code with the throughput of an event loop; usually multi-threaded underneath, so you get some parallelism. Weaknesses: function colouring (sync code can call an async function but can't await its result, so asyncness spreads up the call stack), and debugging is harder when a stack trace is lost across an await.
When to use: general-purpose I/O services, especially where you have a mix of fanout widths and want readable code.
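The interleaving is easiest to see with many handlers in flight at once. A Python asyncio sketch, with hypothetical database calls simulated by sleeps: each handler takes ~0.2 s of "I/O", yet 100 of them complete in ~0.2 s total on one thread, because every `await` hands the loop back to the runtime:

```python
import asyncio

# Hypothetical async "database" calls, simulated with sleeps.
async def get_user(user_id):
    await asyncio.sleep(0.1)
    return {"id": user_id, "name": "ada"}

async def get_posts(user_id):
    await asyncio.sleep(0.1)
    return [{"author": user_id, "title": "hello"}]

async def handle(user_id):
    user = await get_user(user_id)        # suspends; the loop runs other handlers
    posts = await get_posts(user["id"])
    return {"user": user, "posts": posts}

async def main():
    # 100 requests interleaved: ~0.2 s total, not 100 x 0.2 s.
    return await asyncio.gather(*(handle(i) for i in range(100)))

results = asyncio.run(main())
print(len(results))  # 100
```

Each `handle` still reads top to bottom, exactly like the TypeScript version above.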
Actor model
Each unit of work is an actor — a lightweight process with its own mailbox. Actors never share memory; they communicate by sending messages. Erlang, Elixir, and Akka on the JVM are the reference implementations.
Pid = spawn(fun() -> loop([]) end).
Pid ! {set, 1}.
Strengths: total isolation. An actor crashing takes nothing else down. Supervision trees let you design explicit failure semantics ("let it crash"). Weaknesses: message-passing overhead; shared read-mostly state is awkward.
When to use: systems with natural message boundaries, high availability requirements, and heavy shared mutable state that would otherwise need aggressive locking.
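The mechanics translate to any language: private state, a mailbox, one consumer draining it. A Python sketch (threads standing in for Erlang's far lighter processes, a Queue as the mailbox), where replies travel through a channel the caller owns rather than through shared memory:

```python
import queue
import threading

class Actor:
    """A minimal actor: private state, a mailbox, one thread draining it."""
    def __init__(self):
        self.mailbox = queue.Queue()
        self._state = []
        threading.Thread(target=self._loop, daemon=True).start()

    def _loop(self):
        while True:
            msg = self.mailbox.get()          # block until a message arrives
            if msg is None:                    # poison pill: shut down
                break
            tag, value = msg
            if tag == "set":
                self._state.append(value)      # only this thread touches state
            elif tag == "get":
                value.put(list(self._state))   # reply on the caller's queue

a = Actor()
a.mailbox.put(("set", 1))
a.mailbox.put(("set", 2))
reply = queue.Queue()
a.mailbox.put(("get", reply))
snapshot = reply.get()
print(snapshot)  # [1, 2]
```

Because the mailbox is FIFO and only one thread ever reads the state, no lock is needed — isolation replaces synchronisation.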
CSP — Go channels
Communicating Sequential Processes: goroutines (millions of them, ~2 KB stacks that grow as needed) pass values through typed channels. The runtime multiplexes them onto a small pool of OS threads.
ch := make(chan int)
go produce(ch)
for v := range ch {
    process(v)
}
Strengths: very lightweight; the select statement is a clean way to multiplex. Weaknesses: deadlocks are easy if you're sloppy about channel ownership; shared mutation still needs sync.Mutex.
When to use: anything where "thousands of cheap concurrent tasks" is the right abstraction — crawlers, pipelines, streaming aggregators.
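The channel-ownership discipline — the producer, and only the producer, signals "no more values" — can be sketched in Python with a bounded Queue standing in for a buffered channel and a sentinel standing in for `close(ch)` (Python queues have no close):

```python
import queue
import threading

def produce(ch, n):
    for i in range(n):
        ch.put(i)
    ch.put(None)                  # sentinel: plays the role of close(ch)

def consume():
    ch = queue.Queue(maxsize=8)   # bounded, like a buffered channel
    threading.Thread(target=produce, args=(ch, 5)).start()
    out = []
    while True:                   # the `for v := range ch` loop
        v = ch.get()
        if v is None:
            break
        out.append(v * v)
    return out

result = consume()
print(result)  # [0, 1, 4, 9, 16]
```

If the producer forgets the sentinel, the consumer blocks forever on `ch.get()` — the Python analogue of the Go deadlock you get when nobody closes the channel.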
Picking a model
A rough decision tree:
- Heavy shared mutable state? → actors.
- Huge fanout of I/O? → event loop or async/await.
- Few, long-running CPU tasks? → OS threads.
- Many small CPU tasks? → CSP goroutines (or thread pools if the language lacks them).
- Mixed workload, nothing extreme? → async/await is the modern default.
Cost per task
| Model | Per-task cost | Max tasks (rule of thumb) |
|---|---|---|
| OS threads | ~1 MB stack | ~10k |
| Event-loop callback | ~1 KB | 100k+ |
| async/await future | ~hundreds of bytes | 100k+ |
| Actor (Erlang) | ~2 KB | millions |
| Goroutine | ~2 KB growing | millions |
Concurrency is a tool for structure; parallelism is a tool for speed. Pick the concurrency model that matches the workload shape, then worry about cores.