Shortest Path (Dijkstra)
Greedy relaxation over weighted graphs — the OSPF core.
You have a map of roads between cities, each with a known travel time. How long does it take to drive from A to Z if you always take the fastest route? That's the shortest-path problem on a weighted graph — and the algorithm that owns it is Dijkstra. It's how OSPF routes packets across the internet, how Google Maps plans your commute, and how game AI finds its way around obstacles.
Dijkstra designed this in 1956 — in about twenty minutes at an Amsterdam café, without pencil and paper, by his own telling — and published it in 1959. The algorithm he sketched is essentially what's still running inside Cisco routers six decades later: smarter data structures, same greedy skeleton.
Weighted graph, shortest what?
A graph is a set of nodes (vertices) connected by edges. A weighted graph attaches a non-negative number to each edge — a distance, a time, a cost, a latency. The "shortest path" from s to t is the path whose total edge-weight sum is minimised.
BFS (breadth-first search) solves the unweighted version — it finds the fewest-hops path. But on a weighted graph, BFS gives wrong answers. Consider:
A --1-- B --1-- C
 \             /
  \--- 10 ---/
BFS from A finds C in one hop via the direct A→C edge and calls that the answer — but that edge costs 10, while the two-hop route A→B→C costs only 2. BFS optimises hop count, which implicitly assumes every edge weight is 1. Once they aren't, you need Dijkstra.
The greedy invariant
Dijkstra maintains a tentative-distance table: dist[v] is "the shortest path from the start to v that I've found so far." Initially dist[start] = 0 and everything else is ∞.
The loop is almost embarrassingly simple:
- Pick the unvisited node with the smallest dist[v]. Mark it visited.
- For every edge (u, v, w) out of it, relax: if dist[u] + w < dist[v], set dist[v] = dist[u] + w.
- Repeat until every reachable node is visited.
The magic is step 1. The moment you mark a node visited, its dist is final — no future relaxation can improve it. Why? Because every unvisited node has a tentative distance at least as large as the one you just picked, and all edge weights are non-negative — so going through any of them can only produce a longer path. Greedy works here because the graph's structure plus non-negative weights enforce a consistent ordering.
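The loop can be sketched in a few lines of Python — a minimal version that finds the minimum by scanning the whole table each round (the O(V²) variant; the graph format and function name are illustrative, not any library's API):

```python
import math

def dijkstra_scan(graph, start):
    """Minimal Dijkstra: repeatedly pick the nearest unvisited node,
    then relax every edge out of it. graph maps node -> [(neighbor, weight)]."""
    dist = {v: math.inf for v in graph}
    dist[start] = 0
    visited = set()
    while len(visited) < len(graph):
        # Step 1: pick the unvisited node with the smallest tentative distance.
        u = min((v for v in graph if v not in visited), key=dist.get)
        if dist[u] == math.inf:
            break  # everything still unvisited is unreachable
        visited.add(u)  # dist[u] is now final
        # Step 2: relax every outgoing edge (u, v, w).
        for v, w in graph[u]:
            if dist[u] + w < dist[v]:
                dist[v] = dist[u] + w
    return dist

# The four-node worked example: A-B(1), B-C(2), C-D(1), A-D(5), B-D(2).
graph = {
    'A': [('B', 1), ('D', 5)],
    'B': [('A', 1), ('C', 2), ('D', 2)],
    'C': [('B', 2), ('D', 1)],
    'D': [('A', 5), ('B', 2), ('C', 1)],
}
print(dijkstra_scan(graph, 'A'))  # {'A': 0, 'B': 1, 'C': 3, 'D': 3}
```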
Worked example
Same graph, four nodes, start at A:
  1     2
A --- B --- C
 \    |    /
  5   2   1
   \  |  /
      D
Edges: A-B (1), B-C (2), C-D (1), A-D (5), B-D (2).
| step | visited | dist[A] | dist[B] | dist[C] | dist[D] | note |
|---|---|---|---|---|---|---|
| init | — | 0 | ∞ | ∞ | ∞ | seed A = 0 |
| 1 | A | 0 | 1 | ∞ | 5 | relax A→B, A→D |
| 2 | A, B | 0 | 1 | 3 | 3 | relax B→C, B→D (improved from 5 to 3) |
| 3 | A, B, C | 0 | 1 | 3 | 3 | relax C→D: 4 > 3, no improvement |
| 4 | all | 0 | 1 | 3 | 3 | done |
Notice step 2 — relaxing B→D improved D from 5 down to 3. That's the whole point. Dijkstra discovers that the cheap route through B is better than the direct A→D edge. And in step 3, relaxing C→D doesn't help because 3 + 1 = 4 is worse than the 3 we already have. Both kinds of frames are informative; the playground shows both.
Why a priority queue matters
The naive "scan the table for the minimum" takes O(V) per iteration. Over V iterations, that's O(V²) total — fine for tiny graphs, painful at scale.
Using a min-heap keyed on tentative distance, the "pick the nearest" step becomes O(log V). Each edge relaxation can push or decrease-key in O(log V). Total: O((V + E) log V), which is the standard Dijkstra complexity you'll see in textbooks.
A gotcha: most binary heaps (including Python's heapq and most JS priority-queue libs) don't support an efficient decrease-key operation. The common workaround is to push duplicates — every time you relax an edge, push a new (newDist, v) entry — and skip stale pops with an if d > dist[v]: continue check. A Fibonacci heap gives true amortised O(1) decrease-key, but almost nobody actually uses one in practice.
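That workaround can be sketched with Python's standard heapq (function name and graph format are illustrative):

```python
import heapq
import math

def dijkstra_heap(graph, start):
    """Dijkstra with a binary heap and no decrease-key: push a duplicate
    entry on every improvement, skip stale entries at pop time."""
    dist = {v: math.inf for v in graph}
    dist[start] = 0
    heap = [(0, start)]  # (tentative distance, node)
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist[u]:
            continue  # stale entry: u was already settled via a shorter path
        for v, w in graph[u]:
            nd = d + w
            if nd < dist[v]:
                dist[v] = nd
                heapq.heappush(heap, (nd, v))  # duplicate push, not decrease-key
    return dist
```

The stale check works because any superseded entry carries a distance strictly larger than the one already recorded in dist, so it can never pass the relaxation test anyway — skipping it early just avoids re-expanding a settled node.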
Dijkstra fails on negative weights
The greedy invariant ("the nearest unfinalized node has its final distance settled") breaks the moment edges can be negative. Consider:
A --3--> B --(-3)--> C
 \                   ^
  \------- 2 -------/
From A: B's tentative distance is 3 and C's is 2, so Dijkstra visits C first, calls dist[C] = 2 final, and moves on. But A→B→C costs 3 + (-3) = 0, which is better! Dijkstra misses it because it finalised C before ever relaxing B's outgoing edge.
For graphs with negative edges, use Bellman-Ford (O(VE), handles negative edges, detects negative cycles) or its practical cousin SPFA. Negative edges are rare in "physical distance" problems but appear in optimisation problems, arbitrage detection, and anything where an edge weight represents a gain rather than a cost.
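Bellman-Ford itself fits in a dozen lines — a sketch, assuming an explicit edge list of (u, v, w) triples:

```python
import math

def bellman_ford(nodes, edges, start):
    """Relax every edge |V|-1 times. If a pass |V| still finds an
    improvement, a negative cycle is reachable from start."""
    dist = {v: math.inf for v in nodes}
    dist[start] = 0
    for _ in range(len(nodes) - 1):
        for u, v, w in edges:
            if dist[u] + w < dist[v]:
                dist[v] = dist[u] + w
    # One extra pass: any further improvement proves a negative cycle.
    for u, v, w in edges:
        if dist[u] + w < dist[v]:
            raise ValueError("negative cycle reachable from start")
    return dist
```

On the failing example above — edges A→B (3), B→C (-3), A→C (2) — it correctly reports dist[C] = 0, because it keeps relaxing instead of finalising nodes greedily.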
Where this shows up
- Routing protocols. OSPF (RFC 2328) and IS-IS build link-state databases and run Dijkstra per router to compute the forwarding table. BGP uses a different approach (path-vector) because it has to think about policy, not just hop cost.
- Navigation. Google Maps, Waze, Apple Maps, every turn-by-turn navigator is Dijkstra at its core — usually augmented into A*, which is just Dijkstra plus a heuristic estimate of remaining distance. A* finds the same optimal path Dijkstra would but visits fewer nodes when the heuristic is well-calibrated.
- Game AI pathfinding. Same story as navigation — grid-based A* for 2D games, nav-mesh A* for 3D. The heuristic is usually Manhattan or Euclidean distance to the goal.
- Dependency solvers. "Smallest set of packages to upgrade to satisfy this requirement" maps naturally onto a weighted shortest-path problem if you model transitive deps right.
- Network latency SLAs. Pick the lowest-latency path through a provider's POPs; each edge weight is the measured P50 RTT.
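The Dijkstra-to-A* change mentioned above is small: the heap is keyed on g + h (distance so far plus heuristic estimate) instead of g alone. A sketch on a 4-connected grid with a Manhattan heuristic — the grid format and names are illustrative assumptions, not any particular engine's API:

```python
import heapq

def astar_grid(grid, start, goal):
    """A* on a 2D grid of 0 (open) / 1 (wall) cells. Identical to
    heap-based Dijkstra except entries are keyed on g + h, where h is
    the Manhattan distance to goal (admissible on a 4-connected grid).
    Returns the length of a shortest path, or None if unreachable."""
    rows, cols = len(grid), len(grid[0])
    h = lambda p: abs(p[0] - goal[0]) + abs(p[1] - goal[1])
    g = {start: 0}
    heap = [(h(start), start)]  # (g + h, cell)
    while heap:
        f, (r, c) = heapq.heappop(heap)
        if (r, c) == goal:
            return g[goal]
        if f > g[(r, c)] + h((r, c)):
            continue  # stale duplicate entry
        for nr, nc in ((r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)):
            if 0 <= nr < rows and 0 <= nc < cols and grid[nr][nc] == 0:
                ng = g[(r, c)] + 1
                if ng < g.get((nr, nc), float('inf')):
                    g[(nr, nc)] = ng
                    heapq.heappush(heap, (ng + h((nr, nc)), (nr, nc)))
    return None  # goal unreachable
```

With h ≡ 0 this degenerates to plain Dijkstra; the better-informed h is, the fewer cells get popped before the goal.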
Gotchas worth remembering
A handful of things go wrong in practice even once you have the algorithm right:
- Mutable edge weights. If your weights change while the algorithm runs (e.g. traffic updates mid-route), you have to either freeze a snapshot or use a dynamic shortest-path algorithm. Stock Dijkstra assumes static weights.
- Multi-source or multi-target. Run Dijkstra from a virtual source connected to all real sources with zero-weight edges — that's one Dijkstra instead of N. For all-pairs shortest paths on dense graphs, reach for Floyd-Warshall at O(V³) instead.
- The heap has duplicates. Any decrease-key-free heap implementation will accumulate stale entries. Memory-bound on pathological inputs; always pair with the stale-check pop pattern.
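The virtual-source trick from the multi-source bullet is a thin wrapper around ordinary Dijkstra — a self-contained sketch (the '_SRC' sentinel and names are illustrative):

```python
import heapq
import math

def multi_source_dijkstra(graph, sources):
    """One Dijkstra run from several sources at once: add a virtual node
    joined to every real source by a zero-weight edge, then run the
    ordinary algorithm from it. The result maps each node to its
    distance from the *nearest* source."""
    virtual = '_SRC'  # sentinel; must not collide with a real node name
    aug = dict(graph)
    aug[virtual] = [(s, 0) for s in sources]  # zero-weight edges to all sources
    dist = {v: math.inf for v in aug}
    dist[virtual] = 0
    heap = [(0, virtual)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist[u]:
            continue  # stale duplicate entry
        for v, w in aug[u]:
            if d + w < dist[v]:
                dist[v] = d + w
                heapq.heappush(heap, (d + w, v))
    del dist[virtual]  # drop the bookkeeping node from the answer
    return dist
```

Because the virtual edges cost zero, every real source starts the run at distance 0, which is exactly the multi-source initial condition — one heap, one pass, instead of N separate runs.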
Playground
Pick a graph. Pick a start node. Press Play, watch the distance table update one relaxation at a time. The current node is highlighted in the accent colour, the visited set fades to green, and the frontier (unvisited nodes with a finite tentative distance) shows up in the sidebar sorted by distance — that's the priority queue the real implementation uses.
Visualizer
The Dijkstra Animator panel draws the graph as an SVG node-and-edge diagram. Nodes are labeled with their current tentative distance. Edges are labeled with their weight. When the algorithm relaxes an edge, it highlights in the accent colour for one frame; when a node becomes visited, it turns green. Step back and forth to replay any moment in the run.