Monorepo Pipelines
Affected-only builds, build graphs, remote cache.
A monorepo's value proposition is simple: one repo, many packages, atomic refactors across them. The cost, if you don't have the right pipeline tooling, is that every PR rebuilds everything, and total CI minutes grow roughly quadratically with team size: more contributors open more PRs, and each PR rebuilds a larger codebase.
The rebuild-everything trap
The naive monorepo CI:
- run: pnpm install
- run: pnpm test # runs every test in every package
- run: pnpm build # builds every package
For 5 packages, that's annoying. For 200 packages, that's a 90-minute CI run on every PR. For 2,000 packages (Google, Meta scale), that's never.
The fix: build only the affected packages — the ones whose source actually changed, plus everything that transitively depends on them.
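The first half of that fix, finding the packages whose source changed, can be sketched as a path lookup. This is a minimal illustration, not how any particular tool does it; `packages_for_files` and the `package_roots` layout are hypothetical names chosen for the example.

```python
from pathlib import PurePosixPath

def packages_for_files(changed_files, package_roots):
    """Map changed file paths to the packages that own them.

    package_roots is a hypothetical mapping of package name -> directory,
    e.g. {"ui-lib": "packages/ui-lib"}. Files outside every root are
    returned separately so the caller can decide how to treat them.
    """
    owners = set()
    unowned = []
    for changed in changed_files:
        parts = PurePosixPath(changed).parts
        best = None  # (root depth, package name) of the deepest matching root
        for name, root in package_roots.items():
            root_parts = PurePosixPath(root).parts
            # Compare whole path components so "ui-lib-extras" != "ui-lib".
            if parts[:len(root_parts)] == root_parts:
                candidate = (len(root_parts), name)
                if best is None or candidate > best:
                    best = candidate
        if best is None:
            unowned.append(changed)
        else:
            owners.add(best[1])
    return owners, unowned
```

Files that belong to no package (a root README, a shared config) need a policy of their own; a conservative choice is to treat them as affecting everything.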
Build graphs — the core data structure
Every monorepo build tool models packages as a directed graph:
app-web ───▶ ui-lib ───▶ design-tokens
   │            │              │
   ▼            ▼              ▼
api-client   css-vars     chroma-utils
   │
   ▼
types-shared
Nodes are packages; edges are dependencies. When you change design-tokens, the affected set is design-tokens, ui-lib, app-web — every package that transitively depends on it. CI runs build/test only on that subgraph.
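A minimal sketch of that computation: invert the dependency edges, then walk outward from the changed packages. The edge list is one reading of the diagram above and the function name `affected` is made up for this example; real tools compute the same set with their own data structures.

```python
from collections import deque

# Each package maps to the packages it depends on (assumed from the diagram).
DEPENDS_ON = {
    "app-web": ["ui-lib", "api-client"],
    "ui-lib": ["design-tokens", "css-vars"],
    "design-tokens": ["chroma-utils"],
    "api-client": ["types-shared"],
    "css-vars": [],
    "chroma-utils": [],
    "types-shared": [],
}

def affected(changed, depends_on):
    """Return the changed packages plus everything that transitively depends on them."""
    # Invert the edges: dependency -> packages that depend on it.
    dependents = {}
    for pkg, deps in depends_on.items():
        for dep in deps:
            dependents.setdefault(dep, []).append(pkg)

    seen = set(changed)
    queue = deque(changed)
    while queue:  # breadth-first walk up the reverse edges
        pkg = queue.popleft()
        for parent in dependents.get(pkg, []):
            if parent not in seen:
                seen.add(parent)
                queue.append(parent)
    return seen

print(sorted(affected({"design-tokens"}, DEPENDS_ON)))
# ['app-web', 'design-tokens', 'ui-lib']
```

Note that chroma-utils is not in the result: it is a dependency of design-tokens, not a dependent, so a change to design-tokens cannot alter its output.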
The tool computes this from a manifest in each package:
// package.json (Turborepo / pnpm workspaces)
{
"name": "ui-lib",
"dependencies": {
"design-tokens": "workspace:*"
}
}
Or in Bazel-style explicit-deps form:
# BUILD.bazel
py_library(
name = "ui_lib",
srcs = glob(["*.py"]),
deps = ["//design_tokens:design_tokens_lib"],
)
Bazel-style is stricter: you can't import from another package unless you declare the dep. This is annoying but catches a class of bugs where "it builds in CI but breaks locally" because the implicit dep wasn't reflected in the build graph.
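In the looser package.json world, a similar check can be approximated by comparing what each package declares with what it actually imports. The sketch below assumes the imports have already been extracted by some scanner; `imports_by_package` is a hypothetical input, and both function names are invented for illustration.

```python
def declared_deps(manifest, workspace_names):
    """Workspace packages that a parsed package.json declares as dependencies."""
    deps = {}
    for field in ("dependencies", "devDependencies"):
        deps.update(manifest.get(field, {}))
    return {name for name in deps if name in workspace_names}

def undeclared_imports(manifests, imports_by_package):
    """Return {package: [workspace packages it imports without declaring]}."""
    names = {m["name"] for m in manifests}
    problems = {}
    for manifest in manifests:
        pkg = manifest["name"]
        # Only imports of other workspace packages matter for the build graph.
        used = {i for i in imports_by_package.get(pkg, ()) if i in names and i != pkg}
        missing = used - declared_deps(manifest, names)
        if missing:
            problems[pkg] = sorted(missing)
    return problems

manifests = [
    {"name": "ui-lib", "dependencies": {"design-tokens": "workspace:*"}},
    {"name": "design-tokens"},
    {"name": "css-vars"},
]
imports = {"ui-lib": ["design-tokens", "css-vars", "react"]}
print(undeclared_imports(manifests, imports))
# {'ui-lib': ['css-vars']}
```

Running a check like this in CI turns a silent gap in the build graph into a visible failure, which is roughly what Bazel gives you by construction.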
The cache key
Each build tool computes a cache key per package. Roughly:
cache_key(pkg) = hash(
source_files(pkg),
cache_key(direct_dependencies) # recursive!
)
If the source hasn't changed and no dependency's cache key has changed, the cache key is the same — and the previous build output can be reused.
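The "transitive" part is critical. If you change design-tokens, the cache key of every package that depends on it, directly or transitively, changes too: design-tokens, ui-lib, and app-web all rebuild, while unrelated packages such as api-client keep their keys. If you change app-web, only app-web rebuilds; nothing depends on it.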
$ turbo run build --filter=...[main]
Tasks: 23 successful, 23 total
Cached: 18 cached, 23 total ← 18 hits, 5 actual builds
Time: 12s
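The recursive cache key definition above can be sketched in a few lines. This is an illustration under simplifying assumptions (an acyclic graph, file contents held in memory, SHA-256 as the hash); the names `cache_key` and `all_keys` and the sample data are invented, and real tools hash more inputs than this.

```python
import hashlib

def cache_key(pkg, sources, depends_on, memo):
    """Hash a package's files together with the keys of its direct dependencies."""
    if pkg in memo:
        return memo[pkg]
    h = hashlib.sha256()
    for path in sorted(sources[pkg]):  # sorted for a stable, order-independent key
        h.update(path.encode() + b"\0" + sources[pkg][path] + b"\0")
    for dep in sorted(depends_on[pkg]):
        # Recursion: a dependency's key already covers that dependency's own deps.
        h.update(cache_key(dep, sources, depends_on, memo).encode())
    memo[pkg] = h.hexdigest()
    return memo[pkg]

def all_keys(sources, depends_on):
    memo = {}
    return {pkg: cache_key(pkg, sources, depends_on, memo) for pkg in depends_on}

DEPENDS_ON = {
    "app-web": ["ui-lib", "api-client"],
    "ui-lib": ["design-tokens"],
    "design-tokens": [],
    "api-client": [],
}
SOURCES = {pkg: {"index.ts": b"v1"} for pkg in DEPENDS_ON}

before = all_keys(SOURCES, DEPENDS_ON)
edited = {**SOURCES, "design-tokens": {"index.ts": b"v2"}}
after = all_keys(edited, DEPENDS_ON)
print(sorted(p for p in DEPENDS_ON if before[p] != after[p]))
# ['app-web', 'design-tokens', 'ui-lib']
```

Only the edited package and its dependents get new keys; api-client keeps its old key and would be a cache hit.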
Remote cache — sharing builds across machines
The same hash that powers local affected-only also enables a remote cache. Build the package once on your laptop; CI re-uses your build via the remote cache.
turbo build (laptop)        → write key K1 → output to remote cache
        ↓
turbo build (CI)            → read key K1  → cache hit, no rebuild
        ↓
turbo build (other laptop)  → read key K1  → cache hit
The cache stores build outputs (compiled code, test results, built containers). Pulling from a remote cache is an HTTP GET; if it's cheaper than the build itself (it almost always is), you save time.
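The read-through pattern behind this is small. The sketch below uses an in-memory dictionary as a stand-in for the HTTP store, so `RemoteCache` and `build_with_cache` are hypothetical names rather than any tool's API; a real client would issue GET and PUT requests against the cache server instead.

```python
class RemoteCache:
    """In-memory stand-in for a remote blob store keyed by cache key."""

    def __init__(self):
        self._blobs = {}

    def get(self, key):  # would be an HTTP GET in a real client
        return self._blobs.get(key)

    def put(self, key, blob):  # would be an HTTP PUT in a real client
        self._blobs[key] = blob

def build_with_cache(key, build, cache):
    """Return (output, hit). Run the build only when the key is missing."""
    cached = cache.get(key)
    if cached is not None:
        return cached, True
    output = build()
    cache.put(key, output)
    return output, False

builds_run = []

def compile_ui_lib():
    builds_run.append("ui-lib")
    return b"compiled ui-lib"

shared = RemoteCache()
_, laptop_hit = build_with_cache("K1", compile_ui_lib, shared)  # laptop: miss, builds
_, ci_hit = build_with_cache("K1", compile_ui_lib, shared)      # CI: hit, no build
print(laptop_hit, ci_hit, len(builds_run))
# False True 1
```

The scheme is only as trustworthy as the key: if two different sets of inputs can produce the same key, the second machine receives the wrong artifact.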
Tools and their cache offerings:
- Turborepo: Remote Cache (Vercel-hosted, free up to a quota) or self-hosted via a server that implements Turborepo's Remote Cache API, typically backed by object storage such as S3.
- Nx: Nx Cloud (hosted) or self-hosted.
- Bazel: remote caching and remote execution over the Remote Execution API (REAPI, often loosely called RBE); self-hosted (BuildBuddy, BuildGrid) or hosted.
- Pants: Native remote cache via the same gRPC Remote Execution API.
When affected-only fails
Affected-only mode breaks down if any of these hold:
- Implicit deps: package A imports from B but doesn't declare it. A change in B doesn't invalidate A's cache → CI passes, prod breaks.
- Filesystem-side-effect deps: tests that read a fixture file that lives outside the package. A change to the fixture doesn't invalidate the test cache.
- Tool-version drift: the same source builds differently with a different compiler version. Hash the toolchain version into the cache key.
- External dependencies: a version range silently resolves to a newer release, or a registry serves different contents under the same version. Lock files mitigate; you must enforce them.
The fix for all four is hermeticity — the build's inputs include EVERYTHING that influences the output. Bazel enforces this strictly; Turborepo and Nx are looser by default but improving.
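One way to picture hermeticity is a cache key that takes every one of those inputs as an explicit argument. The sketch below is illustrative only: `hermetic_key` and its parameters are invented for this example, and it assumes the toolchain can be summarized as a version string and that out-of-package fixtures are declared by path.

```python
import hashlib

def hermetic_key(sources, dep_keys, toolchain, lockfile, extra_inputs=None):
    """Hash every declared input: sources, dep keys, toolchain, lockfile, fixtures.

    sources and extra_inputs map path -> file bytes, dep_keys is a list of
    dependency cache keys, toolchain is a version string, lockfile is bytes.
    """
    h = hashlib.sha256()

    def add(label, data):
        # Label each input so different kinds of input cannot collide.
        h.update(label.encode() + b"\0" + data + b"\0")

    for path in sorted(sources):
        add("src:" + path, sources[path])
    for key in sorted(dep_keys):
        add("dep", key.encode())
    add("toolchain", toolchain.encode())  # guards against tool-version drift
    add("lockfile", lockfile)             # guards against external dependency changes
    for path in sorted(extra_inputs or {}):
        add("extra:" + path, extra_inputs[path])  # fixtures outside the package
    return h.hexdigest()

base = hermetic_key({"a.ts": b"x"}, ["k1"], "node-20.11.0", b"lock-v1",
                    {"fixtures/users.json": b"[]"})
drifted = hermetic_key({"a.ts": b"x"}, ["k1"], "node-22.0.0", b"lock-v1",
                       {"fixtures/users.json": b"[]"})
print(base != drifted)
# True
```

An input that is not passed in cannot invalidate the key, which is exactly the failure in the four cases listed above.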
A real monorepo CI
Here's a typical Turborepo CI:
name: ci
on: [push, pull_request]
jobs:
  build:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
        with: { fetch-depth: 0 } # full history, so origin/main exists for affected detection
      - uses: pnpm/action-setup@v3 # assumes pnpm is pinned via "packageManager" in package.json
      - run: pnpm install --frozen-lockfile
      - name: Lint, build, test affected packages
        env:
          TURBO_TOKEN: ${{ secrets.TURBO_TOKEN }}
          TURBO_TEAM: my-team
        run: |
          pnpm turbo run lint build test \
            --filter=...[origin/main] \
            --concurrency=10
Three things make this fast:
- --filter=...[origin/main]: only run on packages that changed since main, plus their dependents.
- TURBO_TOKEN (together with TURBO_TEAM) enables remote cache lookups, so most tasks hit the cache.
- --concurrency=10: run up to 10 tasks in parallel; tune this to the number of CPUs on the runner.
For a typical PR touching 1-3 packages out of 100, this finishes in under 2 minutes. The same code in a "rebuild everything" monorepo would run for 30+ minutes.
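Concurrency only helps as far as the graph allows: a package cannot start until the affected packages it depends on have finished. The sketch below groups an affected set into waves that could run in parallel, using Python's standard `graphlib`. It illustrates the constraint only; the article does not describe Turborepo's actual scheduler, and `build_waves` and the edge list are assumptions made for the example.

```python
from graphlib import TopologicalSorter

DEPENDS_ON = {
    "app-web": ["ui-lib", "api-client"],
    "ui-lib": ["design-tokens", "css-vars"],
    "design-tokens": ["chroma-utils"],
    "api-client": ["types-shared"],
    "css-vars": [],
    "chroma-utils": [],
    "types-shared": [],
}

def build_waves(depends_on, affected):
    """Group affected packages into waves; each wave can run in parallel."""
    # Keep only edges between affected packages; other deps come from cache.
    subgraph = {
        pkg: [dep for dep in depends_on[pkg] if dep in affected]
        for pkg in affected
    }
    sorter = TopologicalSorter(subgraph)
    sorter.prepare()
    waves = []
    while sorter.is_active():
        ready = sorted(sorter.get_ready())  # everything whose deps are done
        waves.append(ready)
        sorter.done(*ready)
    return waves

print(build_waves(DEPENDS_ON, {"design-tokens", "ui-lib", "app-web"}))
# [['design-tokens'], ['ui-lib'], ['app-web']]
```

A change deep in the graph produces a long chain of single-package waves, so extra concurrency buys little there; wide, shallow changes benefit the most.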
When NOT to use a monorepo
Monorepo tooling has overhead. If:
- You have fewer than 10 packages.
- Packages release independently with different versioning policies.
- You don't need atomic cross-package refactors.
...you might be fine with multi-repo + a private package registry. The crossover point is somewhere around 20-30 packages or 10+ contributors.
Summary
- Build graphs make monorepo CI feasible. Without them, you rebuild everything.
- Affected-only runs CI on the subgraph that depends on changed files.
- Cache keys hash source + transitive deps; matches mean cache hits.
- Remote caches share build outputs across CI runs and laptops.
- Hermeticity (every input declared, Bazel-style) prevents stale cache hits of the "CI passes, prod breaks" kind.
- Beyond roughly 20-30 packages or 10+ contributors, the tooling typically pays for itself quickly.
Tools in the wild
- Turborepo (library, free tier): Vercel's lightweight monorepo orchestrator for JS/TS; incremental, remote-cacheable.
- Nx (library, free tier): Polyglot monorepo build system with affected-graph and a hosted remote cache.
- Bazel (library, free tier): Google's build system; language-agnostic, hermetic, remote-build-execution.
- Pants (library, free tier): Python-first monorepo build system; modern incremental + remote caching.
- Buck2 (library, free tier): Meta's open-source monorepo build system. Rust-rewritten, blazingly fast.