Pipeline Anatomy
A DAG of stages, artifacts, and edges.
A CI pipeline is a directed acyclic graph (DAG) of jobs. Each node is a job; each edge is a dependency. Jobs without incoming edges run first, in parallel. Jobs with dependencies wait for their upstream nodes to turn green.
Every modern CI system — GitHub Actions, GitLab CI, CircleCI, Buildkite — is a variation on this pattern. The names differ; the graph does not.
Analogy
A pipeline is a car assembly line. Some stations can work in parallel — the doors are being painted over here while the seats are being stitched over there — but the painted door cannot go on until the frame is welded, and the seats cannot go in until the chassis is through the paint booth. The total time to finish a car isn't the sum of every station; it's the longest unbroken chain from raw steel to driving off the lot. Artifacts are the parts crates passed between stations. An "allow failure" station is a QA inspector who waves cars through even when they fail — and after a month, nobody trusts the final inspection either.
The canonical stages
Most pipelines have five logical stages:
| Stage | What happens | Typical duration |
|---|---|---|
| checkout | Clone the repo at the triggering commit | 2–5 s |
| install | Restore or download dependencies | 5–60 s (with cache) |
| build | Compile, bundle, generate artifacts | 10–120 s |
| test | Run the test suite (unit, integration, e2e) | 30 s – 10 min |
| deploy | Ship the artifact to an environment | 10 s – 5 min |
These stages are not fixed. A monorepo might fan out into a dozen build jobs after install. A library project might skip deploy entirely. The shape is determined by what the project needs.
Dependencies determine the critical path
The total wall-clock time of a pipeline is not the sum of all job durations. It is the longest path through the graph — the critical path.
If build takes 90 seconds and three parallel test shards each take 45 seconds, the critical path runs checkout → install → build → one shard: 90 + 45 = 135 seconds, plus checkout and install time. Adding a fourth shard does nothing to the wall time; shortening the build does.
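In GitHub Actions terms, that shard fan-out can be sketched with a matrix. This is an illustrative fragment, not a drop-in workflow — the shard count and the `pnpm test --shard` script are assumptions about the project:

```yaml
# Three test shards fan out after build.
# Wall time ≈ build + one shard, not build + three shards.
jobs:
  build:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - run: pnpm build
  test:
    needs: build
    runs-on: ubuntu-latest
    strategy:
      matrix:
        shard: [1, 2, 3]   # shard count is illustrative
    steps:
      - uses: actions/checkout@v4
      - run: pnpm test --shard ${{ matrix.shard }}   # hypothetical test script
```

All three shard jobs are independent nodes with the same upstream edge, so the scheduler runs them concurrently.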
Understanding the graph lets you reason about where to invest optimisation effort.
Artifacts
Jobs communicate via artifacts: files persisted between stages. The build job produces a compiled bundle; the deploy job consumes it. Without explicit artifact upload and download steps, each job starts from a blank workspace and has no memory of upstream work.
Artifacts cost storage and time to transfer. Be intentional: upload what downstream jobs genuinely need, nothing else.
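In GitHub Actions, that handoff is explicit: the build job uploads, the downstream job downloads. A sketch, assuming a `dist/` output directory and a hypothetical deploy script:

```yaml
jobs:
  build:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - run: pnpm build
      # Persist only the compiled output — not node_modules
      - uses: actions/upload-artifact@v4
        with:
          name: bundle
          path: dist/
  deploy:
    needs: build
    runs-on: ubuntu-latest
    steps:
      - uses: actions/download-artifact@v4
        with:
          name: bundle
          path: dist/
      - run: ./scripts/deploy.sh dist/   # deploy script is hypothetical
```

Without the upload/download pair, the deploy job's workspace would contain nothing but a fresh checkout.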
Failure propagation
When a job fails, all downstream dependents are cancelled or skipped. A checkout failure stops everything. A test failure typically stops deploy. This is correct behaviour: you want the pipeline to stop shipping broken code.
Some jobs are marked continue-on-error or allow_failure. Use this sparingly. A test job marked allow_failure is a test job that no longer enforces quality. The signal decays.
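Where a non-blocking job is genuinely justified (say, an experimental lint you are trialling), GitHub Actions spells it `continue-on-error`; GitLab CI uses `allow_failure`. A minimal sketch, with an assumed script name:

```yaml
jobs:
  experimental-lint:
    runs-on: ubuntu-latest
    continue-on-error: true   # failure won't fail the pipeline — use sparingly
    steps:
      - uses: actions/checkout@v4
      - run: pnpm lint:strict   # hypothetical experimental lint script
```

The job still runs and reports, but a red result no longer blocks downstream nodes — which is exactly why the signal decays if it stays this way for long.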
YAML is configuration, not code
Pipeline definitions are declarative YAML (or similar). You describe what should happen, not how the runner executes it. The CI platform reads the YAML, constructs the graph, and schedules jobs onto available runners.
```yaml
# GitHub Actions — minimal example
jobs:
  install:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - run: pnpm install
  build:
    needs: install
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4   # each job starts from a blank workspace
      - run: pnpm build
  test:
    needs: build
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - run: pnpm test
```
The needs key is the edge in the graph. install has no needs, so it runs immediately. build waits for install. test waits for build.
The hidden job: waiting for a runner
Elapsed time on CI includes queue time — waiting for a runner to become available. This is invisible in most timing dashboards. On a heavily loaded shared runner pool, a 3-minute pipeline can take 12 minutes wall time. Keep this in mind when evaluating "our pipeline is slow."
What to build next
Once you understand the graph, two levers dominate optimisation:
- Parallelism — split work across jobs that can run concurrently.
- Caching — avoid redoing expensive work when inputs haven't changed.
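As a first taste of the caching lever, a dependency cache keyed on the lockfile looks roughly like this in GitHub Actions — the store path and key format are illustrative assumptions:

```yaml
steps:
  - uses: actions/checkout@v4
  # Restore the pnpm store when the lockfile is unchanged;
  # a changed lockfile produces a new key and a fresh install.
  - uses: actions/cache@v4
    with:
      path: ~/.pnpm-store   # store location is an assumption
      key: pnpm-${{ runner.os }}-${{ hashFiles('pnpm-lock.yaml') }}
  - run: pnpm install
```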
The next level covers caching in depth.
Tools in the wild
- GitHub Actions (free tier): YAML workflows triggered by repo events; the dominant CI for OSS and most startups.
- GitLab CI/CD (free tier): first-class pipelines built into GitLab — DAG jobs, parent/child pipelines, manual gates.
- CircleCI: mature CI with strong macOS + ARM runner support and reusable orbs.
- Buildkite: SaaS control plane + your own runners — popular at companies with custom hardware needs.
- Jenkins (free, self-hosted): self-hosted CI workhorse; the Jenkinsfile DSL still runs huge enterprise pipelines.
- Dagger (free tier): code your pipelines in Go/Python/TS and run them locally or in any CI provider.