Image Layers & Build
Layer caching, multi-stage, distroless, and BuildKit.
A Docker image is a stack of read-only filesystem snapshots, and most filesystem-changing instructions in your Dockerfile add one. Building well is the discipline of keeping the slow, expensive layers cached and pushing the frequently changing ones to the end of the file.
Analogy
A Dockerfile is like a recipe shared across a kitchen: the bottom of the pan is the cuisine (base image), then the broth (system deps), then the proteins (language runtime), then the vegetables (your dependencies), and finally the seasoning (your source code, which changes every meal). If you change the seasoning, you don't redo the whole pan — but if you swap the broth, every layer above it has to be remade. The right order saves time. The wrong order makes every change expensive.
What a layer is
FROM pulls in the base image's layers, and each COPY, ADD, RUN, or WORKDIR adds a new immutable layer on top. ENV, LABEL, EXPOSE, CMD, and ENTRYPOINT only add metadata to the image config and create no filesystem layer.
FROM node:20-alpine # layer 1 (the base, ~50 MB)
WORKDIR /app # layer 2 (just sets working dir)
COPY package*.json ./ # layer 3 (manifest, tiny)
RUN npm ci # layer 4 (node_modules, large)
COPY . . # layer 5 (your source)
CMD ["node", "server.js"] # metadata, no layer
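Five layer-producing steps (the base image itself is usually made up of several layers). Each layer is content-addressed by a SHA-256 digest, and the build cache keys each COPY on the contents of the files it copies. If package.json and package-lock.json are unchanged since the last build, layer 3 is a cache hit, so Docker reuses the cached layer 4 as well and you skip the cost of npm ci.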
Layer cache invalidation
Cache invalidation is simple: if an instruction changes, or the files a COPY or ADD brings in change, that layer and every layer after it are rebuilt.
So the rule is: put rarely-changing instructions early; put frequently-changing instructions last.
The classic dev-loop killer:
# WRONG — your source changes every commit, invalidating the deps layer
FROM node:20-alpine
WORKDIR /app
COPY . .
RUN npm ci
CMD ["node", "server.js"]
A one-character source edit re-runs npm ci every time. Fix it by copying the manifest first:
# RIGHT — deps layer cached, only source layer rebuilds
FROM node:20-alpine
WORKDIR /app
COPY package*.json ./
RUN npm ci
COPY . .
CMD ["node", "server.js"]
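Now npm ci only runs when package.json or package-lock.json changes, so a source-only rebuild skips the dependency install entirely. That install is often the slowest step of the build.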
Multi-stage builds
In a single-stage build, everything the build needs stays in the image, because each RUN only adds layers on top of the previous ones. Compilers, build dependencies, source code: all of it ends up in the final image. Multi-stage builds break this:
# Stage 1 — build with all the tools.
FROM golang:1.22 AS builder
WORKDIR /src
COPY . .
RUN go build -o /out/app ./cmd/app
# Stage 2 — start fresh, copy only the artefact.
FROM gcr.io/distroless/static-debian12
COPY --from=builder /out/app /app
ENTRYPOINT ["/app"]
The final image contains only the binary. The Go toolchain (~600 MB) is gone. The source is gone. The intermediate object files are gone. You ship a 10 MB image instead of a 700 MB one.
You can have many stages, name them, and copy between them:
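# Stage "deps": install dependencies once.
FROM node:20 AS deps
WORKDIR /app
COPY package*.json ./
RUN npm ci
# Stage "builder": reuse the installed deps and compile the app.
FROM node:20 AS builder
WORKDIR /app
COPY --from=deps /app/node_modules ./node_modules
COPY . .
RUN npm run build
# Stage "runtime": ship only the build output and node_modules.
# Native addons built on Debian (glibc) may not load on Alpine (musl); match the bases if you depend on any.
FROM node:20-alpine AS runtime
WORKDIR /app
COPY --from=builder /app/dist ./dist
COPY --from=deps /app/node_modules ./node_modules
USER node
CMD ["node", "dist/server.js"]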
Distroless and FROM scratch
The smaller the runtime, the less attack surface, the faster the deploy.
| Base | Size | Has shell? | Use case |
|---|---|---|---|
| ubuntu:22.04 | ~75 MB | yes | "I need apt to debug", lazy |
| node:20-alpine | ~50 MB | yes (ash) | Default for Node, decent |
| gcr.io/distroless/nodejs20-debian12 | ~75 MB | NO | Production Node, no shell to exploit |
| gcr.io/distroless/static-debian12 | ~2 MB | NO | Static-linked Go/Rust binary |
| scratch | 0 bytes | NO | Static binary; you provide everything |
Distroless images strip everything the application doesn't need at runtime: no shell, no apt, no package manager. They are smaller and harder to compromise, but also harder to debug interactively, since you can't kubectl exec into the container and poke around with a shell.
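As a rough sketch of the distroless Node row in the table, the following pairs a regular Node build stage with a distroless runtime. It assumes an entry point named server.js, a package-lock.json next to package.json, and a .dockerignore that excludes node_modules; none of those come from a specific project.

```dockerfile
# Sketch only: server.js and the file layout are assumptions.
FROM node:20 AS deps
WORKDIR /app
COPY package*.json ./
# Install production dependencies only; dev tooling stays out of the runtime.
RUN npm ci --omit=dev

FROM gcr.io/distroless/nodejs20-debian12
WORKDIR /app
COPY --from=deps /app/node_modules ./node_modules
# Assumes .dockerignore excludes node_modules, so this copies source only.
COPY . .
# The distroless Node image already uses node as its entrypoint,
# so CMD only names the script to run.
CMD ["server.js"]
```

Both stages are Debian-based, so any native addons installed in the first stage are built against the same libc the runtime uses.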
FROM scratch starts with literally nothing. Pair it with a static binary (Go is the typical case) and you get images measured in single-digit MB:
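FROM golang:1.22 AS build
# Build outside /go (the image's GOPATH); a go.mod placed there makes the go command fail.
WORKDIR /src
COPY . .
# CGO_ENABLED=0 yields a statically linked binary that can run on scratch.
RUN CGO_ENABLED=0 go build -o /out/app .
FROM scratch
COPY --from=build /out/app /app
COPY --from=build /etc/ssl/certs/ca-certificates.crt /etc/ssl/certs/
ENTRYPOINT ["/app"]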
Note: copying ca-certificates.crt is mandatory if your binary makes HTTPS calls. There's no apt install ca-certificates to fall back on.
BuildKit — the modern builder
BuildKit has been the default builder since Docker Engine 23.0. The headline features:
Cache mounts
Persistent caches between builds, keyed by mount path. Saves redownloading dependencies even when the layer cache is invalidated.
# syntax=docker/dockerfile:1.7
RUN --mount=type=cache,target=/root/.npm npm ci
RUN --mount=type=cache,target=/root/.cache/pip pip install -r requirements.txt
RUN --mount=type=cache,target=/go/pkg/mod go mod download
When the layer cache misses, the build still runs npm ci, but packages come from the cached /root/.npm mount instead of the network. That can be dramatically faster whenever the layer cache is cold but the cache mount is warm.
Secret mounts
Pass a secret to a build step without baking it into a layer:
RUN --mount=type=secret,id=npmrc,target=/root/.npmrc npm ci
Run with docker build --secret id=npmrc,src=$HOME/.npmrc . — the secret is available during the RUN, gone after, never appears in any layer.
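When no target is given, BuildKit mounts the secret as a file under /run/secrets/ named after its id. A minimal sketch of reading it that way follows; the id api_token and the script fetch-private-assets.sh are hypothetical names chosen for illustration.

```dockerfile
# syntax=docker/dockerfile:1
FROM alpine:3.20
# Hypothetical helper script that expects API_TOKEN in its environment.
COPY fetch-private-assets.sh /usr/local/bin/
# No target= here, so the secret appears at /run/secrets/api_token
# for the duration of this RUN only.
RUN --mount=type=secret,id=api_token \
    API_TOKEN="$(cat /run/secrets/api_token)" fetch-private-assets.sh
```

A matching invocation would be docker build --secret id=api_token,src=./token.txt . where token.txt is whatever local file holds the value.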
Parallel stages
BuildKit walks the DAG and builds independent stages in parallel. Two FROM stages with no inter-dependencies build at the same time.
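To make the DAG concrete, here is a sketch with two stages that share no dependency and a final stage that consumes both. The ./web and ./api directories, and the assumption that the frontend build writes to dist, are hypothetical.

```dockerfile
# Hypothetical layout: a frontend in ./web and a Go API in ./api.
FROM node:20-alpine AS web
WORKDIR /web
COPY web/package*.json ./
RUN npm ci
COPY web/ .
# Assumed to write its output to /web/dist.
RUN npm run build

FROM golang:1.22 AS api
WORKDIR /src
COPY api/ .
RUN CGO_ENABLED=0 go build -o /out/api .

# Depends on both stages. "web" and "api" do not depend on each other,
# so BuildKit is free to run them concurrently.
FROM gcr.io/distroless/static-debian12
COPY --from=api /out/api /api
COPY --from=web /web/dist /static
ENTRYPOINT ["/api"]
```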
Multi-platform
docker buildx build --platform linux/amd64,linux/arm64 -t myapp .
One command, two architectures. Add --push and a registry-qualified tag to publish both under a single multi-arch tag. Essential for Apple Silicon dev → Intel cloud deploy.
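For compiled languages, a common way to keep multi-platform builds fast is to run the compiler on the build machine's own architecture and cross-compile, rather than emulating the target. The sketch below uses BuildKit's automatic BUILDPLATFORM, TARGETOS, and TARGETARCH arguments; the ./cmd/app path is reused from the earlier Go example and is an assumption about your layout.

```dockerfile
# syntax=docker/dockerfile:1
# Run the toolchain natively on the build machine...
FROM --platform=$BUILDPLATFORM golang:1.22 AS builder
# ...and let BuildKit tell us which platform is being targeted.
ARG TARGETOS
ARG TARGETARCH
WORKDIR /src
COPY . .
# Go cross-compiles via GOOS/GOARCH; CGO is disabled to keep the binary static.
RUN CGO_ENABLED=0 GOOS=$TARGETOS GOARCH=$TARGETARCH go build -o /out/app ./cmd/app

# The final stage is resolved per target platform.
FROM gcr.io/distroless/static-debian12
COPY --from=builder /out/app /app
ENTRYPOINT ["/app"]
```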
Common bugs
Cleanup in a separate RUN. This doesn't help:
RUN apt-get update && apt-get install -y curl
RUN rm -rf /var/lib/apt/lists/* # too late — previous layer already cached the bytes
The cleanup needs to be in the same RUN:
RUN apt-get update \
&& apt-get install -y curl \
&& rm -rf /var/lib/apt/lists/*
The image is built layer by layer; deleting bytes in a later layer doesn't shrink an earlier one.
COPY . . before deps install. Re-runs your dependency install on every source change. Always copy the manifest first.
Single-stage build with build tools. Shipping a 1 GB image because the compiler is in there. Use multi-stage.
Pinning by tag, not digest. FROM node:20 is a moving target. For reproducible builds, pin by digest:
FROM node:20-alpine@sha256:abc123...
CI tools like Renovate / Dependabot can keep digest pins fresh.
No .dockerignore. Without it, COPY . . ships your node_modules, .git, .env, and IDE configs into the image. Always include node_modules, .git, .env*, dist/ in .dockerignore.
Inspecting your image
dive is the go-to TUI for inspecting layers — what got added in each, where the bloat lives, what files are duplicated:
dive my-app:latest
docker history is the simpler version — shows layer-by-layer size and creation command:
docker history my-app:latest
If a layer looks unexpectedly large, dive will tell you which file bloated it. Often it's a forgotten .git or node_modules snuck in by COPY . ..
A reproducible-build checklist
- Pin your base image to a digest, not a moving tag.
- Use .dockerignore aggressively; it's a one-line bug fix you'll thank yourself for.
- Multi-stage to strip build tools.
- BuildKit cache mounts for package managers.
- Layer order: base → system deps → language runtime → app deps → source.
- Run as non-root. USER node, USER nobody, USER 65532.
- Scan with Trivy in CI before pushing.
Hit those and your images will be smaller, faster, and safer than much of what's published to Docker Hub.
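Several of these items fit in one small Dockerfile. The sketch below is illustrative only: it assumes a Node app whose entry point is server.js, and it leaves the digest as a comment because the real value depends on the image you pull.

```dockerfile
# syntax=docker/dockerfile:1.7
# Base: in a real build, append @sha256:<digest> to pin it.
FROM node:20-alpine
WORKDIR /app
# App deps before source, with a BuildKit cache mount for npm's download cache.
COPY package*.json ./
RUN --mount=type=cache,target=/root/.npm npm ci --omit=dev
# Source last; hand ownership to the unprivileged user the node image ships with.
COPY --chown=node:node . .
USER node
CMD ["node", "server.js"]
```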
Tools in the wild
- BuildKit (library, free tier): Modern Docker builder with parallel stages, cache mounts, and secret mounts.
- dive (CLI, free tier): TUI for inspecting image layers; pinpoints which COPY blew up your image size.
- Distroless (library, free tier): Google's minimal runtime images with no shell, no apt, just glibc + runtime.
- Trivy (CLI, free tier): Aqua's vulnerability scanner; runs in CI in seconds against any image.
- buildx / bake (CLI, free tier): Multi-platform, multi-target builds defined in HCL/JSON.