Image Layers & Build

Layer caching, multi-stage, distroless, and BuildKit.

A Docker image is a stack of read-only filesystem snapshots. Each Dockerfile instruction that changes the filesystem creates one. Building well is the discipline of keeping the slow, expensive layers cached and the cheap, frequently changing layers near the top.

Analogy

A Dockerfile is like a recipe shared across a kitchen: the bottom of the pan is the cuisine (base image), then the broth (system deps), then the proteins (language runtime), then the vegetables (your dependencies), and finally the seasoning (your source code, which changes every meal). If you change the seasoning, you don't redo the whole pan — but if you swap the broth, every layer above it has to be remade. The right order saves time. The wrong order makes every change expensive.

What a layer is

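RUN, COPY, and ADD each create a new immutable filesystem layer. FROM brings in the layers of the base image, and WORKDIR can add a tiny layer when it has to create the directory. ENV, LABEL, EXPOSE, CMD, and ENTRYPOINT add metadata only, with no new filesystem layer.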

FROM node:20-alpine        # layer 1 (the base, ~50 MB)
WORKDIR /app               # layer 2 (just sets working dir)
COPY package*.json ./      # layer 3 (manifest, tiny)
RUN npm ci                 # layer 4 (node_modules, large)
COPY . .                   # layer 5 (your source)
CMD ["node", "server.js"]  # metadata, no layer

Five filesystem layers, counting the base image as one. Each layer is content-addressed by a SHA-256 digest. If package.json and package-lock.json are unchanged since the last build, the COPY in layer 3 hits the cache, so Docker reuses the cached layer 4 as well and you skip the npm ci cost.

Layer cache invalidation

Cache invalidation is simple: if an instruction changes, or the files a COPY or ADD brings in change, that layer and every layer after it are rebuilt.

So the rule is: put rarely-changing instructions early; put frequently-changing instructions last.

The classic dev-loop killer:

# WRONG — your source changes every commit, invalidating the deps layer
FROM node:20-alpine
WORKDIR /app
COPY . .
RUN npm ci
CMD ["node", "server.js"]

A one-character source edit re-runs npm ci every time. Fix it by copying the manifest first:

# RIGHT — deps layer cached, only source layer rebuilds
FROM node:20-alpine
WORKDIR /app
COPY package*.json ./
RUN npm ci
COPY . .
CMD ["node", "server.js"]

Now npm ci only runs when package.json or package-lock.json changes. Daily dev rebuilds after a source edit are often several times faster.
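
The same ordering applies outside Node. A minimal sketch for a Python service, assuming a requirements.txt at the repository root and a hypothetical entry point named app.py (neither appears in the example above):

```dockerfile
FROM python:3.12-slim
WORKDIR /app
# Copy only the dependency manifest first, so the cache for the install
# step depends on requirements.txt alone.
COPY requirements.txt ./
# Re-runs only when requirements.txt changes.
RUN pip install --no-cache-dir -r requirements.txt
# Source edits invalidate the cache from here down, not the install above.
COPY . .
# app.py is a placeholder entry point for illustration.
CMD ["python", "app.py"]
```

The shape is identical: manifest, install, then source.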

Multi-stage builds

In a single-stage build, everything the build needs ends up in the final image: compilers, build dependencies, source code. Layers only accumulate, so a later RUN cannot remove what an earlier one added. Multi-stage builds break this:

# Stage 1 — build with all the tools.
FROM golang:1.22 AS builder
WORKDIR /src
COPY . .
# CGO_ENABLED=0 forces a statically linked binary; distroless/static ships no libc.
RUN CGO_ENABLED=0 go build -o /out/app ./cmd/app

# Stage 2 — start fresh, copy only the artefact.
FROM gcr.io/distroless/static-debian12
COPY --from=builder /out/app /app
ENTRYPOINT ["/app"]

The final image contains only the binary. The Go toolchain (~600 MB) is gone. The source is gone. The intermediate object files are gone. You ship a 10 MB image instead of a 700 MB one.
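
Layer ordering still matters inside the builder stage. The sketch below combines both ideas, assuming the module keeps go.mod and go.sum at its root and uses the same ./cmd/app layout as the example above:

```dockerfile
FROM golang:1.22 AS builder
WORKDIR /src
# Dependency manifests first: the download layer is reused until they change.
COPY go.mod go.sum ./
RUN go mod download
# Source last: edits here re-run only the compile step.
COPY . .
# Static binary, because the distroless "static" base has no libc.
RUN CGO_ENABLED=0 go build -o /out/app ./cmd/app

FROM gcr.io/distroless/static-debian12
COPY --from=builder /out/app /app
ENTRYPOINT ["/app"]
```

A source-only change then skips the module download and goes straight to compiling.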

You can have many stages, name them, and copy between them:

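# Every stage uses the same base family, so native modules built in
# "deps" still load in "runtime".
FROM node:20-alpine AS deps
WORKDIR /app
COPY package*.json ./
RUN npm ci

FROM node:20-alpine AS builder
WORKDIR /app
COPY --from=deps /app/node_modules ./node_modules
COPY . .
RUN npm run build

FROM node:20-alpine AS runtime
WORKDIR /app
COPY --from=builder /app/dist ./dist
COPY --from=deps /app/node_modules ./node_modules
USER node
CMD ["node", "dist/server.js"]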

Distroless and FROM scratch

The smaller the runtime, the less attack surface, the faster the deploy.

Base                                  Size      Has shell?   Use case
ubuntu:22.04                          ~75 MB    yes          "I need apt to debug", lazy
node:20-alpine                        ~50 MB    yes (ash)    Default for Node, decent
gcr.io/distroless/nodejs20-debian12   ~75 MB    NO           Production Node, no shell to exploit
gcr.io/distroless/static-debian12     ~2 MB     NO           Static-linked Go/Rust binary
scratch                               0 bytes   NO           Static binary; you provide everything

Distroless images strip out everything except the language runtime and its dependencies: no shell, no package manager. They are smaller and harder to compromise, but also harder to debug interactively (you can't kubectl exec into a shell and poke around).

FROM scratch starts with literally nothing. Pair it with a static binary (Go is the typical case) and you get images measured in single-digit MB:

FROM golang:1.22 AS build
# Build outside /go (the GOPATH), where a go.mod is not allowed.
WORKDIR /src
COPY . .
RUN CGO_ENABLED=0 go build -o /out/app .

FROM scratch
COPY --from=build /out/app /app
COPY --from=build /etc/ssl/certs/ca-certificates.crt /etc/ssl/certs/
ENTRYPOINT ["/app"]

Note: copying ca-certificates.crt is mandatory if your binary makes HTTPS calls. There's no apt install ca-certificates to fall back on.
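
The same "you provide everything" rule applies to users. A scratch image has no /etc/passwd, so a named user cannot be resolved, but a numeric UID works. A sketch of a non-root scratch image, where 65532 is simply an unprivileged ID chosen for illustration:

```dockerfile
FROM golang:1.22 AS build
WORKDIR /src
COPY . .
RUN CGO_ENABLED=0 go build -o /out/app .

FROM scratch
COPY --from=build /etc/ssl/certs/ca-certificates.crt /etc/ssl/certs/
COPY --from=build /out/app /app
# No passwd file exists here, so give a numeric UID:GID instead of a name.
USER 65532:65532
ENTRYPOINT ["/app"]
```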

BuildKit — the modern builder

BuildKit has been the default builder since Docker Engine 23.0. The headline features:

Cache mounts

Persistent caches between builds, keyed by mount path. Saves redownloading dependencies even when the layer cache is invalidated.

# syntax=docker/dockerfile:1.7
RUN --mount=type=cache,target=/root/.npm npm ci
RUN --mount=type=cache,target=/root/.cache/pip pip install -r requirements.txt
RUN --mount=type=cache,target=/go/pkg/mod go mod download

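When the layer cache is invalidated, the build still runs npm ci, but the package downloads come from the cached /root/.npm mount instead of the network. The gain shows up when the layer cache is cold but the cache mount is warm, and it can be large for dependency-heavy projects.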
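
In a complete Dockerfile, the syntax directive has to be the very first line to take effect. A minimal sketch that adds a cache mount to the manifest-first Node layout, assuming npm runs as root so its cache lives in /root/.npm:

```dockerfile
# syntax=docker/dockerfile:1
FROM node:20-alpine
WORKDIR /app
COPY package*.json ./
# The cache mount exists only while this RUN executes. Its contents are
# not written into the layer, so they add nothing to the image size.
RUN --mount=type=cache,target=/root/.npm npm ci
COPY . .
CMD ["node", "server.js"]
```

The layer cache and the cache mount are independent: the first decides whether npm ci runs at all, the second decides how much it has to download when it does.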

Secret mounts

Pass a secret to a build step without baking it into a layer:

RUN --mount=type=secret,id=npmrc,target=/root/.npmrc npm ci

Run with docker build --secret id=npmrc,src=$HOME/.npmrc . — the secret is available during the RUN, gone after, never appears in any layer.
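
When target= is omitted, BuildKit mounts the secret at /run/secrets/<id>. A small sketch, using a hypothetical secret id api_token and a hypothetical local file token.txt, that prints only the byte count of the secret during the build:

```dockerfile
# syntax=docker/dockerfile:1
FROM alpine:3.20
# Build with: docker build --secret id=api_token,src=./token.txt .
# required=true makes the build fail early if --secret was not passed.
RUN --mount=type=secret,id=api_token,required=true \
    wc -c < /run/secrets/api_token
```

Reading the secret from a file inside the RUN, rather than passing it as ARG or ENV, is what keeps it out of the image history.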

Parallel stages

BuildKit walks the DAG and builds independent stages in parallel. Two FROM stages with no inter-dependencies build at the same time.
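
A sketch of a Dockerfile whose stage graph allows this, assuming a hypothetical repository layout with a web/ directory (npm project that builds to dist/) and a server/ directory (Go module):

```dockerfile
# syntax=docker/dockerfile:1
FROM node:20-alpine AS frontend
WORKDIR /web
COPY web/ ./
RUN npm ci && npm run build

FROM golang:1.22 AS backend
WORKDIR /src
COPY server/ ./
RUN CGO_ENABLED=0 go build -o /out/server .

# The final stage depends on both stages above. Neither depends on the
# other, so BuildKit is free to build them concurrently.
FROM gcr.io/distroless/static-debian12
COPY --from=backend /out/server /server
COPY --from=frontend /web/dist /static
ENTRYPOINT ["/server"]
```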

Multi-platform

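docker buildx build --platform linux/amd64,linux/arm64 -t registry.example.com/myapp:latest --push .

One command, two architectures, both pushed under a single multi-arch tag. The --push flag is what sends the result to the registry, and registry.example.com/myapp is a placeholder for your own repository. Essential for Apple Silicon dev → Intel cloud deploy.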
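
Building a non-native platform normally means running the build steps under emulation, which is slow. For languages that cross-compile, one common pattern is to run the compiler natively and target the other architecture. A sketch for Go, using BuildKit's automatic BUILDPLATFORM, TARGETOS, and TARGETARCH arguments and assuming a main package at the repository root:

```dockerfile
# syntax=docker/dockerfile:1
# Run the toolchain on the build machine's own architecture.
FROM --platform=$BUILDPLATFORM golang:1.22 AS build
# Set by BuildKit for each platform listed in --platform.
ARG TARGETOS
ARG TARGETARCH
WORKDIR /src
COPY . .
RUN CGO_ENABLED=0 GOOS=$TARGETOS GOARCH=$TARGETARCH go build -o /out/app .

# This stage is resolved once per target platform.
FROM gcr.io/distroless/static-debian12
COPY --from=build /out/app /app
ENTRYPOINT ["/app"]
```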

Common bugs

Cleanup in a separate RUN. This doesn't help:

RUN apt-get update && apt-get install -y curl
RUN rm -rf /var/lib/apt/lists/*    # too late: the previous layer already contains the bytes

The cleanup needs to be in the same RUN:

RUN apt-get update \
    && apt-get install -y curl \
    && rm -rf /var/lib/apt/lists/*

The image is built layer by layer; deleting bytes in a later layer doesn't shrink an earlier one.

COPY . . before deps install. Re-runs your dependency install on every source change. Always copy the manifest first.

Single-stage build with build tools. Shipping a 1 GB image because the compiler is in there. Use multi-stage.

Pinning by tag, not digest. FROM node:20 is a moving target. For reproducible builds, pin by digest:

FROM node:20-alpine@sha256:abc123...

CI tools like Renovate / Dependabot can keep digest pins fresh.

No .dockerignore. Without it, COPY . . ships your node_modules, .git, .env, and IDE configs into the image. Always include node_modules, .git, .env*, dist/ in .dockerignore.

Inspecting your image

dive is the go-to TUI for inspecting layers — what got added in each, where the bloat lives, what files are duplicated:

dive my-app:latest

docker history is the simpler version — shows layer-by-layer size and creation command:

docker history my-app:latest

If a layer looks unexpectedly large, dive will tell you which file bloated it. Often it's a forgotten .git or node_modules snuck in by COPY . ..

A reproducible-build checklist

  • Pin your base image to a digest, not a moving tag.
  • Use .dockerignore aggressively; it's a one-line bug fix you'll thank yourself for.
  • Multi-stage to strip build tools.
  • BuildKit cache mounts for package managers.
  • Layer order: base → system deps → language runtime → app deps → source.
  • Run as non-root. USER node, USER nobody, USER 65532.
  • Scan with Trivy in CI before pushing.

Hit those and your images will be smaller, faster, and safer than most of what's published to Docker Hub.
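
As a rough illustration, several checklist items fit in one small Dockerfile. This is a sketch under assumptions, not a template: it assumes a Node service started from server.js, and the digest is left for you to fill in from your own registry.

```dockerfile
# syntax=docker/dockerfile:1

# Pin the base: append @sha256:<digest> to both FROM lines.
FROM node:20-alpine AS deps
WORKDIR /app
COPY package*.json ./
# Cache mount keeps npm's download cache across builds;
# --omit=dev leaves devDependencies out of the runtime image.
RUN --mount=type=cache,target=/root/.npm npm ci --omit=dev

FROM node:20-alpine AS runtime
WORKDIR /app
COPY --from=deps /app/node_modules ./node_modules
# Relies on .dockerignore to keep .git, .env*, and local node_modules out.
COPY . .
# Drop root before the process starts; the node user ships with the image.
USER node
CMD ["node", "server.js"]
```

Scanning happens outside the Dockerfile, for example by running trivy image against the built tag in CI.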

Tools in the wild

  • BuildKit (library, free tier): Modern Docker builder with parallel stages, cache mounts, and secret mounts.
  • dive (cli, free tier): TUI for inspecting image layers; pinpoints which COPY blew up your image size.
  • Distroless (library, free tier): Google's minimal runtime images, with no shell and no apt, just glibc + runtime.
  • Trivy (cli, free tier): Aqua's vulnerability scanner; runs in CI in seconds against any image.
  • buildx / bake (cli, free tier): Multi-platform, multi-target builds defined in HCL/JSON.