Secrets in CI

OIDC trust, secret stores, and the leaked-token playbook.

Every CI pipeline needs secrets. Cloud credentials, API keys, signing certificates, container registry passwords. How those secrets reach the runner — and what happens when one leaks — is the difference between "we have a bad day" and "we lose customer data".

The static-key trap

The pattern that built CI for two decades:

env:
  AWS_ACCESS_KEY_ID:     ${{ secrets.AWS_ACCESS_KEY_ID }}
  AWS_SECRET_ACCESS_KEY: ${{ secrets.AWS_SECRET_ACCESS_KEY }}

A static, long-lived IAM access key sits in the CI provider's secret store, and every build that needs it gets it. Anyone who can edit the workflow can run echo $AWS_SECRET_ACCESS_KEY | base64 to print it past log masking, or send it straight to a server they control.

Even without a malicious actor, static keys leak via:

A test that prints process.env on failure.
A docker build that runs env into a layer.
An accidental git push of a .env file.
A debug log from a third-party library.
Forking a private repo to a public one.
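The first item in the list above (a test dumping its environment on failure) can be blunted by redacting before printing. The following is a minimal sketch, not a complete defense; the function name redacted_env and the name patterns are hypothetical choices, and a secret stored under an innocuous variable name would slip through.

```python
import re

# Hypothetical name patterns; extend them for your own naming conventions.
SENSITIVE_NAME = re.compile(
    r"(SECRET|TOKEN|PASSWORD|PASSWD|API_KEY|ACCESS_KEY)", re.IGNORECASE
)


def redacted_env(env):
    """Return a copy of env with values of sensitive-looking names replaced."""
    return {
        name: "[REDACTED]" if SENSITIVE_NAME.search(name) else value
        for name, value in env.items()
    }


# Sample data for illustration only; none of these are real credentials.
sample = {
    "AWS_SECRET_ACCESS_KEY": "wJalr-example-only",
    "STRIPE_API_KEY": "sk_test_example",
    "PATH": "/usr/bin",
}
safe = redacted_env(sample)
print(safe)
```

In a test harness you would pass a copy of os.environ instead of the sample dictionary and log only the redacted result.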

GitGuardian found ~12 million secrets exposed in public GitHub commits in 2023 alone. Most were static API keys.

OIDC — the modern fix

GitHub Actions, GitLab CI, CircleCI, and Buildkite all support OIDC trust with major clouds. The flow:

The CI runner asks its CI provider for a short-lived signed JWT — an OIDC token.
The runner presents that JWT to AWS STS / GCP / Azure.
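The cloud verifies the JWT's signature against the public keys published at the CI provider's JWKS endpoint, checks the aud and sub claims against a trust policy, and returns short-lived credentials (on AWS, from 15 minutes up to the role's maximum session duration, which is capped at 12 hours).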
The job uses those temporary credentials.
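They expire on their own when that lifetime runs out, whether or not the job has finished, so keep the lifetime close to the job's real duration.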
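To make the flow concrete, it can help to look at what the token carries. The sketch below builds a sample token locally (with a placeholder signature) and decodes its payload using only the standard library. It deliberately skips signature verification, so it is for inspection only; the helper names are hypothetical, and a real OIDC token carries more claims than the three shown.

```python
import base64
import json


def b64url_decode(segment):
    """Decode a base64url segment, restoring the padding JWTs strip off."""
    padding = "=" * (-len(segment) % 4)
    return base64.urlsafe_b64decode(segment + padding)


def b64url_encode(raw):
    """Encode bytes as unpadded base64url, the form used inside a JWT."""
    return base64.urlsafe_b64encode(raw).rstrip(b"=").decode("ascii")


def unverified_claims(token):
    """Return the JWT payload WITHOUT checking the signature (inspection only)."""
    _header_b64, payload_b64, _signature = token.split(".")
    return json.loads(b64url_decode(payload_b64))


# Locally built sample token for illustration; the signature is a placeholder.
sample_claims = {
    "iss": "https://token.actions.githubusercontent.com",
    "aud": "sts.amazonaws.com",
    "sub": "repo:my-org/my-repo:ref:refs/heads/main",
}
sample_token = ".".join([
    b64url_encode(json.dumps({"alg": "RS256", "typ": "JWT"}).encode()),
    b64url_encode(json.dumps(sample_claims).encode()),
    "placeholder-signature",
])

claims = unverified_claims(sample_token)
print(claims["sub"])
```

The cloud side does the part this sketch leaves out: it verifies the signature first and only then compares aud and sub against the trust policy.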

The cloud's trust policy is the security boundary:

{
  "Version": "2012-10-17",
  "Statement": [{
    "Effect": "Allow",
    "Principal": { "Federated": "arn:...:oidc-provider/token.actions.githubusercontent.com" },
    "Action": "sts:AssumeRoleWithWebIdentity",
    "Condition": {
      "StringEquals": { "token.actions.githubusercontent.com:aud": "sts.amazonaws.com" },
      "StringLike":   { "token.actions.githubusercontent.com:sub": "repo:my-org/my-repo:ref:refs/heads/main" }
    }
  }]
}

The sub condition restricts WHO can assume the role:

repo:my-org/my-repo:ref:refs/heads/main — only main branch in my-org/my-repo.
repo:my-org/my-repo:environment:production — only jobs targeting the production environment.
repo:my-org/my-repo:pull_request — only pull-request runs.

Tighten this. A repo:my-org/* wildcard lets any workflow in any repository in the org, on any branch or pull-request run, assume the role.
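The difference between a tight and a loose sub condition can be checked offline. The sketch below approximates IAM's StringLike wildcard with Python's fnmatchcase, which is an assumption: both treat * as "any characters", but fnmatchcase also interprets [...] character classes, which IAM does not. The function name sub_allowed and the candidate sub values are illustrative.

```python
from fnmatch import fnmatchcase


def sub_allowed(pattern, sub):
    """Approximate an IAM StringLike test of the token's sub claim."""
    return fnmatchcase(sub, pattern)


tight = "repo:my-org/my-repo:ref:refs/heads/main"
loose = "repo:my-org/*"

# Sub claims that different workflow runs in the org might present.
candidates = [
    "repo:my-org/my-repo:ref:refs/heads/main",
    "repo:my-org/my-repo:ref:refs/heads/feature-x",
    "repo:my-org/my-repo:pull_request",
    "repo:my-org/other-repo:ref:refs/heads/main",
]

for sub in candidates:
    print(f"{sub:48} tight={sub_allowed(tight, sub)} loose={sub_allowed(loose, sub)}")
```

Only the first candidate passes the tight pattern, while all four pass the wildcard.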

Where static secrets still belong

Some secrets aren't IAM-shaped:

Database passwords for a hosted Postgres.
Third-party API keys (SendGrid, Stripe webhooks, OpenAI).
Signing keys for code-signing certs.
Old systems that don't speak OIDC.

These belong in a secret manager:

AWS Secrets Manager / SSM Parameter Store
GCP Secret Manager
Azure Key Vault
HashiCorp Vault
Doppler

Grant the CI role IAM access to fetch the secret at runtime. Avoid copying the secret value into the CI provider's secret store as a build env var: once it's there, every workflow with access to that store can read it, and rotating it is a manual chore.

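# The job also needs `permissions: id-token: write` to request the OIDC token.
- uses: aws-actions/configure-aws-credentials@v4
  with:
    role-to-assume: arn:aws:iam::123456789012:role/ci-build
    aws-region: us-east-1

- name: Fetch DB password
  run: |
    DB_PASSWORD=$(aws secretsmanager get-secret-value \
                   --secret-id prod/db --query SecretString --output text)
    # Register the value with the log masker in case a later command prints it.
    echo "::add-mask::$DB_PASSWORD"
    # Use $DB_PASSWORD within this step; shell variables don't carry over
    # to later steps. Don't echo it.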
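The same runtime fetch can be done from a build script instead of the shell. The sketch below assumes the secret is stored as a JSON object with username and password fields, which is an assumption about the prod/db secret rather than something the workflow above guarantees. The function name fetch_secret_fields is hypothetical, and FakeSecretsClient is a stand-in so the sketch runs without AWS access.

```python
import json


def fetch_secret_fields(client, secret_id):
    """Fetch a secret and parse its SecretString as JSON key/value pairs."""
    response = client.get_secret_value(SecretId=secret_id)
    return json.loads(response["SecretString"])


class FakeSecretsClient:
    """Offline stand-in for a boto3 Secrets Manager client (illustration only)."""

    def get_secret_value(self, SecretId):
        payload = {"username": "app", "password": "example-only"}
        return {"Name": SecretId, "SecretString": json.dumps(payload)}


fields = fetch_secret_fields(FakeSecretsClient(), "prod/db")
print(sorted(fields))  # print the field names only, never the values
```

In a real job you would pass boto3.client("secretsmanager") instead of the fake; it picks up the temporary credentials that the configure-aws-credentials step exported.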

Log masking — useful but not enough

GitHub Actions automatically masks values from secrets.* in build logs:

$ echo $MY_SECRET
***

This catches accidental prints. It does NOT catch:

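echo $MY_SECRET | base64: the base64 output is a different string, so it is unmasked.
cat /tmp/secret-file: unmasked whenever the file holds a value that was never registered as a secret, such as one fetched at runtime.
A request body that contains the secret as part of legitimate use.
Output that a subprocess writes to its own log file, which may later be uploaded as a build artifact.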

Treat masking as a backstop, not a guarantee. The real defense is: don't run third-party code on a secret-bearing runner unless you trust it.
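Why encoding defeats masking is easy to see with a toy model. The sketch below assumes masking works by replacing exact occurrences of each registered secret, which is a simplification of what a CI provider does; the mask function and the sample secret are hypothetical.

```python
import base64


def mask(log_line, secrets):
    """Replace each exact secret value with ***; a simplified model of log masking."""
    for secret in secrets:
        log_line = log_line.replace(secret, "***")
    return log_line


secret = "s3cr3t-example-value"  # sample value, not a real credential
encoded = base64.b64encode(secret.encode()).decode()

plain_line = mask(f"token={secret}", [secret])
encoded_line = mask(f"token={encoded}", [secret])

print(plain_line)    # the exact value is replaced
print(encoded_line)  # the encoded form passes through untouched
```

Anyone reading the second log line can reverse the encoding with one command, which is why masking only counts as a backstop.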

The leaked-token playbook

You've found a leaked credential. What now?

1. Rotate immediately. Revoke the existing credential. Create a new one. Until step 1 is done, every other step is wasted, because the leaked credential is still valid.
2. Audit usage. Pull CloudTrail (AWS) / Cloud Audit Logs (GCP) / Activity Log (Azure) for every action made under the leaked credential's identity from the moment of exposure to rotation. What was created? Read? Deleted? Modified?
3. Determine blast radius. Did the credential have access to customer data? Production? Just the build artifact bucket? Treat the worst case as the actual case unless you can prove otherwise.
4. Notify. If customer data was reachable, your incident response process should include legal/comms/customer notification. Don't decide solo; escalate.
5. Fix the root cause. How did it leak? Public repo? Logged response? Misconfigured fork? Whatever the root cause, fix it so the replacement credential can't leak the same way.
6. Post-mortem and broaden. Was this credential the only one with that exposure pattern? If you can leak one, you can leak ten. Find the others.

The single most important step is #1. Engineering instincts say "investigate first" — that's wrong here. Investigation is faster after rotation because the bleeding has stopped.
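The audit step amounts to filtering the audit log by identity and time window. The sketch below does that over CloudTrail-style records held in memory, using the eventTime, eventSource, eventName, and userIdentity.accessKeyId fields. The function names, the sample events, and the timestamps are all made up for illustration; the access key IDs are AWS's documented example values.

```python
from datetime import datetime, timezone


def parse_time(value):
    """Parse a CloudTrail-style UTC timestamp such as 2024-03-01T10:15:00Z."""
    return datetime.strptime(value, "%Y-%m-%dT%H:%M:%SZ").replace(tzinfo=timezone.utc)


def actions_by_key(events, access_key_id, exposed_at, rotated_at):
    """List (time, service, action) for one access key inside the exposure window."""
    hits = []
    for event in events:
        identity = event.get("userIdentity", {})
        when = parse_time(event["eventTime"])
        if identity.get("accessKeyId") == access_key_id and exposed_at <= when <= rotated_at:
            hits.append((event["eventTime"], event["eventSource"], event["eventName"]))
    return sorted(hits)


# Fabricated sample records for illustration only.
events = [
    {"eventTime": "2024-03-01T09:00:00Z", "eventSource": "s3.amazonaws.com",
     "eventName": "ListBuckets", "userIdentity": {"accessKeyId": "AKIAIOSFODNN7EXAMPLE"}},
    {"eventTime": "2024-03-01T10:15:00Z", "eventSource": "s3.amazonaws.com",
     "eventName": "GetObject", "userIdentity": {"accessKeyId": "AKIAIOSFODNN7EXAMPLE"}},
    {"eventTime": "2024-03-01T10:20:00Z", "eventSource": "iam.amazonaws.com",
     "eventName": "CreateUser", "userIdentity": {"accessKeyId": "AKIAI44QH8DHBEXAMPLE"}},
]
exposed_at = parse_time("2024-03-01T10:00:00Z")
rotated_at = parse_time("2024-03-01T11:00:00Z")

suspicious = actions_by_key(events, "AKIAIOSFODNN7EXAMPLE", exposed_at, rotated_at)
for row in suspicious:
    print(row)
```

If the moment of exposure is uncertain, widen the window rather than narrow it, in line with treating the worst case as the actual case.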

What to lint for

A pre-commit + CI scan should catch the recurring patterns:

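# .pre-commit-config.yaml
# Runs the locally installed trufflehog binary, which must be on PATH.
repos:
  - repo: local
    hooks:
      - id: trufflehog
        name: trufflehog
        # --fail makes trufflehog exit non-zero on a finding, so the commit is blocked.
        entry: trufflehog filesystem --only-verified --fail .
        language: system
        pass_filenames: false
        stages: [pre-commit]  # named `commit` before pre-commit 3.2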

GitGuardian, GitHub's secret scanning, GitLab's secret detection — pick one. The free options catch most things; the paid ones tune for your specific token formats.
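At their core these scanners match known token formats. The sketch below shows the idea for a single format, AWS access key IDs, which start with AKIA (long-lived) or ASIA (temporary) followed by 16 uppercase letters or digits. It is a toy with one detector and no verification step, so it is no substitute for the tools above; the function name scan_text is hypothetical and the sample key is AWS's documented example value.

```python
import re

# AWS access key IDs: a 4-character prefix plus 16 uppercase letters or digits.
AWS_ACCESS_KEY_ID = re.compile(r"\b(?:AKIA|ASIA)[0-9A-Z]{16}\b")


def scan_text(text):
    """Return (line_number, match) pairs for anything shaped like an AWS key ID."""
    findings = []
    for line_number, line in enumerate(text.splitlines(), start=1):
        for match in AWS_ACCESS_KEY_ID.finditer(line):
            findings.append((line_number, match.group(0)))
    return findings


# Sample file content for illustration only.
sample = "region=us-east-1\nAWS_ACCESS_KEY_ID=AKIAIOSFODNN7EXAMPLE\n"
findings = scan_text(sample)
print(findings)
```

Real scanners layer hundreds of such detectors and, in trufflehog's verified mode, test each candidate against the issuing service to cut false positives.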

Lifetime ladder

In rough order of preference:

OIDC-derived ephemeral credentials — minutes long, bound to a specific role/job.
Secrets fetched at runtime from a secret manager via IAM — secret rotates centrally.
CI provider's encrypted secret store — last resort for things you can't push to a manager.
.env.local on a developer's laptop — never in CI, never in git.
Plaintext in the repo — fired-on-the-spot territory.

The ladder is the playbook. Climb it where you can, and document why where you can't.

Summary

Stop using static cloud credentials in CI. Switch to OIDC. For non-cloud secrets, use a secret manager and grant runtime IAM access. Mask logs as a backstop, not a guarantee. When a token leaks, ROTATE first, investigate after.

Tools in the wild

HashiCorp Vault (service, free tier): Multi-cloud secret store with dynamic credentials, transit encryption, and identity-based ACLs.
AWS Secrets Manager (service): AWS-native secret store with KMS encryption and automatic rotation hooks.
GitGuardian (service, free tier): Scans repos and CI logs for accidentally committed secrets; integrates as a pre-receive hook.
trufflehog (CLI, free tier): Open-source secret scanner that runs locally over a repo or in CI.
Doppler (service): Centralised secret manager with CI integrations and per-env access controls.