Padding

PKCS#7, why a whole extra block, and the padding oracle.

Block cipher modes like CBC need input that's an exact multiple of the block size (16 bytes for AES). Real data isn't usually that neat. Padding fills the final block with extra bytes — and then tells the decryptor how many of those bytes to strip back off.

Analogy

When you pack a fragile gift into a standard-size shipping box, you fill the empty space with packing peanuts. But how does the recipient know which peanuts are filler and which are part of the present? PKCS#7 solves this with a clever trick: stamp every packing peanut with the exact number of peanuts in the box. Eleven peanuts? Each one says "11". Unwrap from the top, read the first peanut, scoop out that many peanuts, and the real gift is underneath. If the box was already full to the brim, you still add one extra layer stamped with "16" — otherwise the recipient can't tell where the gift ends and nonexistent padding begins.

PKCS#7 — the standard scheme

PKCS#7 (defined in RFC 5652) is the padding you'll meet in practice. The rule is disarmingly simple:

Fill the remainder of the final block with bytes whose value equals the number of padding bytes added.

If you need 5 padding bytes, you append 05 05 05 05 05. If you need 1, you append 01.

plaintext "HELLO" (5 bytes), block size 16
→ 48 45 4C 4C 4F  0B 0B 0B 0B 0B 0B 0B 0B 0B 0B 0B
                 └───────── 11 bytes of 0x0B ────────┘
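The rule fits in a few lines of Python. This is a minimal sketch for illustration, not a vetted implementation; the name pkcs7_pad and its block_size parameter are made up for this example.

```python
def pkcs7_pad(data: bytes, block_size: int = 16) -> bytes:
    """Append n bytes of value n so the result is a multiple of block_size."""
    if not 1 <= block_size <= 255:
        raise ValueError("the pad length must fit in a single byte")
    # n is always in 1..block_size, never 0.
    n = block_size - (len(data) % block_size)
    return data + bytes([n]) * n


padded = pkcs7_pad(b"HELLO")
print(padded.hex(" "))  # 48 45 4c 4c 4f, then eleven 0b bytes
```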

What if the input is already aligned?

You still add a full block of padding: 16 bytes of 0x10. Otherwise the unpadder can't tell a final block of 16 real bytes that happens to end in 0x01 from a block of 15 real bytes plus one byte of padding.

plaintext: 16 bytes, block size 16
→ [16 plaintext bytes] [10 10 10 10 10 10 10 10 10 10 10 10 10 10 10 10]

This looks wasteful but it makes unpadding unambiguous: read the last byte, that's always the padding length.
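To see the ambiguity concretely, compare PKCS#7 with a hypothetical "lazy" variant that skips padding when the input is already aligned. Both function names below are invented for this sketch.

```python
BLOCK = 16


def pad_pkcs7(data: bytes) -> bytes:
    # Real PKCS#7: always adds between 1 and BLOCK bytes.
    n = BLOCK - (len(data) % BLOCK)
    return data + bytes([n]) * n


def pad_lazy(data: bytes) -> bytes:
    # Hypothetical broken variant: adds nothing to aligned input.
    if len(data) % BLOCK == 0:
        return data
    return pad_pkcs7(data)


short = b"A" * 15               # 15 real bytes
aligned = b"A" * 15 + b"\x01"   # 16 real bytes, the last one is a real 0x01

# The lazy scheme maps two different messages to the same padded block,
# so no unpadder could tell them apart.
assert pad_lazy(short) == pad_lazy(aligned)

# PKCS#7 keeps them distinct by giving the aligned message an extra block.
assert pad_pkcs7(short) != pad_pkcs7(aligned)
assert pad_pkcs7(aligned)[-BLOCK:] == b"\x10" * BLOCK
```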

Unpadding

  1. Read the last byte → n.
  2. Check 1 ≤ n ≤ block_size.
  3. Check the last n bytes are all equal to n.
  4. If any check fails → invalid padding.
  5. Otherwise strip the last n bytes.
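The steps above can be sketched as follows. The name pkcs7_unpad is an assumption for this example, and the extra length check at the top is an addition that keeps the later indexing safe. Note that this version is not constant-time and raises a distinct, visible error, which is exactly the behaviour that becomes dangerous when exposed to an attacker.

```python
def pkcs7_unpad(padded: bytes, block_size: int = 16) -> bytes:
    """Validate and strip PKCS#7 padding; raise ValueError if it is malformed."""
    # Added sanity check: padded data is a non-empty multiple of the block size.
    if len(padded) == 0 or len(padded) % block_size != 0:
        raise ValueError("invalid padding")
    n = padded[-1]                       # step 1: last byte is the pad length
    if not 1 <= n <= block_size:         # step 2: range check
        raise ValueError("invalid padding")
    if padded[-n:] != bytes([n]) * n:    # step 3: all n pad bytes equal n
        raise ValueError("invalid padding")
    return padded[:-n]                   # step 5: strip the padding


print(pkcs7_unpad(b"HELLO" + b"\x0b" * 11))  # b'HELLO'
```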

The padding oracle — a security trap

Here's where it gets dangerous. Imagine a server that:

  1. Decrypts a user-supplied AES-CBC ciphertext.
  2. Checks PKCS#7 padding — returns error A if bad.
  3. Checks a MAC on the plaintext — returns error B if bad.

Two distinguishable errors tell the attacker whether the padding check passed. Vaudenay (2002) showed this is enough to recover the plaintext one byte at a time. The attacker doesn't need the key.

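The attack works because CBC is malleable: each plaintext block is the block-cipher decryption of its ciphertext block XORed with the previous ciphertext block, so changing a byte in one ciphertext block changes the same byte in the next block's plaintext. The attacker tampers with that previous block and asks the server "is this still a valid pad?" Each answer is a single yes/no bit. At most 256 queries per byte (about 128 on average) are enough to recover the whole plaintext. Variants have broken SSL/TLS (POODLE, Lucky Thirteen), ASP.NET cookies, and many custom protocols.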
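Here is a toy simulation of the byte-by-byte recovery. It is a sketch under a loud simplifying assumption: the block cipher is replaced by a fixed secret 16-byte value (the "intermediate", standing in for the decryption of the target ciphertext block), which is fair because the attacker only ever resubmits that same block. All names (make_oracle, recover_block) are hypothetical.

```python
import os

BLOCK = 16


def xor(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))


def valid_padding(block: bytes) -> bool:
    n = block[-1]
    return 1 <= n <= BLOCK and block[-n:] == bytes([n]) * n


def make_oracle(intermediate: bytes):
    """Toy server: reveals only whether the padding is valid."""
    def oracle(prev: bytes) -> bool:
        # CBC decryption of one block: D(C) XOR previous ciphertext block.
        return valid_padding(xor(intermediate, prev))
    return oracle


def recover_block(oracle, prev: bytes) -> bytes:
    inter = bytearray(BLOCK)  # recovered D(C), filled in from the right
    for pos in range(BLOCK - 1, -1, -1):
        pad = BLOCK - pos
        forged = bytearray(BLOCK)
        for j in range(pos + 1, BLOCK):
            forged[j] = inter[j] ^ pad  # known bytes now decrypt to `pad`
        for guess in range(256):
            forged[pos] = guess
            if not oracle(bytes(forged)):
                continue
            if pos == BLOCK - 1:
                # Rule out a lucky longer pad (e.g. 02 02) by disturbing
                # the neighbouring byte and asking again.
                forged[pos - 1] ^= 0xFF
                confirmed = oracle(bytes(forged))
                forged[pos - 1] ^= 0xFF
                if not confirmed:
                    continue
            inter[pos] = guess ^ pad
            break
    return xor(bytes(inter), prev)  # plaintext = D(C) XOR real previous block


secret = b"attack at dawn" + bytes([2, 2])  # 14 bytes plus PKCS#7 padding
prev = os.urandom(BLOCK)                    # the real previous ciphertext block
oracle = make_oracle(xor(secret, prev))     # server-side secret, hidden from attacker
assert recover_block(oracle, prev) == secret
```

The attacker never touches the key or the intermediate value directly; everything is learned from the oracle's yes/no answers.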

Three defences

  1. Use AEAD (AES-GCM). AEAD modes such as GCM don't pad: the ciphertext is as long as the plaintext plus a fixed-size authentication tag, and the tag is verified before any plaintext is released. With no padding to check, there is no padding oracle.
  2. Encrypt-then-MAC, in constant time. If you must use CBC, compute an HMAC over the IV and ciphertext using a separate key, and verify it first with a constant-time comparison. Only if the MAC passes do you decrypt.
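  3. Unify error responses. Never tell a client whether padding or MAC failed; return the same generic error, after a constant-time check, for both. On its own this is the weakest of the three, because timing differences can still leak the distinction, as Lucky Thirteen showed.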
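Defences 2 and 3 can be sketched together using only Python's standard hmac module. The function names and the cbc_decrypt callback are hypothetical; the callback stands in for whatever AES-CBC decrypt-and-unpad routine your library provides, and is assumed to raise ValueError on bad padding.

```python
import hashlib
import hmac


def make_tag(mac_key: bytes, iv: bytes, ciphertext: bytes) -> bytes:
    # Encrypt-then-MAC: the tag covers the IV and the ciphertext.
    return hmac.new(mac_key, iv + ciphertext, hashlib.sha256).digest()


def open_message(mac_key, iv, ciphertext, tag, cbc_decrypt) -> bytes:
    expected = make_tag(mac_key, iv, ciphertext)
    # Constant-time comparison; tampered input is rejected before decryption.
    if not hmac.compare_digest(expected, tag):
        raise ValueError("decryption failed")
    try:
        return cbc_decrypt(iv, ciphertext)  # hypothetical caller-supplied routine
    except ValueError:
        # Same generic error whether the MAC or the padding was at fault.
        raise ValueError("decryption failed") from None
```

Because a forged ciphertext never reaches the padding check, the server has nothing to leak. The MAC key should be independent of the encryption key.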

Modern crypto libraries (libsodium, Web Crypto's AES-GCM) remove this footgun by default. If you're reaching for AES-CBC with PKCS#7 in 2025, ask yourself whether AES-GCM would work instead. Usually it will.

Takeaways

  1. PKCS#7: last byte = padding length; every padding byte has that value.
  2. Already aligned? Add a whole extra block of padding.
  3. Padding oracles turn CBC into a plaintext leak. Prefer AEAD (GCM).