encoding · level 4

URL / Percent Encoding

Why `?q=hello%20world` looks the way it does.

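URLs were designed to be typed and printed, so they are limited to a small set of ASCII characters. Percent encoding is how everything else survives in a URL: each disallowed byte becomes %XX, where XX is that byte's value written as two hexadecimal digits. A space (byte 0x20) becomes %20, and a non-ASCII character is first encoded as UTF-8, then each resulting byte is percent-encoded.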
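The byte-by-byte rule can be sketched in a few lines of Python. This is a minimal illustration, not a replacement for a library encoder: `percent_encode` and `UNRESERVED` are hypothetical names chosen for this example, and the sketch assumes the text is encoded as UTF-8 and that only letters, digits, and `-_.~` pass through unchanged.

```python
# Characters assumed safe to leave as-is (letters, digits, and - _ . ~).
UNRESERVED = frozenset(
    "ABCDEFGHIJKLMNOPQRSTUVWXYZ"
    "abcdefghijklmnopqrstuvwxyz"
    "0123456789-_.~"
)

def percent_encode(text: str) -> str:
    """Replace every byte outside UNRESERVED with %XX (hypothetical helper)."""
    out = []
    for byte in text.encode("utf-8"):   # work on bytes, not characters
        ch = chr(byte)
        if ch in UNRESERVED:
            out.append(ch)              # safe byte: copy unchanged
        else:
            out.append(f"%{byte:02X}")  # two uppercase hex digits
    return "".join(out)

print(percent_encode("hello world"))  # hello%20world
print(percent_encode("café"))         # caf%C3%A9 (é is two UTF-8 bytes)
```

In real code, a library encoder such as Python's `urllib.parse.quote` is the safer choice; the loop above only shows what happens underneath.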

Analogy

Think of addressing a postcard. The postal system accepts letters, digits, and a few punctuation marks, but if you scribble a smiley face or a foreign script in the address line, the sorting machine chokes. So you spell the tricky bits out phonetically — "N as in November" — using only marks the machine understands. Percent encoding is that spelling-out: each forbidden byte gets replaced by a % followed by its code, so the whole address survives the sorter unaltered.

Reserved vs unreserved

  • Unreserved (always safe): A-Z, a-z, 0-9, -, _, ., ~
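  • Reserved (act as delimiters; encode them when they appear as literal data): :/?#[]@!$&'()*+,;=
  • Everything else (spaces, control characters, non-ASCII bytes): must always be encoded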
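The difference between a reserved character acting as a delimiter and the same character appearing as data shows up as soon as a value contains `&` or `=`. The sketch below uses Python's standard `urllib.parse`; the parameter name `q` and the sample value are assumptions made up for illustration.

```python
from urllib.parse import parse_qs, quote

value = "a&b=c"  # hypothetical user input containing reserved characters

# Unencoded: & and = act as delimiters, so the value splits into two parameters.
naive_query = "q=" + value
print(parse_qs(naive_query))   # {'q': ['a'], 'b': ['c']}

# Encoded: safe="" tells quote() to escape reserved characters as well.
safe_query = "q=" + quote(value, safe="")
print(safe_query)              # q=a%26b%3Dc
print(parse_qs(safe_query))    # {'q': ['a&b=c']}

# quote() leaves "/" alone by default, which suits paths joined by "/".
print(quote("docs/my file.txt"))           # docs/my%20file.txt
print(quote("docs/my file.txt", safe=""))  # docs%2Fmy%20file.txt
```

Whether `/` should be escaped depends on where the text goes: inside a single path segment or query value it is data, while between path segments it is a delimiter.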

Why this matters for crypto

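When you put a Base64 ciphertext into a URL, three Base64 characters (+, /, =) collide with reserved characters. Either percent-encode them, or use Base64url, a variant that replaces + with - and / with _ and usually drops the = padding. An unencoded + is especially risky: form-style query parsers decode it as a space, which silently corrupts the ciphertext.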
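Both options can be compared side by side with Python's standard library. In this sketch the two bytes in `raw` are a stand-in for real ciphertext, picked only because they produce all three troublesome characters; stripping and restoring the `=` padding is a common convention, assumed here rather than required.

```python
import base64
from urllib.parse import quote

raw = bytes([0xFB, 0xFF])  # stand-in for ciphertext bytes (assumption)

# Option 1: standard Base64, then percent-encode the result.
standard = base64.b64encode(raw).decode("ascii")
print(standard)                  # +/8=
print(quote(standard, safe=""))  # %2B%2F8%3D

# Option 2: Base64url, which needs no percent-encoding once padding is dropped.
url_safe = base64.urlsafe_b64encode(raw).decode("ascii").rstrip("=")
print(url_safe)                  # -_8

# Decoding: restore the padding to a multiple of 4 characters first.
padded = url_safe + "=" * (-len(url_safe) % 4)
assert base64.urlsafe_b64decode(padded) == raw
```

Base64url output is shorter and reads the same before and after URL parsing, which is why it is often preferred for tokens and ciphertexts in query strings.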

Tools in the wild

5 tools
  • Browser + Node WHATWG URL parser; `searchParams.set()` percent-encodes correctly.

    library
  • `quote`, `quote_plus`, `urlencode` (in `urllib.parse`): the canonical Python URL encoders.

    library
  • Encodes form data correctly from the CLI; safer than hand-rolling query strings.

    cli
  • API client that auto-encodes path + query params and shows the resolved URL.

    service
  • RFC 3986, URI generic syntax: defines reserved/unreserved chars and percent-encoding rules.

    spec
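Of the tools listed, the three Python encoders are the easiest to try side by side. The short sketch below shows how they treat a space differently; the parameter name `q` is taken from the example at the top of this lesson, and everything else is standard `urllib.parse`.

```python
from urllib.parse import quote, quote_plus, urlencode

# quote() writes a space as %20.
print(quote("hello world"))        # hello%20world

# quote_plus() uses the form-encoding convention: space becomes +.
print(quote_plus("hello world"))   # hello+world

# urlencode() builds a whole query string; it uses quote_plus by default.
print(urlencode({"q": "hello world"}))                   # q=hello+world

# Passing quote_via=quote switches it to %20 for spaces.
print(urlencode({"q": "hello world"}, quote_via=quote))  # q=hello%20world
```

Both `%20` and `+` decode to a space in a query string under form-style parsing, but only `%20` means a space in a path, so `quote` is the safer default outside query strings.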