ASCII & Unicode
How text becomes numbers.
When you type the letter A on your keyboard, the computer doesn't store a letter — it stores a number. That number is the codepoint for A: 65. When you type B, it stores 66. This mapping from characters to numbers is called a character encoding.
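A quick way to see this mapping (a minimal sketch in Python; `ord` and `chr` are standard built-ins):

```python
# ord() gives the codepoint for a character; chr() reverses it.
print(ord("A"))   # 65
print(ord("B"))   # 66
print(chr(65))    # 'A'
```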
Analogy
Think of a library's card catalog. Every book on the shelves has a call number — 813.54 MOR — and the librarian never moves books by their title, only by that number. The title is for humans; the number is for the shelving system. A character encoding is the same idea applied to writing: every letter, digit, and symbol is assigned a unique number that the computer uses internally, and the "A" you see on screen is just a label on the shelf.
ASCII — the first 128
The foundational encoding is ASCII (American Standard Code for Information Interchange), which covers the basic Latin alphabet, digits, punctuation, and control characters. It fits in 7 bits, so every ASCII codepoint is a number from 0 to 127.
| Character | Codepoint | Hex |
|---|---|---|
| A | 65 | 0x41 |
| a | 97 | 0x61 |
| 0 | 48 | 0x30 |
| ! | 33 | 0x21 |
| space | 32 | 0x20 |
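The same rows can be checked directly (a Python sketch; the character list simply mirrors the table above):

```python
# Print each character with its decimal codepoint and hex form,
# matching the table above.
for ch in ["A", "a", "0", "!", " "]:
    print(repr(ch), ord(ch), hex(ord(ch)))
# 'A' 65 0x41
# 'a' 97 0x61
# '0' 48 0x30
# '!' 33 0x21
# ' ' 32 0x20
```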
Unicode — every character on Earth
ASCII only covers basic English text. Unicode extends it to every writing system, every emoji, every symbol — more than a million possible codepoints, of which roughly 150,000 are assigned so far.
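The same `ord` and `chr` calls work for any Unicode character, not just ASCII (a Python sketch; these particular characters are just illustrative):

```python
# Codepoints beyond ASCII: an accented letter, a CJK character, an emoji.
print(ord("é"), hex(ord("é")))     # 233 0xe9
print(ord("中"), hex(ord("中")))    # 20013 0x4e2d
print(ord("😀"), hex(ord("😀")))    # 128512 0x1f600
print(chr(0x1F600))                 # '😀'
```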
UTF-8 — how Unicode becomes bytes
Computers store bytes, not codepoints. UTF-8 is the dominant encoding that packs Unicode codepoints into bytes: ASCII characters fit in 1 byte, accented Latin letters and most European alphabets in 2, most other scripts in 3, and emoji in 4.
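You can see the 1/2/3/4-byte pattern by encoding a few characters (a Python sketch; `str.encode` uses UTF-8 by default):

```python
# UTF-8 byte lengths grow with the codepoint.
for ch in ["A", "é", "中", "😀"]:
    encoded = ch.encode("utf-8")
    print(repr(ch), len(encoded), encoded)
# 'A' 1 b'A'
# 'é' 2 b'\xc3\xa9'
# '中' 3 b'\xe4\xb8\xad'
# '😀' 4 b'\xf0\x9f\x98\x80'
```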
This is not encryption
Encoding a character as a number isn't a secret. Anyone with the table can reverse it. Keep this in mind as you go — encoding transforms data into a different representation; encryption transforms data so that only the key holder can read it.
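A small illustration of the difference (a Python sketch): decoding UTF-8 requires no key, only the public codepoint table.

```python
# Anyone can reverse an encoding -- no secret required.
data = "héllo".encode("utf-8")
print(data)                   # b'h\xc3\xa9llo'
print(data.decode("utf-8"))   # 'héllo' -- recovered without any key
```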