Storage Tiers
Object vs block vs file — pick the right shape.
The cloud gives you three fundamentally different storage shapes. Picking the wrong one costs you either money or correctness — sometimes both. There is no universal "best"; the shape of the workload decides.
Analogy
Think of three ways to store your belongings at home. Block storage is the top drawer of your desk — instantly reachable, you can grab a pen mid-sentence, but only you have a key and the drawer only fits so much. File storage is the shared filing cabinet in a small office — several colleagues can open it at once and browse folders, but everyone is reaching into the same cabinet so it slows down when the room is crowded. Object storage is a self-storage warehouse across town — effectively infinite, cheap per square foot, and each unit has a unique address, but you don't pop in to grab a sock; you drive out with a list.
The three shapes
| Shape | What it is | Example services | Access pattern |
|---|---|---|---|
| Object | A flat keyspace of immutable blobs | S3, GCS, Azure Blob | Whole-object PUT/GET over HTTPS |
| Block | A raw attached disk of fixed-size blocks | EBS, Persistent Disk, Azure Managed Disk | Read/write arbitrary offsets, one mount |
| File | A shared POSIX filesystem | EFS, Filestore, Azure Files | Many clients read/write concurrently |
Pick object when the unit of work is a whole file. Pick block when you need a filesystem on one VM. Pick file when many VMs need the same mount.
Object storage
The S3 mental model: a giant key-value store where keys look like paths and values are blobs up to 5 TB. Keys have no directory hierarchy — the `/` is just another character.
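The flat keyspace is easy to see in miniature. Below is a toy dict-backed model (the bucket contents and helper names are invented for illustration): "directories" are just prefix filters over flat keys, and the delimiter-style grouping that consoles show is computed at list time.

```python
# Toy model of an object store's flat keyspace: keys map to blobs,
# and "/" is an ordinary character with no directory semantics.
bucket = {
    "logs/2024/01/app.log": b"...",
    "logs/2024/02/app.log": b"...",
    "media/cat.jpg": b"...",
}

def list_keys(store, prefix=""):
    """'Directories' are just a prefix filter over the flat keyspace."""
    return sorted(k for k in store if k.startswith(prefix))

def common_prefixes(store, prefix="", delimiter="/"):
    """Emulate delimiter-based listing: group keys by the next segment."""
    groups = set()
    for k in list_keys(store, prefix):
        rest = k[len(prefix):]
        if delimiter in rest:
            groups.add(prefix + rest.split(delimiter, 1)[0] + delimiter)
    return sorted(groups)

print(list_keys(bucket, "logs/2024/"))  # both log keys match the prefix
print(common_prefixes(bucket))          # top-level "folders" are derived, not stored
```

There is no `mkdir` in this model: a "folder" exists exactly as long as some key starts with its prefix.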
- Cheapest per GB. Standard tier is ~$0.023/GB-month; Glacier Deep Archive is ~$0.00099/GB-month (restores take hours).
- Strongly consistent since December 2020: a GET after a PUT returns the new object, a GET after a DELETE reflects the deletion, and list operations reflect recent writes.
- High tail latency. P99 first-byte latency is in the hundreds of milliseconds, so it is unsuitable for random low-latency I/O.
- Scales horizontally with no provisioning. Throughput per prefix is capped (roughly 3,500 writes and 5,500 reads per second); for extreme throughput, spread writes across prefixes.
Workloads that fit: logs, backups, media, data lakes, static website assets, ML training data.
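One common way to spread writes across prefixes is to derive a short shard from a hash of the key and prepend it. A minimal sketch, assuming an MD5-derived shard and a fanout of 16 (the layout and fanout are arbitrary choices for illustration, not an S3 requirement):

```python
import hashlib

def spread_key(key: str, fanout: int = 16) -> str:
    """Prepend a hash-derived shard so hot keys land under different prefixes.

    Deterministic: the same key always maps to the same shard, so readers
    can recompute the full key without a lookup table.
    """
    digest = hashlib.md5(key.encode()).hexdigest()
    shard = int(digest[:2], 16) % fanout
    return f"{shard:02x}/{key}"

print(spread_key("logs/2024/01/app.log"))
```

The trade-off is that delimiter-style listing by date no longer works directly; you list each shard prefix and merge.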
Block storage
EBS gives you a raw disk you attach to one EC2 instance. The instance owns the filesystem (ext4, XFS, NTFS) on top. You get consistent low-latency IOPS.
- Volume types matter. gp3 is the default: 3,000 IOPS and 125 MB/s baseline, provisionable higher (up to 16,000 IOPS and 1,000 MB/s) independently of volume size. io2 Block Express goes up to 256,000 IOPS. st1/sc1 are throughput-optimised HDDs for cold data.
- One instance at a time. Multi-Attach exists for io2 but is a specialist tool.
- Snapshots are incremental and stored in S3. Restore creates a fresh volume, which lazy-loads blocks on first read.
- Root volumes for EC2 and Kubernetes node groups are block.
Workloads that fit: database data + WAL, single-instance filesystems, root volumes, cache tiers that need consistent p99 latency.
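The incremental-snapshot idea (each snapshot stores only the blocks changed since the previous one, and a restore replays the chain) can be sketched with a toy in-memory volume; the class and method names here are invented for illustration:

```python
class Volume:
    """Toy block volume: block index -> bytes, with incremental snapshots."""

    def __init__(self):
        self.blocks = {}     # live state of the volume
        self.snapshots = []  # each entry: blocks changed since the previous snapshot
        self._dirty = {}     # writes since the last snapshot

    def write(self, idx, data):
        self.blocks[idx] = data
        self._dirty[idx] = data

    def snapshot(self):
        """Persist only the delta since the last snapshot."""
        self.snapshots.append(dict(self._dirty))
        self._dirty = {}

    def restore(self, snap_index):
        """Rebuild volume state by replaying deltas up to snap_index."""
        state = {}
        for delta in self.snapshots[:snap_index + 1]:
            state.update(delta)
        return state

v = Volume()
v.write(0, b"aaaa")
v.write(1, b"bbbb")
v.snapshot()            # first snapshot: blocks 0 and 1
v.write(1, b"BBBB")
v.snapshot()            # second snapshot: only the changed block 1
```

The real service also lazy-loads blocks from S3 on first read after a restore; this sketch only shows the delta bookkeeping.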
File storage
EFS gives you a POSIX filesystem you can mount from many clients at once over NFS. The filesystem scales automatically; you pay for storage + throughput.
- Concurrent access is the whole point. Fifty Lambdas, an auto-scaling group, and a Jenkins runner can all mount the same directory.
- More expensive than S3, cheaper than EBS. ~$0.30/GB-month Standard; Infrequent Access tier is ~$0.016/GB-month.
- Throughput modes: bursting (based on size), provisioned (fixed), elastic (scales automatically, highest cost).
Workloads that fit: shared content roots, CI/CD build caches shared across runners, lift-and-shift legacy apps that expect NFS, ML training datasets accessed by many jobs.
Pricing deltas that matter
- Object is cheapest per GB and scales to petabytes. Cost dominated by storage + request charges + egress.
- Block is the most expensive per GB but has no per-request fee after provisioning.
- File sits in the middle and bills extra for throughput.
The trap: treating EBS like a CDN. Serving static assets from EBS-backed EC2 is ~30× more expensive than S3+CloudFront at scale. Also: egress to the internet is the dominant cost line for many workloads.
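A back-of-envelope calculation makes the egress point concrete. The figures below are illustrative assumptions in the spirit of the prices above (the gp3 storage rate and the per-GB egress rate are assumed for the sketch, not quoted):

```python
# Illustrative per-GB-month and per-GB prices (assumptions, not a quote).
S3_STANDARD = 0.023  # $/GB-month, matches the figure in the object section
GP3 = 0.08           # $/GB-month, assumed gp3 rate
EGRESS = 0.09        # $/GB to the internet, assumed

assets_gb = 500      # static assets stored
egress_gb = 10_000   # 10 TB served to the internet per month

s3_store = assets_gb * S3_STANDARD
ebs_store = assets_gb * GP3
egress = egress_gb * EGRESS

print(f"S3 storage:  ${s3_store:,.2f}/month")
print(f"EBS storage: ${ebs_store:,.2f}/month")
print(f"Egress:      ${egress:,.2f}/month")
```

Even before adding the EC2 instances needed to serve from EBS, the egress line dwarfs both storage lines, which is why a CDN in front of object storage is the usual answer.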
Consistency models at a glance
- S3: strong read-after-write for all operations, including list, since December 2020.
- EBS: strong (it's a single attached disk).
- EFS: close-to-open consistency like NFSv4 — a writer's changes become visible to other clients after it closes the file.
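Close-to-open semantics can be modelled in a few lines: a writer's buffer is private until `close()` publishes it, and a reader sees whatever was last published when it opens the file. A toy sketch (all class and method names are invented):

```python
class CloseToOpenFS:
    """Toy model of NFS-style close-to-open consistency."""

    def __init__(self):
        self.published = {}  # path -> last contents published by a close()

    def open_for_write(self, path):
        return Writer(self, path)

    def read(self, path):
        # A fresh open sees only what has been published, never in-flight writes.
        return self.published.get(path)

class Writer:
    def __init__(self, fs, path):
        self.fs, self.path, self.buf = fs, path, None

    def write(self, data):
        self.buf = data  # visible only to this client until close

    def close(self):
        self.fs.published[self.path] = self.buf  # publish on close

fs = CloseToOpenFS()
w = fs.open_for_write("/shared/config")
w.write(b"v2")
before_close = fs.read("/shared/config")  # other clients still see nothing
w.close()
after_close = fs.read("/shared/config")   # now the write is visible
```

This is why two processes appending to the same EFS file without locking can surprise you: consistency is per open-close session, not per write.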
The decision tree
- Many writers need the same mount → file.
- One writer, need a filesystem → block.
- Unit of work is a whole file or blob → object.
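The decision tree above can be written as a function, with the two questions as boolean parameters. A sketch, not a policy engine; the names are invented:

```python
def pick_tier(shared_mount: bool, needs_filesystem: bool) -> str:
    """Map the two decision-tree questions to a storage shape."""
    if shared_mount:
        return "file"    # many clients on one mount -> EFS-style
    if needs_filesystem:
        return "block"   # one VM with a POSIX filesystem -> EBS-style
    return "object"      # unit of work is a whole blob -> S3-style

print(pick_tier(shared_mount=False, needs_filesystem=False))  # object
```

Note the ordering matters: a shared mount implies a filesystem, so that question is asked first.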
Start from the workload, not the service. If you find yourself asking "how do I mount S3 as a filesystem?" you probably picked the wrong tier for the job.