datatypes · level 2

Collections

Array, list, map, set, tuple — when to reach for each.

200 XP

Collections

Picking the right collection is usually the difference between a system that handles 10 million records and one that grinds to a halt at 10,000. The defaults in your language matter: a JS Array, a Python list, a Java ArrayList all behave differently than their names suggest.

Analogy

Think of how you store physical stuff in a garage. A long shelf of numbered bins (array) is perfect if you always know bin 47 holds the paintbrushes, but inserting a new bin between 4 and 5 means shuffling everything along. A chain of linked crates (linked list) lets you splice a new crate in instantly, but finding crate 47 means walking from crate 1. A pegboard with labelled hooks (hash map) lets you hang the paintbrushes on the "brushes" hook and grab them in one glance — you just have to accept the hooks are in no particular order. Same garage, same objects, wildly different time to find your screwdriver.

Array / list

An array is a contiguous block of memory indexed by integer position. Random access (arr[i]) is O(1). Append is O(1) amortised — the implementation over-allocates and only occasionally copies to a bigger buffer.

What trips people up:

  • Prepend is O(n). arr.unshift(x) in JS or list.insert(0, x) in Python shifts every other element right by one.
  • Membership is O(n). x in [1,2,3] does a linear scan.
  • Removal by index past the middle is also O(n) for the same shift reason.

Use an array when you need ordered data, random access, and mostly-append workloads.

Map / dict

A hash map gives O(1) average-case lookup, insert, and delete. The worst case is O(n) under pathological collisions — real implementations use random hashing to avoid that in practice.

  • JS: Map or plain objects {}. Use Map when keys aren't strings.
  • Python: dict
  • Java: HashMap / ConcurrentHashMap
  • Go: map[K]V

When you have an ID and need to look up the corresponding record, use a map. Don't scan an array.

Gotcha: in JavaScript, a plain object {} has inherited keys from Object.prototype (__proto__, toString, etc.). Use Object.create(null) or a Map when the keys are untrusted input.

Set

A set is a map with only keys — O(1) membership, O(1) insertion, no duplicates. Use it for:

  • Dedupe: new Set([1, 1, 2, 3]){1, 2, 3}
  • Presence tracking: "have I seen this email before?"
  • Mathematical set operations: union, intersection, difference.

Tuple

A tuple is a fixed-size, often heterogeneous record. In Python, tuples are immutable: point = (3.0, 4.0). In TypeScript, you type them as [number, number]. Use tuples when you have exactly N values of known-but-mixed types.

Tuples are also the standard way to return multiple values from a function.

Linked list

A linked list has O(1) insertion and deletion anywhere (if you have a pointer to the node), but O(n) random access. Its niche is:

  • Queues/deques where you add at one end and remove from the other.
  • LRU caches where you splice nodes around.

In most modern languages you rarely use a linked list directly — Java has LinkedList, Python has collections.deque. JavaScript has no built-in; Array with push/shift is good enough for most use cases (though shift is O(n)).

The right-tool matrix

Operation array linked-list map set
append O(1)* O(1) O(1)
prepend O(n) O(1)
random-access by index O(1) O(n)
lookup by key O(1)
contains O(n) O(n) O(1) O(1)
remove by value O(n) O(n) O(1) O(1)

* amortised.

The decision

Start with: "what's the most common operation?"

  • Reading by index → array
  • Looking up by key → map
  • Checking "have I seen this?" → set
  • Returning a small fixed record → tuple
  • Inserting at both ends → deque / linked-list

If you guess wrong and benchmarks show a hot loop doing O(n) work, switching the collection type is usually a one-line change.