programming · level 4

Collections

Lists, dicts, sets, tuples — and which to pick.

125 XP

Collections

Most real programs spend most of their time moving data around in collections. Picking the right one for the job is half the battle.

The four core shapes

Almost every language ships some flavour of these four:

Shape Lookup by Ordered? Duplicates?
List / Array integer index yes yes
Map / Dict arbitrary key usually no keys unique
Set membership usually no no
Tuple position yes yes (fixed size)

Names vary by language: Python has list, dict, set, tuple. JavaScript has Array, Map (and the loose-typed plain Object), Set. Go has slices, maps, and no built-in set. The shapes are the same.

Lists

Ordered, indexable, growable. The default container in most languages.

xs = [10, 20, 30]
xs[0]              # 10
xs[-1]             # 30 (last element, Python-only convention)
xs.append(40)      # [10, 20, 30, 40]
xs[1:3]            # [20, 30] — slice
len(xs)            # 4

Lookup by index is O(1). Searching for a value (xs.index(20)) is O(n) — you have to scan.

Dicts / Maps

Lookup by key, key can be (almost) anything hashable.

user = {"name": "Alice", "age": 30}
user["name"]                    # "Alice"
user["email"] = "alice@x.com"   # add or overwrite
"age" in user                   # True
user.get("missing", "default")  # safe lookup with fallback
del user["age"]                 # remove

Lookup, insertion, and deletion are all average O(1) thanks to hashing. The catch: order isn't guaranteed in older language versions (Python ≥3.7 keeps insertion order; Go does NOT).

Sets

Like a dict where you only care about the keys. Use when you need fast "is this in the collection?" checks or want to dedupe.

seen = set()
for x in stream:
    if x in seen:
        continue
    seen.add(x)
    process(x)

Membership is O(1) average. Beats if x in some_list (O(n)) for any meaningful collection size.

Tuples

Fixed-size, ordered, immutable. Use when you have a small handful of related values that don't deserve a class:

def divmod(a, b):
    return a // b, a % b           # tuple

origin = (0, 0)
x, y = origin                      # destructure

In statically-typed languages, tuples are usually heterogeneous: Tuple<int, string> is fine; a List<int> is not.

Iterating

Every collection supports iteration:

for x in xs: ...                   # list
for k, v in user.items(): ...      # dict
for x in seen: ...                 # set

A list comprehension builds a new list from another iterable in one expression:

squares = [x * x for x in range(10)]
evens = [x for x in xs if x % 2 == 0]

JavaScript spells it differently with .map() / .filter():

const squares = Array.from({length: 10}, (_, x) => x * x);
const evens = xs.filter(x => x % 2 === 0);

Choosing the right shape

Three rules of thumb:

  1. Need lookup by integer position? List.
  2. Need lookup by key? Dict.
  3. Need fast membership / dedup? Set.

If you're doing if x in some_list repeatedly inside a loop, you've probably built a hidden O(n²). Convert to a set first.

Mutation gotchas

In most languages, lists, dicts, and sets are passed by reference. Mutating one inside a function affects the caller:

def add_admin(users):
    users.append("admin")          # mutates the caller's list

xs = ["alice", "bob"]
add_admin(xs)
xs                                 # ["alice", "bob", "admin"] — surprising!

If you want a fresh copy, ask explicitly: users.copy() or list(users). Tuples and strings are immutable, so this isn't an issue for them.