python · level 8

Iterators & Generators

Lazy sequences, yield, and the itertools toolbox.

Iteration in Python is a protocol, not a syntax. Anything that implements it can be used in a for loop, comprehension, list(), sum(), any(), etc.

The protocol

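An iterable is anything with an __iter__ method that returns an iterator. An iterator is anything with a __next__ method that returns the next value on each call and raises StopIteration when none remain. Iterators also define __iter__, returning themselves, so an iterator can be used wherever an iterable is expected.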
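
To make the protocol concrete, here is a rough sketch of what a for loop does behind the scenes. The helper name manual_for is hypothetical and exists only for illustration; it is not how the interpreter is implemented.

```python
def manual_for(iterable):
    """Collect items the way a for loop would (hypothetical helper)."""
    it = iter(iterable)          # calls iterable.__iter__()
    items = []
    while True:
        try:
            item = next(it)      # calls it.__next__()
        except StopIteration:    # the iterator signals that it is done
            break
        items.append(item)
    return items

print(manual_for([10, 20, 30]))   # [10, 20, 30]
print(manual_for("ab"))           # ['a', 'b']
```

Any object that supports these two calls works here. A hand-written class can implement the same two methods directly: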

class Counter:
    def __init__(self, n):
        self.n = n
    def __iter__(self):
        self.i = 0
        return self
    def __next__(self):
        if self.i >= self.n:
            raise StopIteration
        self.i += 1
        return self.i

for x in Counter(3):
    print(x)        # 1 2 3

That's verbose. Generators are the shortcut.

Generators

A function with yield in it is a generator function. Calling it doesn't run the body — it returns a generator object. The body runs lazily, one yield at a time:

def counter(n):
    i = 0
    while i < n:
        i += 1
        yield i

for x in counter(3):
    print(x)        # 1 2 3

This produces the same values as the verbose Counter class, in five lines instead of eleven.
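
The lazy execution described above can be observed directly. A small sketch, assuming a hypothetical noisy_counter generator that records each step in a list named log:

```python
log = []

def noisy_counter(n):
    log.append("started")
    for i in range(1, n + 1):
        log.append(f"yielding {i}")
        yield i
    log.append("finished")

gen = noisy_counter(2)
print(log)              # []  (calling the function ran none of the body)

first = next(gen)       # runs the body up to the first yield
print(first, log)       # 1 ['started', 'yielding 1']

rest = list(gen)        # drains the remaining values
print(rest, log)        # [2] ['started', 'yielding 1', 'yielding 2', 'finished']
```

Each next() call resumes the body where it last paused, which is why no work happens until a value is requested.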

Why bother

Three reasons generators are worth knowing:

  1. Memory: they produce one value at a time. A generator over a billion items uses constant memory; a list does not.
  2. Composition: generators chain naturally. Build pipelines without intermediate lists.
  3. Infinite streams: you can iterate things that don't end (a counter, polling a queue, a random sequence).

def numbers():
    n = 0
    while True:
        yield n
        n += 1

# Iterate with care — this is infinite.
for x in numbers():
    if x > 10:
        break
    print(x)
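
The composition point from the list above can be sketched as a small pipeline. The stage names read_numbers, only_even, and squared are hypothetical; each stage pulls one item at a time from the stage before it, so no intermediate list is built.

```python
def read_numbers(lines):
    """Parse non-blank lines into ints."""
    for line in lines:
        line = line.strip()
        if line:
            yield int(line)

def only_even(numbers):
    """Pass through even numbers only."""
    for n in numbers:
        if n % 2 == 0:
            yield n

def squared(numbers):
    """Square each number."""
    for n in numbers:
        yield n * n

raw = ["1", "2", "", " 3 ", "4\n"]
pipeline = squared(only_even(read_numbers(raw)))   # nothing has run yet
print(list(pipeline))                              # [4, 16]
```

The same stages would work unchanged on an open file or any other iterable of strings.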

Generator expressions

Like list comprehensions but with () instead of []. Lazy:

total = sum(x * x for x in big_iterable)        # no intermediate list
nonzero = (x for x in stream if x != 0)         # filter, lazily

sum, any, all, max, min all accept iterables. Pair them with generator expressions to avoid materialising lists.
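
One consequence worth seeing: any() and all() stop as soon as the answer is known, so a lazy source is only consumed as far as needed. A minimal sketch, where is_big and checked are hypothetical names used to record which values were examined:

```python
from itertools import count

checked = []

def is_big(n):
    checked.append(n)       # record every value that gets examined
    return n > 3

# count() is infinite, but any() stops at the first True.
found = any(is_big(n) for n in count())
print(found)      # True
print(checked)    # [0, 1, 2, 3, 4]
```

With a list comprehension in place of the generator expression, this call would never return, because the list would have to be built in full first.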

itertools

The standard-library itertools module is a goldmine of iterator combinators:

from itertools import (
    chain, islice, cycle, count, repeat,
    takewhile, dropwhile, groupby, accumulate,
    product, permutations, combinations,
)

# First 10 squares.
list(islice((x*x for x in count()), 10))   # [0, 1, 4, 9, 16, 25, 36, 49, 64, 81]

# Accumulate running totals.
list(accumulate([1, 2, 3, 4]))           # [1, 3, 6, 10]

# Group consecutive equal values.
[(k, list(g)) for k, g in groupby("aaabbc")]
# [('a', ['a','a','a']), ('b', ['b','b']), ('c', ['c'])]

If you find yourself writing a loop with manual state, check itertools first — there's often a one-liner.
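
As an illustration of that advice, here is a running maximum written twice: once with manual state and once with accumulate, which accepts a two-argument function as its second parameter. The readings data and the running_max_manual helper are made up for the example.

```python
from itertools import accumulate

readings = [3, 1, 4, 1, 5, 9, 2]

def running_max_manual(values):
    """Loop with hand-maintained state."""
    result = []
    best = None
    for v in values:
        if best is None or v > best:
            best = v
        result.append(best)
    return result

# Same result: accumulate applies max to the running value and each new item.
running_max = list(accumulate(readings, max))

print(running_max_manual(readings))   # [3, 3, 4, 4, 5, 9, 9]
print(running_max)                    # [3, 3, 4, 4, 5, 9, 9]
```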

Sending values into a generator

Generators can also receive values via .send(). Rare but powerful — coroutines were built on this before async/await landed:

def echo():
    while True:
        msg = yield
        print(f"got: {msg}")

g = echo()
next(g)                  # prime
g.send("hello")          # got: hello
g.send("world")          # got: world
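
send() also returns whatever the generator yields next, so values can flow in both directions. A minimal sketch, assuming a hypothetical running_average generator:

```python
def running_average():
    total = 0.0
    count = 0
    average = None
    while True:
        # send(value) resumes here; the next yielded average is send()'s result.
        value = yield average
        total += value
        count += 1
        average = total / count

avg = running_average()
next(avg)               # prime: run to the first yield (which yields None)
print(avg.send(10))     # 10.0
print(avg.send(20))     # 15.0
print(avg.send(60))     # 30.0
avg.close()             # stop the generator when finished
```

The priming next() call is required because a value can only be sent to a generator that is already paused at a yield.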

yield from

Delegate to another iterable:

def chained():
    yield from range(3)
    yield from "abc"

list(chained())          # [0, 1, 2, 'a', 'b', 'c']

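This replaces a for x in inner: yield x loop. When the inner iterable is itself a generator, yield from also forwards send() and throw() to it and evaluates to the inner generator's return value.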
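
A sketch of that forwarding, using two hypothetical generators: collector gathers sent values until it receives None, and outer delegates to it with yield from and stores each returned batch.

```python
def collector():
    """Inner generator: gather sent values until None arrives, then return them."""
    items = []
    while True:
        value = yield
        if value is None:
            return items
        items.append(value)

def outer(results):
    """Delegate to collector repeatedly, saving each returned batch."""
    while True:
        batch = yield from collector()   # send() calls pass straight through
        results.append(batch)

results = []
g = outer(results)
next(g)                 # prime: runs into collector's first yield
g.send("a")
g.send("b")
g.send(None)            # ends the first collector; its return value becomes batch
print(results)          # [['a', 'b']]
g.send("c")
g.send(None)
print(results)          # [['a', 'b'], ['c']]
```

A plain for loop over collector() could not do this, since it would neither pass sent values through nor capture the return value.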

When NOT to use a generator

If you're going to iterate the result more than once, use a list. Generators are single-use:

g = (x*x for x in range(5))
sum(g)                   # 30
sum(g)                   # 0 — generator exhausted!
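
If a list would be too large but the values still need to be walked more than once, one possible alternative is an iterable class whose __iter__ is a generator function, so each loop gets a fresh generator. A minimal sketch; the Squares class is hypothetical:

```python
class Squares:
    """Re-iterable: every call to __iter__ starts a new generator."""
    def __init__(self, n):
        self.n = n

    def __iter__(self):
        for x in range(self.n):
            yield x * x

squares = Squares(5)
print(sum(squares))      # 30
print(sum(squares))      # 30  (a fresh generator each time)
print(max(squares))      # 16
```

The values are recomputed on every pass, so this trades time for memory; a list remains the simpler choice when the data fits comfortably.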

Tip

A generator is the right answer when you have a stream. A list is the right answer when you have a collection. Streams produce values over time or on demand; collections are values you already have.