Lab 7: Iterators, Generators

Due by 11:59pm on Tuesday, July 18.

Starter Files

Download lab07.zip. Inside the archive, you will find starter files for the questions in this lab, along with a copy of the Ok autograder.

Topics

Consult this section if you need a refresher on the material for this lab. It's okay to skip directly to the questions and refer back here should you get stuck.


Iterators

An iterable is any object that can be iterated through, or gone through one element at a time. One construct that we've used to iterate through an iterable is a for loop:

for elem in iterable:
    # do something

for loops work on any object that is iterable. We previously described it as working with any sequence -- all sequences are iterable, but there are other objects that are also iterable! We define an iterable as an object on which calling the built-in iter function returns an iterator. An iterator is another type of object that allows us to iterate through an iterable by keeping track of which element is next in the sequence.

To illustrate this, consider the following block of code, which does the exact same thing as the for statement above:

iterator = iter(iterable)
try:
    while True:
        elem = next(iterator)
        # do something
except StopIteration:
    pass

Here's a breakdown of what's happening:

  • First, the built-in iter function is called on the iterable to create a corresponding iterator.
  • To get the next element in the sequence, the built-in next function is called on this iterator.
  • When next is called but there are no elements left in the iterator, a StopIteration error is raised. In the for loop construct, this exception is caught and execution can continue.

Calling iter on an iterable multiple times returns a new iterator each time with distinct states (otherwise, you'd never be able to iterate through a iterable more than once). You can also call iter on the iterator itself, which will just return the same iterator without changing its state. However, note that you cannot call next directly on an iterable.

Let's see the iter and next functions in action with an iterable we're already familiar with -- a list.

>>> lst = [1, 2, 3, 4]
>>> next(lst)             # Calling next on an iterable
TypeError: 'list' object is not an iterator
>>> list_iter = iter(lst) # Creates an iterator for the list
>>> list_iter
<list_iterator object ...>
>>> next(list_iter)       # Calling next on an iterator
1
>>> next(list_iter)       # Calling next on the same iterator
2
>>> next(iter(list_iter)) # Calling iter on an iterator returns itself
3
>>> list_iter2 = iter(lst)
>>> next(list_iter2)      # Second iterator has new state
1
>>> next(list_iter)       # First iterator is unaffected by second iterator
4
>>> next(list_iter)       # No elements left!
StopIteration
>>> lst                   # Original iterable is unaffected
[1, 2, 3, 4]

Since you can call iter on iterators, this tells us that that they are also iterables! Note that while all iterators are iterables, the converse is not true - that is, not all iterables are iterators. You can use iterators wherever you can use iterables, but note that since iterators keep their state, they're only good to iterate through an iterable once:

>>> list_iter = iter([4, 3, 2, 1])
>>> for e in list_iter:
...     print(e)
4
3
2
1
>>> for e in list_iter:
...     print(e)

Analogy: An iterable is like a book (one can flip through the pages) and an iterator for a book would be a bookmark (saves the position and can locate the next page). Calling iter on a book gives you a new bookmark independent of other bookmarks, but calling iter on a bookmark gives you the bookmark itself, without changing its position at all. Calling next on the bookmark moves it to the next page, but does not change the pages in the book. Calling next on the book wouldn't make sense semantically. We can also have multiple bookmarks, all independent of each other.

Iterable Uses

We know that lists are one type of built-in iterable objects. You may have also encountered the range(start, end) function, which creates an iterable of ascending integers from start (inclusive) to end (exclusive).

>>> for x in range(2, 6):
...     print(x)
...
2
3
4
5

Ranges are useful for many things, including performing some operations for a particular number of iterations or iterating through the indices of a list.

There are also some built-in functions that take in iterables and return useful results:

  • map(f, iterable) - Creates an iterator over f(x) for x in iterable. In some cases, computing a list of the values in this iterable will give us the same result as [func(x) for x in iterable]. However, it's important to keep in mind that iterators can potentially have infinite values because they are evaluated lazily, while lists cannot have infinite elements.
  • filter(f, iterable) - Creates an iterator over x for each x in iterable if f(x)
  • zip(iterables*) - Creates an iterator over co-indexed tuples with elements from each of the iterables
  • reversed(iterable) - Creates an iterator over all the elements in the input iterable in reverse order
  • list(iterable) - Creates a list containing all the elements in the input iterable
  • tuple(iterable) - Creates a tuple containing all the elements in the input iterable
  • sorted(iterable) - Creates a sorted list containing all the elements in the input iterable
  • reduce(f, iterable) - Must be imported with functools. Apply function of two arguments f cumulatively to the items of iterable, from left to right, so as to reduce the sequence to a single value.


Generators

We can create our own custom iterators by writing a generator function, which returns a special type of iterator called a generator. Generator functions have yield statements within the body of the function instead of return statements. Calling a generator function will return a generator object and will not execute the body of the function.

For example, let's consider the following generator function:

def countdown(n):
    print("Beginning countdown!")
    while n >= 0:
        yield n
        n -= 1
    print("Blastoff!")

Calling countdown(k) will return a generator object that counts down from k to 0. Since generators are iterators, we can call iter on the resulting object, which will simply return the same object. Note that the body is not executed at this point; nothing is printed and no numbers are outputted.

>>> c = countdown(5)
>>> c
<generator object countdown ...>
>>> c is iter(c)
True

So how is the counting done? Again, since generators are iterators, we call next on them to get the next element! The first time next is called, execution begins at the first line of the function body and continues until the yield statement is reached. The result of evaluating the expression in the yield statement is returned. The following interactive session continues from the one above.

>>> next(c)
Beginning countdown!
5

Unlike functions we've seen before in this course, generator functions can remember their state. On any consecutive calls to next, execution picks up from the line after the yield statement that was previously executed. Like the first call to next, execution will continue until the next yield statement is reached. Note that because of this, Beginning countdown! doesn't get printed again.

>>> next(c)
4
>>> next(c)
3

The next 3 calls to next will continue to yield consecutive descending integers until 0. On the following call, a StopIteration error will be raised because there are no more values to yield (i.e. the end of the function body was reached before hitting a yield statement).

>>> next(c)
2
>>> next(c)
1
>>> next(c)
0
>>> next(c)
Blastoff!
StopIteration

Separate calls to countdown will create distinct generator objects with their own state. Usually, generators shouldn't restart. If you'd like to reset the sequence, create another generator object by calling the generator function again.

>>> c1, c2 = countdown(5), countdown(5)
>>> c1 is c2
False
>>> next(c1)
5
>>> next(c2)
5

Here is a summary of the above:

  • A generator function has a yield statement and returns a generator object.
  • Calling the iter function on a generator object returns the same object without modifying its current state.
  • The body of a generator function is not evaluated until next is called on a resulting generator object. Calling the next function on a generator object computes and returns the next object in its sequence. If the sequence is exhausted, StopIteration is raised.
  • A generator "remembers" its state for the next next call. Therefore,

    • the first next call works like this:

      1. Enter the function and run until the line with yield.
      2. Return the value in the yield statement, but remember the state of the function for future next calls.
    • And subsequent next calls work like this:

      1. Re-enter the function, start at the line after the yield statement that was previously executed, and run until the next yield statement.
      2. Return the value in the yield statement, but remember the state of the function for future next calls.
  • Calling a generator function returns a brand new generator object (like calling iter on an iterable object).
  • A generator should not restart unless it's defined that way. To start over from the first element in a generator, just call the generator function again to create a new generator.

Another useful tool for generators is the yield from statement. yield from will yield all values from an iterator or iterable.

>>> def gen_list(lst):
...     yield from lst
...
>>> g = gen_list([1, 2, 3, 4])
>>> next(g)
1
>>> next(g)
2
>>> next(g)
3
>>> next(g)
4
>>> next(g)
StopIteration

Required Questions


Getting Started Videos

These videos may provide some helpful direction for tackling the coding problems on this assignment.

To see these videos, you should be logged into your berkeley.edu email.

YouTube link

Iterators

Q1: WWPD: Iterators

Important: Enter StopIteration if a StopIteration exception occurs, Error if you believe a different error occurs, and Iterator if the output is an iterator object.

Use Ok to test your knowledge with the following "What Would Python Display?" questions:

python3 ok -q iterators-wwpd -u

Python's built-in map, filter, and zip functions return iterators, not lists. These built-in functions are different from the my_map and my_filter functions we implemented in Lab 04.

>>> s = [1, 2, 3, 4]
>>> t = iter(s)
>>> next(s)
______
Error
>>> next(t)
______
1
>>> next(t)
______
2
>>> iter(s)
______
Iterator
>>> next(iter(s))
______
1
>>> next(iter(t))
______
3
>>> next(iter(s))
______
1
>>> next(iter(t))
______
4
>>> next(t)
______
StopIteration
>>> r = range(6)
>>> r_iter = iter(r)
>>> next(r_iter)
______
0
>>> [x + 1 for x in r]
______
[1, 2, 3, 4, 5, 6]
>>> [x + 1 for x in r_iter]
______
[2, 3, 4, 5, 6]
>>> next(r_iter)
______
StopIteration
>>> list(range(-2, 4)) # Converts an iterable into a list
______
[-2, -1, 0, 1, 2, 3]
>>> map_iter = map(lambda x : x + 10, range(5))
>>> next(map_iter)
______
10
>>> next(map_iter)
______
11
>>> list(map_iter)
______
[12, 13, 14]
>>> for e in filter(lambda x : x % 2 == 0, range(1000, 1008)): ... print(e)
______
1000 1002 1004 1006
>>> [x + y for x, y in zip([1, 2, 3], [4, 5, 6])]
______
[5, 7, 9]
>>> for e in zip([10, 9, 8], range(3)): ... print(tuple(map(lambda x: x + 2, e)))
______
(12, 2) (11, 3) (10, 4)

Q2: Count Occurrences

Implement count_occurrences, which takes in an iterator t and returns the number of times the value x appears in the first n elements of t. A value appears in a sequence of elements if it is equal to an entry in the sequence.

Note: You can assume that t will have at least n elements.

Hint: When the same iterator is passed into a function twice, it should pick up where it left off. Take note of how we pass the iterator s twice in the doctests into two consecutive calls to count_occurrences. Here, s didn't restart from the beginning of the iterable since in the second count_occurrences call, the iterator goes through three 2s, starting from the second value in the iterator, not the first.

def count_occurrences(t, n, x):
    """Return the number of times that x appears in the first n elements of iterator t.

    >>> s = iter([10, 9, 10, 9, 9, 10, 8, 8, 8, 7])
    >>> count_occurrences(s, 10, 9)
    3
    >>> s2 = iter([10, 9, 10, 9, 9, 10, 8, 8, 8, 7])
    >>> count_occurrences(s2, 3, 10)
    2
    >>> s = iter([3, 2, 2, 2, 1, 2, 1, 4, 4, 5, 5, 5])
    >>> count_occurrences(s, 1, 3)
    1
    >>> count_occurrences(s, 3, 2)
    3
    >>> next(s)
    1
    >>> s2 = iter([4, 1, 6, 6, 7, 7, 8, 8, 2, 2, 2, 5])
    >>> count_occurrences(s2, 6, 6)
    2
    """
    "*** YOUR CODE HERE ***"

Use Ok to test your code:

python3 ok -q count_occurrences

Generators

Q3: Apply That Again

Implement amplify, a generator function that takes a one-argument function f and a starting value x. The element at index k that it yields (starting at 0) is the result of applying f k times to x. It terminates whenever the next value it would yield is a falsy value, such as 0, "", [], False, etc.

def amplify(f, x):
    """Yield the longest sequence x, f(x), f(f(x)), ... that are all true values

    >>> list(amplify(lambda s: s[1:], 'boxes'))
    ['boxes', 'oxes', 'xes', 'es', 's']
    >>> list(amplify(lambda x: x//2-1, 14))
    [14, 6, 2]
    """
    "*** YOUR CODE HERE ***"

Use Ok to test your code:

python3 ok -q amplify

Q4: Sequence Generator

Implement sequence_gen, a recursive generator function that takes in a single-argument function term. sequence_gen should produce an infinite generator that yields term(1), term(2), term(3), etc.

You must use recursion! You will fail the doctests if you use for or while loops.

def sequence_gen(term):
    """
    Yields term(1), term(2), term(3), ...

    >>> s1 = sequence_gen(lambda x: x**2)
    >>> [next(s1) for _ in range(5)]
    [1, 4, 9, 16, 25]
    >>> # Do not use while/for loops!
    >>> from construct_check import check
    >>> # ban iteration
    >>> check(LAB_SOURCE_FILE, 'sequence_gen',
    ...       ['While', 'For'])
    True
    """
    yield ___________________
    yield from ___________________

Use Ok to test your code:

python3 ok -q sequence_gen

Q5: Stair Ways

In discussion 3, we considered how many different ways there are to climb a flight of stairs with n steps if you are able to take 1 or 2 steps at a time. In this problem, you will write a generator function stair_ways that yields all the different ways you can climb such a staircase.

Each "way" of climbing a staircase is represented by a list of 1s and 2s, representing the sequence of step sizes a person should take to climb the flight.

For example, for a flight with 3 steps, there are three ways to climb it:

  • You can take one step each time: [1, 1, 1].
  • You can take two steps then one step: [2, 1].
  • You can take one step then two steps: [1, 2]..

Therefore, stair_ways(3) should yield [1, 1, 1], [2, 1], and [1, 2] in any order.

def stair_ways(n):
    """
    Yields all ways to climb a set of N stairs taking
    1 or 2 steps at a time.

    >>> list(stair_ways(0))
    [[]]
    >>> s_w = stair_ways(4)
    >>> sorted([next(s_w) for _ in range(5)])
    [[1, 1, 1, 1], [1, 1, 2], [1, 2, 1], [2, 1, 1], [2, 2]]
    >>> try: # Ensure you're not yielding extra
    ...     next(s_w)
    ...     assert False
    ... except StopIteration:
    ...     pass
    """
    "*** YOUR CODE HERE ***"

Use Ok to test your code:

python3 ok -q stair_ways

Optional Questions

These questions are optional, but you must complete them in order to be checked off before the end of the lab period. They are also useful practice!

Q6: Church generator

Implement church_generator, a generator function that takes in a function f as an argument. church_generator yields functions that apply f to their argument one more time than the previously generated function. (The yielded functions are known as Church numerals.)

def church_generator(f):
    """Takes in a function f and yields functions which apply f
    to their argument one more time than the previously generated
    function.

    >>> increment = lambda x: x + 1
    >>> church = church_generator(increment)
    >>> for _ in range(5):
    ...     fn = next(church)
    ...     print(fn(0))
    0
    1
    2
    3
    4
    """
    g = ____________________________________________
    while True:
        ____________________________________________ 
        ____________________________________________

Use Ok to test your code:

python3 ok -q church_generator

Submit

Make sure to submit this assignment by uploading any files you've edited to the appropriate Gradescope assignment. For a refresher on how to do this, refer to Lab 00.