Lab 11: Iterators and Generators
Due at 11:59pm on 4/15/2016.
Starter Files
Download lab11.zip. Inside the archive, you will find starter files for the questions in this lab, along with a copy of the OK autograder.
Submission
By the end of this lab, you should have submitted the lab with
python3 ok --submit
. You may submit more than once before the
deadline; only the final submission will be graded.
- To receive credit for this lab, you must complete Questions 1, 2, 3, 4, 5, 6 and 7 in lab11.py and submit through OK.
- Questions 8, 9, 10, 11 are extra practice. They can be found in the lab11_extra.py file. It is recommended that you complete these problems on your on time.
Helpful Hints
We have a new feature we are testing in OK. This feature tries to give helpful hints when it sees you struggling with unlocking questions.
To see if everything works you will be randomly placed in a group to either get
the hints or not. If you wish to make sure the hints appear for you in this lab,
run an unlocking command with the flag --guidance
. For example if the command
is usually python3 ok -q XXX -u
instead use python3 ok --guidance -q XXX
-u
. Once you use this flag, all following unlocking commands will give you
helpful hints regardless of if you include the flag again.
Iterables and Iterators
Remember the for
loop? (We really hope so!)
for elem in something_iterable:
# do something
for
loops only work with iterables, which we can define as values that
are acceptable to the built-in iter
function. The for
loop above, for
example, behaves like this:
the_iterator = iter(something_iterable)
try:
while True:
elem = the_iterator.__next__()
# do something
except StopIteration:
pass
That is, it first calls iter
to create an iterator, saving it in some
new, hidden variable (we've called it the_iterator
here). It then
repeatedly calls the __next__()
method on this iterator to get values of
elem
and stops when that method raises StopIteration
.
The term iterator simply means "an object that has a __next__()
method that
normally either returns a value or raises StopIteration
to indicate
completion of the iteration."
For something_iterable
to be acceptable to the iter
function, either it
must
- define an
__iter__
method that returns an iterator, as defined above, or - define a
__getitem__
method that takes an integer index argument and either returns a value or raisesIndexError
when the index argument exceeds some bound (such as the length of a list).
iter
handles these two cases as follows:
- In the first case,
iter
simply returns the result of applying__iter__
. - In the second, it creates an iterator object whose
__next__
method calls__getitem__(0)
,__getitem__(1)
, ..., until it receivesIndexError
, at which point this manufactured iterator raisesStopIteration
.
An iterable object can create an arbitrary number of iterator objects. In addition, the iterators can be independent of each other; in other words they can have a different position in the sequence.
The iterators produced by the predefined functions in the Python library
implement __iter__
methods that just return themselves (iterators you define
yourself should do this also). This is convenient,
because it means that if you have an expression that produces an iterator
(such as calls on the map
and zip
builtin functions), you can do a
for
loop over it as well. But be careful. Iterators are not the same as
familiar iterables like lists or tuples. Compare the following two cases:
>>> L = [ 1, 2 ]
>>> I = iter(L) # I is an iterator over L
>>> for x in L: print x
1
2
>>> for x in L: print x
1
2
>>> for x in I: print x
1
2
>>> for x in I: print x
The last for
loop produces nothing. Why? Because I
is an iterator: once
its __next__
method raises StopIteration
, it produces no more values. Since
I.__iter__()
simply returns I
, therefore, the last for
loop finds no more
values, whereas the second for x in L
loop calls __iter__
on L
,
which produces a new iterator. This is why we consider the terminology somewhat confusing:
lists and iterators may both be iterables and both be useable in for
, but
as the example shows, they behave differently.
Here is a table summarizing the required methods on iterables and iterators. Python also has more documentation about iterator types.
Iterable | Iterator |
---|---|
__iter__ : returns an iterator, or __getitem__ : is defined |
__iter__ : may return an iterator, which is generally itself |
__getitem__ : Not used; usually not defined. |
|
__next__ : need not be defined |
__next__ : returns the next element,
or raises StopIteration if no more values left to produce |
Analogy: an iterable is like a book (one can flip through the
pages) and an iterator is a bookmark (saves the position and can locate
the next page). Calling __iter__
on a book gives you a new bookmark
independent of other bookmarks, but calling __iter__
on a bookmark
gives you the bookmark itself, without changing its position at all.
Here is an example of a class definition for an object that implements the iterator interface:
class Skipper:
def __init__(self, seq):
self.seq = seq
self.step = 1
self.current = 0
def __next__(self):
if self.current > len(self.seq):
raise StopIteration
val = self.seq[self.current]
self.current += self.step
self.step += 1
return val
def __iter__(self):
return self
Let's go ahead and try out our new toy.
>>> for num in Skipper([3, 4, 5, 6, 7, 8, 9, 10]):
... print(num)
3
4
6
9
That is, we skip through the elements of some sequence in ever-larger steps.
We can also use Skipper
to create a new kind of iterable list:
class SkippingList(list):
def __iter__(self):
return Skipper(self)
and now we can write:
>>> aList = SkippingList([3, 4, 5, 6, 7, 8, 9, 10])
>>> for num in aList:
... print(num)
3
4
6
9
Question 1: Does it work?
Consider the following iterators. Which ones work and which ones don't? Why?
Use OK to test your knowledge with the following conceptual questions:
python3 ok -q does_it_work -u
class IteratorA:
def __init__(self):
self.start = 10
def __next__(self):
if self.start > 100:
raise StopIteration
self.start += 20
return self.start
def __iter__(self):
return self
No problem, this is a beautiful iterator.
class IteratorB:
def __init__(self):
self.start = 5
def __iter__(self):
return self
Oh no! Where is __next__
? This fails to implement the iterator
interface because calling __iter__
doesn't return something that has
a __next__
method.
class IteratorC:
def __init__(self):
self.start = 5
def __next__(self):
if self.start == 10:
raise StopIteration
self.start += 1
return self.start
This also fails to implement the iterator interface. Without the
__iter__
method, the for
loop will error. The for
loop needs to
call __iter__
first because some objects might not implement the
__next__
method themselves, but calling __iter__
will return an
object that does.
class IteratorD:
def __init__(self):
self.start = 1
def __next__(self):
self.start += 1
return self.start
def __iter__(self):
return self
This is technically an iterator, because it implements both __iter__
and
__next__
. Notice that it's an infinite sequence! Sequences like these are
the reason iterators are useful. Because iterators delay computation, we can
use a finite amount of memory to represent an infinitely long sequence.
Question 2: Restart
Use OK to test your knowledge with the following What would Python print questions:
python3 ok -q restart -u
Try this!
>>> iterator = IteratorA()
>>> [num for num in iterator]
Then again:
>>> [num for num in iterator]
The outputs of the list comprehensions are like that because the instance variables are not reset
each time a for
loop is started. Therefore, when a StopIteration
exception is raised at the end of the first for loop it certainly
will be raised immediately at the beginning of the second.
With that in mind, try writing an iterator that "restarts" every time
it is run through a for
loop.
class IteratorRestart:
"""
>>> iterator = IteratorRestart(2, 7)
>>> for num in iterator:
... print(num)
2
3
4
5
6
7
>>> for num in iterator:
... print(num)
2
3
4
5
6
7
"""
def __init__(self, start, end):
"*** YOUR CODE HERE ***"
self.start = start
self.end = end
self.current = start
def __next__(self):
"*** YOUR CODE HERE ***"
if self.current > self.end:
raise StopIteration
self.current += 1
return self.current - 1
def __iter__(self):
"*** YOUR CODE HERE ***"
self.current = self.start
return self
Use OK to test your code:
python3 ok -q IteratorRestart
Question 3: WWPP: Odds and Evens
So far, the __iter__
method of our iterators only returns self
. What if we have called next
a few times and then want to start at the beginning? Intuitively we should create a new iterator that would start at the beginning. However, our current iterator implementations won't allow that.
Consider the following OddNaturalsIterator
and EvenNaturalsIterator
, which implementation allows us to start a new iterator at the beginning?
class OddNaturalsIterator():
def __init__(self):
self.current = 1
def __next__(self):
result = self.current
self.current += 2
return result
def __iter__(self):
return self
class EvenNaturalsIterator():
def __init__(self):
self.current = 0
def __next__(self):
result = self.current
self.current += 2
return result
def __iter__(self):
return EvenNaturalsIterator()
Use OK to test your knowledge with the following conceptual questions:
python3 ok -q odds_evens -u
>>> odds = OddNaturalsIterator()
>>> odd_iter1 = iter(odds)
>>> odd_iter2 = iter(odds)
>>> next(odd_iter1)
______1
>>> next(odd_iter1)
______3
>>> next(odd_iter1)
______5
>>> next(odd_iter2)
______7
>>> next(odd_iter1)
______9
>>> next(odd_iter2)
______11
>>> evens = EvenNaturalsIterator()
>>> even_iter1 = iter(evens)
>>> even_iter2 = iter(evens)
>>> next(even_iter1)
______0
>>> next(even_iter1)
______2
>>> next(even_iter1)
______4
>>> next(even_iter2)
______0
>>> next(even_iter2)
______2
Question 4: Str
Write an iterator that takes a string as input and outputs the letters in order when iterated over.
class Str:
def __init__(self, str):
self.str = str
def __iter__(self):
return iter(self.str)
That works (why?), but just kidding.
class Str:
"""
>>> s = Str("hello")
>>> for char in s:
... print(char)
...
h
e
l
l
o
>>> for char in s: # a standard iterator does not restart
... print(char)
"""
"*** YOUR CODE HERE ***"
def __init__(self, str):
self.str = str
self.i = -1
def __iter__(self):
return self
def __next__(self):
self.i += 1
if self.i >= len(self.str):
raise StopIteration
return self.str[self.i]
Use OK to test your code:
python3 ok -q Str
Generators
A generator function returns a special type of iterator called
a generator object. Such functions can be written using a yield
statement:
def <generator_fn_name>():
<variable> = <value>
while <predicate>:
yield <value>
<increment variable>
Calling a generator function (a function with a yield statement in it) makes it return a generator object rather than executing the body of the function.
The reason we say a generator object is a special type of iterator is that it has all the properties of an iterator, meaning that:
- Calling the
__iter__
method makes a generator object return itself without modifying its current state. - Calling the
__next__
method makes a generator object compute and return the next object in its sequence. If the sequence is exhausted,StopIteration
is raised. - Typically, a generator should not restart unless it's defined that way. But
calling the generator function returns a brand new generator object (like
calling
__iter__
on an iterable object).
However, they do have some fundamental differences:
- An iterator is a class with
__next__
and__iter__
explicitly defined, but a generator can be written as a mere function with ayield
in it. __iter__
in an iterator usesreturn
, but a generator usesyield
.A generator "remembers" its state for the next
__next__
call. Therefore,the first
__next__
call works like this:- Enter the function, run until the line with
yield
. - Return the value in the
yield
statement, but remember the state of the function for future__next__
calls.
- Enter the function, run until the line with
And subsequent
__next__
calls work like this:- Re-enter the function, start at the line after
yield
, and run until the nextyield
statement. - Return the value in the
yield
statement, but remember the state of the function for future__next__
calls.
- Re-enter the function, start at the line after
Question 5: WWPP: Generators
Use OK to test your knowledge with the following What would Python print questions:
python3 ok -q generators -u
def generator():
print("Starting here")
i = 0
while i < 6:
print("Before yield")
yield i
print("After yield")
i += 1
>>> g = generator()
>>> g # what type of object is this?
______<generator object>
>>> g == iter(g) # equivalent of g.__iter__()
______True
>>> next(g) # equivalent of g.__next__()
______Starting here
Before yield
0
>>> next(g)
______After yield
Before yield
1
>>> next(g)
______After yield
Before yield
2
Trace through the code and make sure you know where and why each statement is printed.
You might have noticed from the Iterators section that IteratorB
, which didn't
define a __next__
method, failed to run in the for
loop. However, this is
not always the case.
class IterGen:
def __init__(self):
self.start = 5
def __iter__(self):
while self.start < 10:
self.start += 1
yield self.start
for i in IterGen():
print(i)
Why does this iterable work without defining a __next__
method?
The for
loop only expects the object returned by __iter__
to have a
__next__
method. The __iter__
method is a generator function because of the
yield
statement in the body. Therefore, when __iter__
is called, it returns
a generator object, which you can call __next__
on.
Question 6: Countdown
Write a generator that counts down to 0.
Write it in both ways: using a generator function on its own, and
within the __iter__
method of a class.
def countdown(n):
"""
>>> from types import GeneratorType
>>> type(countdown(0)) is GeneratorType # countdown is a generator
True
>>> for number in countdown(5):
... print(number)
...
5
4
3
2
1
0
"""
"*** YOUR CODE HERE ***"
while n >= 0:
yield n
n = n - 1
Use OK to test your code:
python3 ok -q countdown
class Countdown:
"""
>>> from types import GeneratorType
>>> type(Countdown(0)) is GeneratorType # Countdown is an iterator
False
>>> for number in Countdown(5):
... print(number)
...
5
4
3
2
1
0
"""
"*** YOUR CODE HERE ***"
def __init__(self, cur):
self.cur = cur
def __iter__(self):
return self
def __next__(self):
result = self.cur
if result < 0:
raise StopIteration
self.cur -= 1
return result
# Alternate solution that makes __iter__ a generator function:
class Countdown:
def __init__(self, cur):
self.cur = cur
def __iter__(self):
while self.cur >= 0:
yield self.cur
self.cur -= 1
Hint: Is it possible to not have a __next__
method in Countdown
?
Use OK to test your code:
python3 ok -q Countdown
Question 7: Hailstone
Write a generator that outputs the hailstone sequence from homework 1.
Here's a quick remainder of how the hailstone sequence is defined:
- Pick a positive integer
n
as the start. - If
n
is even, divide it by 2. - If
n
is odd, multiply it by 3 and add 1. - Continue this process until
n
is 1.
def hailstone(n):
"""
>>> for num in hailstone(10):
... print(num)
...
10
5
16
8
4
2
1
"""
"*** YOUR CODE HERE ***"
i = n
while i > 1:
yield i
if i % 2 == 0:
i //= 2
else:
i = i * 3 + 1
yield i
Use OK to test your code:
python3 ok -q hailstone
Extra Questions
Question 8: Scale
Implement the generator function scale(s, k)
, which yields elements of the
given iterable s
, scaled by k
.
def scale(s, k):
"""Yield elements of the iterable s scaled by a number k.
>>> s = scale([1, 5, 2], 5)
>>> type(s)
<class 'generator'>
>>> list(s)
[5, 25, 10]
>>> m = scale(naturals(), 2)
>>> [next(m) for _ in range(5)]
[2, 4, 6, 8, 10]
"""
"*** YOUR CODE HERE ***"
for elem in s:
yield elem * k
Use OK to test your code:
python3 ok -q scale
Question 9: Merge
Implement merge(s0, s1)
, which takes two iterables s0
and s1
whose
elements are ordered. merge
yields elements from s0
and s1
in sorted
order, eliminating repetition. You may assume s0
and s1
themselves do not
contain repeats. You may also assume s0
and s1
represent infinite
sequences; that is, their iterators never raise StopIteration
.
See the doctests for example behavior.
def merge(s0, s1):
"""Yield the elements of strictly increasing iterables s0 and s1, removing
repeats. Assume that s0 and s1 have no repeats. You can also assume that s0
and s1 represent infinite sequences.
>>> twos = scale(naturals(), 2)
>>> threes = scale(naturals(), 3)
>>> m = merge(twos, threes)
>>> type(m)
<class 'generator'>
>>> [next(m) for _ in range(10)]
[2, 3, 4, 6, 8, 9, 10, 12, 14, 15]
"""
i0, i1 = iter(s0), iter(s1)
e0, e1 = next(i0), next(i1)
"*** YOUR CODE HERE ***"
while True:
yield min(e0, e1)
if e0 < e1:
e0 = next(i0)
elif e1 < e0:
e1 = next(i1)
else:
e0, e1 = next(i0), next(i1)
Use OK to test your code:
python3 ok -q merge
Question 10: Remainder generator
Like functions, generators can also be higher-order. For this
problem, we will be writing remainders_generator
, which yields a
series of generator objects.
remainders_generator
takes in an integer m
, and yields m
different
generators. The first generator is a generator of multiples of m
, i.e. numbers
where the remainder is 0. The second, a generator of natural numbers with
remainder 1 when divided by m
. The last generator yield natural numbers with
remainder m - 1
when divided by m
.
def remainders_generator(m):
"""
Takes in an integer m, and yields m different remainder groups
of m.
>>> remainders_mod_four = remainders_generator(4)
>>> for rem_group in remainders_mod_four:
... for _ in range(3):
... print(next(rem_group))
0
4
8
1
5
9
2
6
10
3
7
11
"""
"*** YOUR CODE HERE ***"
def remainder_group(rem):
start = rem
while True:
yield start
start += m
for rem in range(m):
yield remainder_group(rem)
Note that if you have implemented this correctly, each of the
generators yielded by remainder_generator
will be infinite - you
can keep calling next
on them forever without running into a
StopIteration
exception.
Hint: Consider defining an inner generator function. What arguments should it take in? Where should you call it?
Use OK to test your code:
python3 ok -q remainders_generator
Question 11: Zip generator
For this problem, we will be writing zip_generator
, which yields a
series of lists, each containing the nth items of each iterable.
It should stop when the smallest iterable runs out of elements.
def zip(*iterables):
"""
Takes in any number of iterables and zips them together.
Returns a generator that outputs a series of lists, each
containing the nth items of each iterable.
>>> z = zip([1, 2, 3], [4, 5, 6], [7, 8])
>>> for i in z:
... print(i)
...
[1, 4, 7]
[2, 5, 8]
"""
"*** YOUR CODE HERE ***"
iterators = [iter(iterable) for iterable in iterables]
while True:
yield [next(iterator) for iterator in iterators]
Use OK to test your code:
python3 ok -q zip