Lab 11: Streams, Sets, and Binary Trees
Due at 11:59pm on Friday, 04/14/2017.
Starter Files
Download lab11.zip. Inside the archive, you will find starter files for the questions in this lab, along with a copy of the OK autograder.
Submission
By the end of this lab, you should have submitted the lab with
python3 ok --submit
. You may submit more than once before the
deadline; only the final submission will be graded.
- To receive credit for this lab, you must complete Questions 1, 2, 3, and 4 in lab11.py and submit through OK.
- Questions 5, 6, 7, 8, and 9 are extra practice. They can be found in the lab11_extra.py file. It is recommended that you complete these problems on your on time.
Streams
The Stream
class defines a lazy sequence, a lazily
evaluated linked list. In other words, a Stream's elements (except for the
first element) are only evaluated as those values are needed.
class Stream:
"""A lazily computed linked list."""
class empty:
"""The empty stream."""
def __repr__(self):
return 'Stream.empty'
empty = empty()
# As a convenience, we allow the second arguemt to Stream to be
# another Stream (without having to have a lambda: in front.)
def __init__(self, first, compute_rest=empty):
"""A stream whose first element is FIRST and whose tail is either
a stream or stream-returning parameterless function COMPUTE_REST."""
self.first = first
if compute_rest is Stream.empty or isinstance(compute_rest, Stream):
self._rest, self._compute_rest = compute_rest, None
else:
assert callable(compute_rest), 'compute_rest must be callable.'
self._compute_rest = compute_rest
@property
def rest(self):
"""Return the rest of the stream, computing it if necessary."""
if self._compute_rest is not None:
self._rest = self._compute_rest()
self._compute_rest = None
return self._rest
def __repr__(self):
return 'Stream({0}, <...>)'.format(repr(self.first))
Instead of specifying all of the elements in __init__
, we
provide a function, compute_rest
, that encapsulates the algorithm
used to calculate the remaining elements of the stream.
The code in the function body is not evaluated until it is called,
which lets us implement the desired evaluation behavior.
This implementation of streams also uses memoization. The first time
a program asks a Stream
for its rest
field, the Stream
code
computes the required value using compute_rest
, saves the resulting
value, and then returns it. After that, every time the rest
field is
referenced, the stored value is simply returned and it is not computed
again.
Here is an example (which you may use in solutions):
def make_integer_stream(first=1):
def compute_rest():
return make_integer_stream(first+1)
return Stream(first, compute_rest)
We start out with a stream whose first
element is 1, and whose compute_rest
function creates another stream.
So when we do compute the rest
, we get another stream whose first
element is one greater than the previous element, and whose
compute_rest
creates another stream. Hence, we effectively get an
infinite stream of integers, computed one at a time. This is almost
like an infinite recursion, but one which can be viewed one step at a
time, and so does not crash.
Another example:
def map_stream(fn, s):
if s is Stream.empty:
return s
return Stream(fn(s.first), lambda: map_stream(fn, s.rest))
Question 1: Interleave
Define a function interleave
, which takes in two streams:
a1, a2, a3, ...
b1, b2, b3, ...
and returns the new stream
a1, b1, a2, b2, a3, b3, ...
If either of the inputs is finite, the output stream should be finite as well, terminating just at the point where it would be impossible to go on. If both of the inputs are infinite, the output stream should be infinite as well.
def interleave(stream1, stream2):
"""Return a stream with alternating values from stream1 and stream2.
>>> ints = make_integer_stream(1)
>>> fib = make_fib_stream()
>>> alternating = interleave(ints, fib)
>>> alternating.first
1
>>> alternating.rest.first
0
>>> alternating.rest.rest.first
2
>>> alternating.rest.rest.rest.first
1
"""
"*** YOUR CODE HERE ***"
if stream1 is Stream.empty:
return Stream.empty
return Stream(stream1.first, lambda: interleave(stream2, stream1.rest))
Use OK to test your code:
python3 ok -q interleave
Question 2: Add streams
Write a procedure add_streams
that takes in two streams s1
, s2
,
and returns a new stream that is the result of adding elements
from s1
to elements from s2
. For instance, if s1
was (1, 2, 3,
...)
and s2
was (2, 4, 6, ...)
, then the output would be the
stream (3, 6, 9, ...)
. You can assume that both Streams are
infinite.
def add_streams(s1, s2):
"""Returns a stream that is the sum of s1 and s2.
>>> stream1 = make_integer_stream()
>>> stream2 = make_integer_stream(9)
>>> added = add_streams(stream1, stream2)
>>> added.first
10
>>> added.rest.first
12
>>> added.rest.rest.first
14
"""
"*** YOUR CODE HERE ***"
def compute_rest():
return add_streams(s1.rest, s2.rest)
return Stream(s1.first + s2.first, compute_rest)
Use OK to test your code:
python3 ok -q add_streams
Sets
A set is an unordered collection of distinct objects that supports membership testing, union, intersection, and adjunction. The main differences between sets and lists are that sets are unordered and contain no duplicates. Other than that, almost everything is the same.
>>> a = [1, 1, 2, 2, 3, 3]
>>> a = set(a)
>>> a # No duplicates
{1, 2, 3}
>>> a = {3, 1, 2}
>>> a # Not necessarily in same order
{1, 2, 3}
The Python documentation on
sets has more
details. The main things you will use with sets include: in
, union
(|
),
intersection
(&
), and difference
(-
).
One really convenient thing about Python sets is that many operations on sets
(adding elements, removing elements, checking membership) run in θ(1)
(constant) time (usually).
Some of the problems use a utility method called timeit
, which takes a
parameterless function as argument, executes it, and returns the time required
to do so. It's a variation on the function timeit.timeit
function in the
Python3 library.
Question 3: WWPD: Sets
Use OK to test your knowledge with the following "What Would Python Display?" questions:
python3 ok -q sets -u
>>> a = [1, 1, 2, 2, 3, 3]
>>> a = set(a)
>>> len(a)
______3
>>> sorted(a) # sorted(iterable) returns a new sorted list
______[1, 2, 3]
>>> a.add(4)
>>> a.add(4)
>>> a.remove(4)
>>> 4 in a
______False
>>> a = {1, 4, 12, 1000}
>>> sum(a)
______1017
>>> b = {1, 2, 4}
>>> sorted(a.intersection(b))
______[1, 4]
>>> sorted(a & b)
______[1, 4]
>>> sorted(a.union(b))
______[1, 2, 4, 12, 1000]
>>> sorted(a | b)
______[1, 2, 4, 12, 1000]
>>> sorted(a - b)
______[12, 1000]
>>> fruits = set(['apple', 'banana', 'tomato', 'apple'])
>>> pizza = set(['cheese', 'tomato', 'flour'])
>>> 'pepperoni' in pizza
______False
>>> fruits & pizza
______{'tomato'}
>>> t = [314, 15]
>>> u = {89, 7, 15}
>>> sorted(set(t) | u)
______[7, 15, 89, 314]
>>> u.add(6)
>>> set(t) - u
______{314}
Question 4: Find duplicates
Write the following function so it runs in O(n) time.
def find_duplicates(lst):
"""Returns True if lst has any duplicates and False if it does not.
>>> find_duplicates([1, 2, 3, 4, 5])
False
>>> find_duplicates([1, 2, 3, 4, 2])
True
>>> find_duplicates(list(range(100000)) + [-1]) # make sure you have linear time
False
"""
"*** YOUR CODE HERE ***"
return len(set(lst)) != len(lst)
Use OK to test your code:
python3 ok -q find_duplicates
Extra Questions: Binary Trees
The following problems deal with binary trees, in which each node has at most two children. We further modify the definition of tree to allow empty trees, which we haven't dealt with up to now. So here, we define a binary tree as either
- An empty tree, or
- A label and two children, called the left and right child respectively, both of which are binary trees. The left and right children, therefore, are the roots of the left and right subtrees.
As a result of this definition, most recursion on binary trees uses the empty tree as a base case.
Our implementation of binary trees is the following:
# Tree definition
class BinTree:
empty = ()
def __init__(self, label, left=empty, right=empty):
self.label = label
self.left = left
self.right = right
def __repr__(self):
if self.left == self.empty and self.right == self.empty:
return 'BinTree({})'.format(repr(self.label))
left = 'BinTree.empty' if self.left == self.empty else repr(self.left)
right = 'BinTree.empty' if self.right == self.empty else repr(self.right)
return 'BinTree({}, {}, {})'.format(repr(self.label), left, right)
def binTree(tupleTree):
"""A convenience method for succinctly constructing binary trees. The
empty tuple or list stands for BinTree.empty; (V,) or [V] stands
for BinTree(V); (V, T1, T2) or [V, T1, T2] stands for
BinTree(V, binTree(T1), binTree(T2)).
>>> binTree((3,
... (1, (), [2]),
... (6, [5], [7])))
BinTree(3, BinTree(1, BinTree.empty, BinTree(2)), BinTree(6, BinTree(5), BinTree(7)))
"""
if len(tupleTree) == 0:
return BinTree.empty
elif len(tupleTree) == 1:
return BinTree(tupleTree[0])
else:
return BinTree(tupleTree[0], binTree(tupleTree[1]), binTree(tupleTree[2]))
We also included a function print_bintree
, which prints out a string
representation of a tree:
>>> print_bintree(binTree( (1, (3, (), [2]), (4, [5], [6])) ))
-1-
/ \
3 4
\ / \
2 5 6
Question 5: Size
Define the function size
which takes in a BinTree
as an argument and
returns the number of entries in the tree.
def size(tree):
""" Return the number of entries in the binary tree b.
>>> b = BinTree(4,
... BinTree(2,
... BinTree(1)),
... BinTree(6,
... BinTree(5)))
>>> size(b)
5
"""
"*** YOUR CODE HERE ***"
if tree is BinTree.empty:
return 0
return 1 + size(tree.left) + size(tree.right)
Use OK to test your code:
python3 ok -q size
A binary search tree (or BST) is a binary tree that obeys the following
constraint:
- All labels in the left subtree have a value less than or equal to the label of v, and
- All labels in the right subtree have a value greater than or equal to the label of v.
- This is true for its children subtree and their grandchildren (and so on) as well.
Thus, these are BSTs:
--4-- 1 4 4
/ \ \ / /
2 6 2 3 1
/ \ / \ / \
1 3 5 3 2 3
\ / /
4 1 2
These are binary trees, but not BSTs:
--6-- 4
/ \ \
2 4 3
/ \ / \
1 3 5 2
\
1
Question 6: Contains
Write a function contains
that takes two arguments, bst
and
elem
, and returns True
if elem
is in the tree and
False
if it is not.
def contains(bst, elem):
"""
>>> b = BinTree(5, BinTree(3, BinTree(2), BinTree(4)), BinTree(6))
>>> contains(b, 5)
True
>>> contains(b, 6)
True
>>> contains(b, 7)
False
"""
"*** YOUR CODE HERE ***"
if not bst:
return False
if bst.label == elem:
return True
if bst.label > elem:
return contains(bst.left, elem)
return contains(bst.right, elem)
Use OK to test your code:
python3 ok -q contains
More Extra Questions
Question 7: Unique
Implement a function unique
that takes a stream and returns a stream in
which duplicates of any value are filtered out. This should work for
infinite streams as well as finite ones.
def unique(s):
"""Return a stream of the unique elements in s in the order that they
first appear.
>>> s = unique(lst_to_stream([1, 2, 2, 1, 0, 4, 2, 3, 1, 9, 0]))
>>> s.first
1
>>> s.rest.first
2
>>> s.rest.rest.rest.rest.rest.first
9
"""
"*** YOUR CODE HERE ***"
seen = set()
def make_unique(s):
def rest():
return make_unique(s.rest)
if s is Stream.empty:
return s
elif s.first in seen:
return rest()
else:
seen.add(s.first)
return Stream(s.first, rest)
return make_unique(s)
Use OK to test your code:
python3 ok -q unique
Question 8: Intersection
Implement the intersection function for two sets. Intersection takes in two sets and returns a new set of only the elements in both sets.
def intersection(s1, s2):
"""Returns the intersection of two sets.
>>> r = {0, 1, 4, 0}
>>> s = {1, 2, 3, 4}
>>> t = intersection(s, {3, 4, 2})
>>> t
{2, 3, 4}
>>> intersection(r, t)
{4}
"""
"*** YOUR CODE HERE ***"
intersection_set = set()
for elem in s1:
if elem in s2:
intersection_set.add(elem)
return intersection_set
Use OK to test your code:
python3 ok -q intersection
Question 9: Union
Implement the union function for sets. Union takes in two sets, and returns a new set with elements from the first set, and all other elements that have not already have been seen in the second set.
def union(s1, s2):
"""Returns the union of two sets.
>>> r = {0, 6, 6}
>>> s = {1, 2, 3, 4}
>>> t = union(s, {1, 6})
>>> t
{1, 2, 3, 4, 6}
>>> union(r, t)
{0, 1, 2, 3, 4, 6}
"""
"*** YOUR CODE HERE ***"
union_set = set()
for elem in s1:
union_set.add(elem)
for elem in s2:
union_set.add(elem)
return union_set
Use OK to test your code:
python3 ok -q union