Lab 5: Trees, Data Abstraction
Due by 11:59pm on Wednesday, September 27.
Starter Files
Download lab05.zip. Inside the archive, you will find starter files for the questions in this lab, along with a copy of the Ok autograder.
Required Questions
Getting Started Videos
These videos may provide some helpful direction for tackling the coding problems on this assignment.
To see these videos, you should be logged into your berkeley.edu email.
Dictionaries
Consult the drop-down if you need a refresher on dictionaries. It's okay to skip directly to the questions and refer back here should you get stuck.
Lists are ordered collections that allow us to access values by index. Another type of collection, called a dictionary, allows us to store and access values that correspond to given keys.
For example, the following code constructs a dictionary with three entries
and assigns it to the name artists
. Note that you could write the same code all
on one line.
>>> artists = {
... 'Ariana Grande': 'No Tears Left To Cry',
... 'Marshmello': 'FRIENDS',
... 'Migos': ['Stir Fry', 'Walk It Talk It']
... }
We say that a dictionary is an unordered collection of key-value pairs. Each key-value pair is separated by a comma. For each pair, the key appears to the left of the colon and the value appears to the right of the colon. To retrieve values from the dictionary, use bracket notation with the corresponding key:
>>> artists['Ariana Grande']
'No Tears Left To Cry'
>>> songs = artists['Migos']
>>> songs[0]
'Stir Fry'
Keys/values in dictionaries do not all have to be the same type:
>>> d = {1: "hello", "hi": 3, (3, 5): 2}
>>> d[(3,5)]
2
However, keys can only be immutable types (e.g. strings, numbers, tuples). For example, if a list is used as a key, an error is raised:
>>> d = {[1]: "test"}
TypeError
Of course, keys must be unique in a dictionary. While Python does not raise an error when you do this, duplicate keys make it impossible to access parts of your data:
>>> d = {"fun": 1, "fun": 2}
>>> d["fun"] # Is 1 or 2 output?
You can check whether a dictionary has a key using the in
operator,
like we can do with lists.
>>> 'Migos' in artists
True
>>> 'Wolves' in artists # Not True for values!
False
Finally, here are some useful dictionary functions:
dict.keys()
will return a sequence of keys.>>> list(artists.keys()) # We use list() to turn the sequence into a list ['Ariana Grande', 'Marshmello', 'Migos', 'Charlie Puth']
dict.values()
will return a sequence of values.>>> list(artists.values()) ['No Tears Left To Cry', 'Wolves', ['Stir Fry', 'Walk It Talk It'], 'Attention']
dict.items()
will return a sequence of key-value tuples.>>> list(artists.items()) [('Ariana Grande', 'No Tears Left To Cry'), ('Marshmello', 'Wolves'), ('Migos', ['Stir Fry', 'Walk It Talk It']), ('Charlie Puth', 'Attention')]
Q1: Dictionaries
Use Ok to test your knowledge with the following "What Would Python Display?" questions:
python3 ok -q pokemon -u
>>> pokemon = {'pikachu': 25, 'dragonair': 148, 'mew': 151}
>>> pokemon['pikachu']
______25
>>> len(pokemon)
______3
>>> 'mewtwo' in pokemon
______False
>>> 'pikachu' in pokemon
______True
>>> 25 in pokemon
______False
>>> 148 in pokemon.values()
______True
>>> 151 in pokemon.keys()
______False
>>> 'mew' in pokemon.keys()
______True
Trees
Consult the drop-down if you need a refresher on trees. It's okay to skip directly to the questions and refer back here should you get stuck.
A tree
is a data structure that represents a hierarchy of information. A file system is a good example of a tree structure. For example, within your cs61a
folder, you have folders separating your projects
, lab
assignments, and homework
. The next level is folders that separate different assignments, hw01
, lab01
, hog
, etc., and inside those are the files themselves, including the starter files and ok
. Below is an incomplete diagram of what your cs61a
directory might look like.
As you can see, unlike trees in nature, the tree abstract data type is drawn with the root at the top and the leaves at the bottom.
Tree ADT
Our tree
abstract data type consists of a root and a list of its
branches
. To create a tree and access its root value and branches, use the
following interface of constructor and selectors:
Constructor
tree(label, branches=[])
: creates a tree object with the givenlabel
value at its root node and list ofbranches
. Notice that the second argument to this constructor,branches
, is optional — if you want to make a tree with no branches, leave this argument empty.
Selectors
label(tree)
: returns the value in the root node oftree
.branches(tree)
: returns the list of branches of the giventree
(each of which is also a tree)
Convenience function
is_leaf(tree)
: returnsTrue
iftree
's list ofbranches
is empty, andFalse
otherwise.
For example, the tree generated by
number_tree = tree(1,
[tree(2),
tree(3,
[tree(4),
tree(5)]),
tree(6,
[tree(7)])])
would look like this:
1
/ | \
2 3 6
/ \ \
4 5 7
To extract the number 3
from this tree, which is the label of the root of its second branch, we would do this:
label(branches(number_tree)[1])
The lab file contains the implementation of the tree ADT, if you would like to view it. However, as with any data abstraction, we should only concern ourselves with what our functions do rather than their specific implementation! You do not need to look at the implementation of the tree ADT for this problem, and your solutions should not depend on the specifics of our implementation beyond what is specified in the interface above.
Q2: Finding Berries!
The squirrels on campus need your help! There are a lot of trees on campus and
the squirrels would like to know which ones contain berries. Define the function
berry_finder
, which takes in a tree and returns True
if the tree contains a
node with the value 'berry'
and False
otherwise.
Hint: To iterate through each of the branches of a particular tree, you can consider using a
for
loop to get each branch.
def berry_finder(t):
"""Returns True if t contains a node with the value 'berry' and
False otherwise.
>>> scrat = tree('berry')
>>> berry_finder(scrat)
True
>>> sproul = tree('roots', [tree('branch1', [tree('leaf'), tree('berry')]), tree('branch2')])
>>> berry_finder(sproul)
True
>>> numbers = tree(1, [tree(2), tree(3, [tree(4), tree(5)]), tree(6, [tree(7)])])
>>> berry_finder(numbers)
False
>>> t = tree(1, [tree('berry',[tree('not berry')])])
>>> berry_finder(t)
True
"""
"*** YOUR CODE HERE ***"
Use Ok to test your code:
python3 ok -q berry_finder
Q3: Replace Loki at Leaf
Define replace_loki_at_leaf
, which takes a tree t
and a value lokis_replacement
. replace_loki_at_leaf
returns a new tree that's the same as t
except
that every leaf label equal to "loki"
has been replaced with lokis_replacement
.
If you want to learn about the Norse mythology referenced in this problem, you can read about it here.
def replace_loki_at_leaf(t, lokis_replacement):
"""Returns a new tree where every leaf value equal to "loki" has
been replaced with lokis_replacement.
>>> yggdrasil = tree('odin',
... [tree('balder',
... [tree('loki'),
... tree('freya')]),
... tree('frigg',
... [tree('loki')]),
... tree('loki',
... [tree('sif'),
... tree('loki')]),
... tree('loki')])
>>> laerad = copy_tree(yggdrasil) # copy yggdrasil for testing purposes
>>> print_tree(replace_loki_at_leaf(yggdrasil, 'freya'))
odin
balder
freya
freya
frigg
freya
loki
sif
freya
freya
>>> laerad == yggdrasil # Make sure original tree is unmodified
True
"""
"*** YOUR CODE HERE ***"
Use Ok to test your code:
python3 ok -q replace_loki_at_leaf
Data Abstraction
Consult the drop-down if you need a refresher on data abstraction. It's okay to skip directly to the questions and refer back here should you get stuck.
So far, we have encountered several data types: numbers, strings, booleans, lists, tuples, and dictionaries. But what if we want to represent more complicated things — such as the objects we encounter in real life?
Enter the idea of data abstraction, which allows us to build and use objects in code that represent a compound value.
An abstract data type consists of two types of functions:
- Constructors: functions that build instances of the abstract data type.
- Selectors: functions that retrieve information from instances of the ADT.
An integral part of data abstraction is the concept of abstraction. Users of the abstract data type have access to an interface of defined actions (i.e. constructors and selectors) they can take with the ADT. The user of the ADT does not need to know how the constructors and selectors are implemented. The nature of abstraction allows whoever uses them to assume that the functions have been written correctly and work as described.
This is called the abstraction barrier!
In fact, interacting within an ADT outside of its interface of constructors and selectors is called "violating the abstraction barrier" and is strongly discouraged (even when it doesn't produce an error).
In this way, data abstraction mimics how we think about the world. For example, a car is a complicated machine, but it has a simple interface by which we interact with it: the wheel, the gas pedal, the gear shift. If you want to drive a car, you don't need to know how the engine was built or what kind of material the tires are made of. It's easy to drive a car because of the suppression of complexity that abstraction provides us.
For the car, the principle of the abstraction barrier also applies. If we want to start the car, we should use the provided interface of turning the ignition key, rather than getting out, popping the hood, and messing around with the engine. While you could likely do that if you had enough knowledge, it is unnecessarily costly and highly dependent on how exactly the car was constructed. What are you going to do if the car is electric?
When you're writing code using ADTs that have been provided for you, make sure that you're never assuming anything about the ADT other than that it can be constructed with its constructor and its information can be accessed by selectors. Your code should never depend on the specific implementation of the ADT under the hood.
City ADT
Say we have an abstract data type for cities. A city has a name, a latitude coordinate, and a longitude coordinate.
Our data abstraction has one constructor:
make_city(name, lat, lon)
: Creates a city object with the given name, latitude, and longitude.
We also have the following selectors in order to get the information for each city:
get_name(city)
: Returns the city's nameget_lat(city)
: Returns the city's latitudeget_lon(city)
: Returns the city's longitude
Here is how we would use the constructor and selectors to create cities and extract their information:
>>> berkeley = make_city('Berkeley', 122, 37)
>>> get_name(berkeley)
'Berkeley'
>>> get_lat(berkeley)
122
>>> new_york = make_city('New York City', 74, 40)
>>> get_lon(new_york)
40
All of the selector and constructor functions can be found in the lab file if you are curious to see how they are implemented. However, the point of data abstraction is that — as users of the city ADT — we do not need to know how an the ADT is implemented. You do not need to look at the implementation of the city ADT for this problem, and your solutions should not depend on the specifics of our implementation beyond what is specified in the interface above.
Q4: Distance
We will now implement the function distance
, which computes the
distance between two city objects. Recall that the distance between two
coordinate pairs (x1, y1)
and (x2, y2)
can be found by calculating
the sqrt
of (x1 - x2)**2 + (y1 - y2)**2
. We have already imported
sqrt
for your convenience. Use the latitude and longitude of a city as
its coordinates; you'll need to use the selectors to access this info!
from math import sqrt
def distance(city_a, city_b):
"""
>>> city_a = make_city('city_a', 0, 1)
>>> city_b = make_city('city_b', 0, 2)
>>> distance(city_a, city_b)
1.0
>>> city_c = make_city('city_c', 6.5, 12)
>>> city_d = make_city('city_d', 2.5, 15)
>>> distance(city_c, city_d)
5.0
"""
"*** YOUR CODE HERE ***"
Use Ok to test your code:
python3 ok -q distance
Q5: Closer City
Next, implement closer_city
, a function that takes a latitude,
longitude, and two cities, and returns the name of the city that is
closer to the provided latitude and longitude.
You may only use the selectors and constructors introduced above and the
distance
function you just defined for this question.
Hint: How can you use your
distance
function to find the distance between the given location and each of the given cities?
def closer_city(lat, lon, city_a, city_b):
"""
Returns the name of either city_a or city_b, whichever is closest to
coordinate (lat, lon). If the two cities are the same distance away
from the coordinate, consider city_b to be the closer city.
>>> berkeley = make_city('Berkeley', 37.87, 112.26)
>>> stanford = make_city('Stanford', 34.05, 118.25)
>>> closer_city(38.33, 121.44, berkeley, stanford)
'Stanford'
>>> bucharest = make_city('Bucharest', 44.43, 26.10)
>>> vienna = make_city('Vienna', 48.20, 16.37)
>>> closer_city(41.29, 174.78, bucharest, vienna)
'Bucharest'
"""
"*** YOUR CODE HERE ***"
Use Ok to test your code:
python3 ok -q closer_city
Q6: Don't violate the abstraction barrier!
Note: this question has no code-writing component (if you implemented the previous two questions correctly).
When writing functions that use a data abstraction, we should use the constructor(s) and selector(s) whenever possible instead of assuming the data abstraction's implementation. Relying on a data abstraction's underlying implementation is known as violating the abstraction barrier, and we never want to do this!
It's possible that you passed the doctests for the previous questions even if you violated the abstraction barrier. To check whether or not you did so, run the following command:
Use Ok to test your code:
python3 ok -q check_city_abstraction
The check_city_abstraction
function exists only for the doctest, which swaps
out the implementations of the original abstraction with something else, runs
the tests from the previous two parts, then restores the original abstraction.
The nature of the abstraction barrier guarantees that changing the implementation of an data abstraction shouldn't affect the functionality of any programs that use that data abstraction, as long as the constructors and selectors were used properly.
If you passed the Ok tests for the previous questions but not this one, the fix is simple! Just replace any code that violates the abstraction barrier with the appropriate constructor or selector.
Make sure that your functions pass the tests with both the first and the second implementations of the data abstraction and that you understand why they should work for both before moving on.
Check Your Score Locally
You can locally check your score on each question of this assignment by running
python3 ok --score
This does NOT submit the assignment! When you are satisfied with your score, submit the assignment to Gradescope to receive credit for it.
Submit
Make sure to submit this assignment by uploading any files you've edited to the appropriate Gradescope assignment. For a refresher on how to do this, refer to Lab 00.
Optional Questions
These questions are optional, but you must complete them in order to be checked off before the end of the lab period. They are also useful practice!
Trees
Q7: Recursion on Trees
Define a function dejavu
, which takes in a tree of numbers t
and a number n
. It returns True
if there is a path from the root to a leaf such that the sum of the numbers along that path is n
and False
otherwise.
def dejavu(t, n):
"""
>>> my_tree = tree(2, [tree(3, [tree(5), tree(7)]), tree(4)])
>>> dejavu(my_tree, 12) # 2 -> 3 -> 7
True
>>> dejavu(my_tree, 5) # Sums of partial paths like 2 -> 3 don ’t count
False
"""
"*** YOUR CODE HERE ***"
Use Ok to test your code:
python3 ok -q dejavu
Q8: Hailstone Tree
We can represent the hailstone sequence as a tree in the figure below,
showing the route different numbers take to reach 1. Remember that a hailstone
sequence starts with a number n
, continuing to n/2
if n
is even or 3n+1
if n
is odd, ending with 1. Write a function hailstone_tree(n, h)
which generates a tree of height h
, containing hailstone numbers that
will reach n
.
Hint: A node of a hailstone tree will always have at least one, and at most two branches (which are also hailstone trees). Under what conditions do you add the second branch?
def hailstone_tree(n, h):
"""Generates a tree of hailstone numbers that will reach N, with height H.
>>> print_tree(hailstone_tree(1, 0))
1
>>> print_tree(hailstone_tree(1, 4))
1
2
4
8
16
>>> print_tree(hailstone_tree(8, 3))
8
16
32
64
5
10
"""
if _________________________________:
return _________________________________
branches = _________________________________
if ___________ and ___________ and ___________:
branches += _________________________________
return tree(n, branches)
Use Ok to test your code:
python3 ok -q hailstone_tree