Lab 4: Mutability and Data Abstraction

lab04.zip

Due at 11:59pm on Friday, 07/06/2018.

Lab Check-in 3 questions here.

Starter Files

Download lab04.zip. Inside the archive, you will find starter files for the questions in this lab, along with a copy of the Ok autograder.

Submission

By the end of this lab, you should have submitted the lab with python3 ok --submit. You may submit more than once before the deadline; only the final submission will be graded. Check that you have successfully submitted your code on okpy.org.

To receive credit for this lab, you must complete Questions 1-4 in lab04.py and submit through OK.
Questions 5-9 are optional. They can be found in the lab04_extra.py file. It is recommended that you complete these problems on your own time.

Topics

Consult this section if you need a refresher on the material for this lab. It's okay to skip directly to the questions and refer back here should you get stuck.

Mutability

We say that an object is mutable if its state can change as code is executed. The process of changing an object's state is called mutation. Examples of mutable objects include lists and dictionaries. Examples of objects that are not mutable include tuples and functions.

Lists

Lists are mutable because we can add, change, or remove elements.

We know that we can index into a list to retrieve an element; the same indexing syntax can be used in an assignment statement to change the element at a given position.

>>> lst = [1, 2, 3, 4]
>>> lst[2] = -lst[2]
>>> lst
[1, 2, -3, 4]

There are many built-in methods that we can use to mutate lists. Here are some of the most useful ones:

append(el): Appends el to the end of the list and returns None.
extend(lst): Extends the list by concatenating it with lst and returns None.
remove(el): Removes the first occurence of el from the list and returns None.
insert(i, el): Inserts el at index i and returns None.
pop(i): Removes and returns the element at index i. If no index is given, removes and returns the last element in the list.

To call methods, we use dot notation:
>>> lst = [1, 2, 3, 4]
>>> lst.append(5)
>>> lst.insert(2, 2.5)
The list that we are mutating precedes the . and the method we are calling follows it, using regular call expression syntax to pass in arguments. We will learn more about methods in detail later in the course when we discuss object oriented programming.

Let's see these methods in action:

>>> l = [3, 5, 6]
>>> l.append(10)    # Append 10 to the end of the list
>>> l
[3, 5, 6, 10]
>>> l.extend([30, 40])
>>> l
[3, 5, 6, 10, 30, 40]
>>> l.remove(5)     # Remove the first occurrence of 5
>>> l
[3, 6, 10, 30, 40]
>>> l.insert(2, -2) # Insert -2 at index 2
>>> l
[3, 6, -2, 10, 30, 40]
>>> l.pop()         # Remove and return the last element
40
>>> l
[3, 6, -2, 10, 30]
>>> l.pop(2)        # Remove and return the element at index 2
-2
>>> l
[3, 6, 10, 30]

Take note of two things:

The name l refers to the same list object during this entire session; it is never reassigned. The reason the output looks different each time we call a method is because the list that l evaluates to is being mutated.
The only method here that has a return value is pop! All of the other methods return None.

Here are the above method calls displayed in an environment diagram:

Dictionaries

Lists are ordered collections that allow us to access values by index. Another type of collection, called a dictionary, allows us to store and access values that correspond to given keys.

Dictionaries are unordered sets of key-value pairs. Keys can only be immutable types (e.g. strings, numbers, tuples), but their corresponding value can be anything! The following code constructs a dictionary with three entries and assigns it to the name artists. Note, you could write the same code all on one line.

>>> artists = {
...    'Ariana Grande': 'No Tears Left To Cry',
...    'Marshmello': 'FRIENDS',
...    'Migos': ['Stir Fry', 'Walk It Talk It']
... }

Each key-value pair is separated by a comma. For each pair, the key appears to the left of the colon and the value appears to the right of the colon. Similar to list values, keys/values in dictionaries do not all have to be the same type. However, unlike lists, keys must be unique; that is, a key can appear only once in a dictionary.

To retrieve values from the dictionary, use bracket notation with the corresponding key:

>>> artists['Ariana Grande']
'No Tears Left To Cry'
>>> songs = artists['Migos']
>>> songs[0]
'Stir Fry'

You can add an entry or update an entry for an existing key in the dictionary using the following syntax.

Note that the syntax for adding and for updating a key-value pair are identical, so be careful! You might end updating (and overwriting an old value) even if you intended to add, and vice versa.

>>> artists['Marshmello'] = 'Wolves'
>>> artists['Marshmello']
'Wolves'
>>> artists['Charlie Puth'] = 'Attention' # New entry!
>>> artists['Charlie Puth']
'Attention'

You can also check for membership of keys with the in function, like we can do with lists.

>>> 'Migos' in artists
True
>>> 'Wolves' in artists  # Not True for values!
False

Finally, here are some useful dictionary functions:

dict.keys() will return a sequence of keys.

>>> list(artists.keys()) # We use list() to turn the sequence into a list
['Ariana Grande', 'Marshmello', 'Migos', 'Charlie Puth']

dict.values() will return a sequence of values.

>>> list(artists.values())
['No Tears Left To Cry', 'Wolves', ['Stir Fry', 'Walk It Talk It'], 'Attention']

dict.items() will return a sequence of key-value tuples.

>>> list(artists.items())
[('Ariana Grande', 'No Tears Left To Cry'),
 ('Marshmello', 'Wolves'),
 ('Migos', ['Stir Fry', 'Walk It Talk It']),
 ('Charlie Puth', 'Attention')]

Identity vs. Equality

We have seen how to use the == operator to check if two expressions evaluate to equal values. We now introduce a new comparison operator, is, that checks whether two expressions evaluate to the same values.

Wait, what's the difference? For primitive values, there is none:

>>> 2 + 2 == 3 + 1
True
>>> 2 + 2 is 3 + 1
True

This is because all primitives have the same identity under the hood. However, with non-primitive values, such as lists, each object has its own identity. That means you can construct two objects that may look exactly the same but have different identities.

>>> lst1 = [1, 2, 3, 4]
>>> lst2 = [1, 2, 3, 4]
>>> lst1 == lst2
True
>>> lst1 is lst2
False

Here, although the lists referred to by lst1 and lst2 have equal contents, they are not the same object. In other words, they are the same in terms of equality, but not in terms of identity.

This is important in our discussion of mutability because when we mutate an object, we simply change its state, not its identity.

>>> lst1 = [1, 2, 3, 4]
>>> lst2 = lst1
>>> lst1.append(5)
>>> lst2
[1, 2, 3, 4, 5]
>>> lst1 is lst2
True

Data Abstraction

Data abstraction is a powerful concept in computer science that allows programmers to treat code as objects -- for example, car objects, chair objects, people objects, etc. That way, programmers don't have to worry about how code is implemented -- they just have to know what it does.

Data abstraction mimics how we think about the world. When you want to drive a car, you don't need to know how the engine was built or what kind of material the tires are made of. You just have to know how to turn the wheel and press the gas pedal.

An abstract data type consists of two types of functions:

Constructors: functions that build the abstract data type.
Selectors: functions that retrieve information from the data type.

Programmers design ADTs to abstract away how information is stored and calculated such that the end user does not need to know how constructors and selectors are implemented. The nature of abstract data types allows whoever uses them to assume that the functions have been written correctly and work as described.

Tree Recursion

Note: Tree Recursion is an optional topic for this lab. Do the required questions before looking at this section and the optional Tree Recursion questions.

A tree recursive function is a recursive function that makes more than one call to itself, resulting in a tree-like series of calls.

A classic example of a tree recursion function is finding the nth Fibonacci number:

def fib(n):
    if n == 0 or n == 1:
        return n
    return fib(n - 1) + fib(n - 2)

Calling fib(6) results in the following call structure (where f is fib):

Fibonacci Tree

Each f(i) node represents a recursive call to fib. Each recursive call makes another two recursive calls. f(0) and f(1) do not make any recursive calls because they are the base cases of the function. Because of these base cases, we are able to terminate the recursion and beginning accumulating the values.

Generally, tree recursion is effective when you want to explore multiple possibilities or choices at a single step. In these types of problems, you make a recursive call for each choice or for a group of choices. Here are some examples:

Given a list of paid tasks and a limited amount of time, which tasks should you choose to maximize your pay? This is actually a variation of the Knapsack problem, which focuses on finding some optimal combination of different items.
Suppose you are lost in a maze and see several different paths. How do you find your way out? This is an example of path finding, and is tree recursive because at every step, you could have multiple directions to choose from that could lead out of the maze.
Your dryer costs $2 per cycle and accepts all types of coins. How many different combinations of coins can you create to run the dryer? This is similar to the partitions problem from the textbook.

Required Questions

Mutability Practice

Q1: What Would Python Display?

What would Python display? Type it in the intepreter if you're stuck!
python3 ok -q mutability -u
Note: if nothing would be output by Python, type Nothing.

>>> lst = [5, 6, 7, 8]
>>> lst.append(6)
______
Nothing
>>> lst
______
[5, 6, 7, 8, 6]
>>> lst.insert(0, 9)
>>> lst
______
[9, 5, 6, 7, 8, 6]
>>> x = lst.pop(2)
>>> lst
______
[9, 5, 7, 8, 6]
>>> lst.remove(x)
>>> lst
______
[9, 5, 7, 8]
>>> a, b = lst, lst[:]
>>> a is lst
______
True
>>> b == lst
______
True
>>> b is lst
______
False

>>> pokemon = {'pikachu': 25, 'dragonair': 148, 'mew': 151}
>>> pokemon['pikachu']
______
25
>>> len(pokemon)
______
3
>>> pokemon['jolteon'] = 135
>>> pokemon['mew'] = 25
>>> len(pokemon)
______
4
>>> 'mewtwo' in pokemon
______
False
>>> 'pikachu' in pokemon
______
True
>>> 25 in pokemon
______
False
>>> 148 in pokemon.values()
______
True
>>> 151 in pokemon.keys()
______
False
>>> 'mew' in pokemon.keys()
______
True
>>> pokemon['ditto'] = pokemon['jolteon']
>>> pokemon['ditto']
______
135

City Data Abstraction

Say we have an abstract data type for cities. A city has a name, a latitude coordinate, and a longitude coordinate.

Our ADT has one constructor:

make_city(name, lat, lon): Creates a city object with the given name, latitude, and longitude.

We also have the following selectors in order to get the information for each city:

get_name(city): Returns the city's name
get_lat(city): Returns the city's latitude
get_lon(city): Returns the city's longitude

Here is how we would use the constructor and selectors to create cities and extract their information:

>>> berkeley = make_city('Berkeley', 122, 37)
>>> get_name(berkeley)
'Berkeley'
>>> get_lat(berkeley)
122
>>> new_york = make_city('New York City', 74, 40)
>>> get_lon(new_york)
40

All of the selector and constructor functions can be found in city.py, if you are curious to see how they are implemented. However, the point of data abstraction is that we do not need to know how an abstract data type is implemented, but rather just how we can interact with and use the data type.

Q2: Distance

We will now implement the function distance, which computes the distance between two city objects. Recall that the distance between two coordinate pairs (x1, y1) and (x2, y2) can be found by calculating the sqrt of (x1 - x2)**2 + (y1 - y2)**2. We have already imported sqrt for your convenience. Use the latitude and longitude of a city as its coordinates; you'll need to use the selectors to access this info!

from math import sqrt
def distance(city1, city2):
    """
    >>> city1 = make_city('city1', 0, 1)
    >>> city2 = make_city('city2', 0, 2)
    >>> distance(city1, city2)
    1.0
    >>> city3 = make_city('city3', 6.5, 12)
    >>> city4 = make_city('city4', 2.5, 15)
    >>> distance(city3, city4)
    5.0
    """
    "*** YOUR CODE HERE ***"
    lat_1, lon_1 = get_lat(city1), get_lon(city1)
    lat_2, lon_2 = get_lat(city2), get_lon(city2)
    return sqrt((lat_1 - lat_2)**2 + (lon_1 - lon_2)**2)

Video walkthrough: https://youtu.be/4vB06Ymrico

Use Ok to test your code:

python3 ok -q distance

Q3: Closer city

Next, implement closer_city, a function that takes a latitude, longitude, and two cities, and returns the name of the city that is relatively closer to the provided latitude and longitude.

You may only use the selectors and constructors introduced above and the distance function you just defined for this question.

Hint: How can use your distance function to find the distance between the given location and each of the given cities?

def closer_city(lat, lon, city1, city2):
    """
    Returns the name of either city1 or city2, whichever is closest to
    coordinate (lat, lon).

    >>> berkeley = make_city('Berkeley', 37.87, 112.26)
    >>> stanford = make_city('Stanford', 34.05, 118.25)
    >>> closer_city(38.33, 121.44, berkeley, stanford)
    'Stanford'
    >>> bucharest = make_city('Bucharest', 44.43, 26.10)
    >>> vienna = make_city('Vienna', 48.20, 16.37)
    >>> closer_city(41.29, 174.78, bucharest, vienna)
    'Bucharest'
    """
    "*** YOUR CODE HERE ***"
    new_city = make_city('arb', lat, lon)
    dist1 = distance(city1, new_city)
    dist2 = distance(city2, new_city)
    if dist1 < dist2:
         return get_name(city1)
    return get_name(city2)

Video walkthrough: https://youtu.be/Owwdz_Km87g

Use Ok to test your code:

python3 ok -q closer_city

Q4: Don't violate the abstraction barrier!

When writing functions that use an ADT, we should use the constructor(s) and selector(s) whenever possible instead of assuming the ADT's implementation. Relying on a data abstraction's underlying implementation is known as violating the abstraction barrier, and we never want to do this!

It's possible that you passed the doctests for distance and closer_city even if you violated the abstraction barrier. To check whether or not you did so, uncomment the following lines in your lab04.py file:

# make_city = lambda name, lat, lon: { 'name': name, 'lat': lat, 'lon': lon }
# get_name = lambda city: city['name']
# get_lat = lambda city: city['lat']
# get_lon = lambda city: city['lon']

These statements change the implementation of the city ADT. The nature of the abstraction barrier guarantees that changing the implementation of an ADT shouldn't affect the functionality of any programs that use that ADT, as long as the constructors and selectors were used properly.

Now, rerun your tests for distance and closer_city without changing any of your code:

python3 ok -q distance
python3 ok -q closer_city

If you've followed the rules and used the constructor and selectors when you should've, the doctests should still pass!

If you passed the Ok tests before uncommenting those lines but not afterward, the fix is simple! Just replace any code that violates the abstraction barrier, i.e. creating a city with a new list object or indexing into a city, with the appropriate constructor or selector.

Make sure that your functions pass the tests with both the first and the second implementations of the City ADT and that you understand why they should work for both before moving on.

Optional Questions

More Mutability Practice

Q5: Map

Write a function that takes a function and a list as inputs and maps the function on the given list - that is, it applies the function to every element of the list.

Be sure to mutate the original list. This function should not return anything.

def map(fn, lst):
    """Maps fn onto lst using mutation.
    >>> original_list = [5, -1, 2, 0]
    >>> map(lambda x: x * x, original_list)
    >>> original_list
    [25, 1, 4, 0]
    """
    "*** YOUR CODE HERE ***"
    # Iterative solution
    for i in range(len(lst)):
        lst[i] = fn(lst[i])

# Recursive solution
def map(fn, lst):
    """Maps fn onto lst using mutation.
    >>> original_list = [5, -1, 2, 0]
    >>> map(lambda x: x * x, original_list)
    >>> original_list
    [25, 1, 4, 0]
    """
    if lst:  # True when lst != []
        temp = lst.pop(0)
        map(fn, lst)
        lst.insert(0, fn(temp))

Use Ok to test your code:

python3 ok -q map

Tree Recursion Practice

Q6: Pascal's Triangle

Here's a part of the Pascal's trangle:

Every number in Pascal's triangle is defined as the sum of the item above it and the item that is directly to the upper left of it. Use 0 if the entry is empty.
Define the procedure pascal(row, column) which takes a row and a column, and finds the value at that position in the triangle. Rows and columns are zero-indexed; that is, the first row is row 0 instead of 1.

def pascal(row, column):
    """Returns a number corresponding to the value at that location
    in Pascal's Triangle.
    >>> pascal(0, 0)
    1
    >>> pascal(0, 5)	# Empty entry; outside of Pascal's Triangle
    0
    >>> pascal(3, 2)	# Row 4 (1 3 3 1), 3rd entry
    3
    """

    "*** YOUR CODE HERE ***"
    if column == 0:
    	return 1
    elif row == 0:
    	return 0
    else:
    	return pascal(row - 1, column) + pascal(row - 1, column - 1)

Use Ok to test your code:

python3 ok -q pascal

Q7: Subsequences

A subsequence of a sequence S is a sequence of elements from S, in the same order they appear in S, but possibly with elements missing. Thus, the lists [], [1, 3], [2], and [1, 2, 3] are some (but not all) of the subsequences of [1, 2, 3]. Write a function that takes a list and returns a list of lists, for which each individual list is a subsequence of the original input.

As a first step, write a function inserted_into_all that takes an item and a list of lists, adds the item to the beginning of each smaller/individual list, and returning the resulting big list.

def insert_into_all(item, nested_list):
    """Assuming that nested_list is a list of lists, return a new list
    consisting of all the lists in nested_list, but with item added to
    the front of each.

    >>> nl = [[], [1, 2], [3]]
    >>> insert_into_all(0, nl)
    [[0], [0, 1, 2], [0, 3]]
    """
    "*** YOUR CODE HERE ***"
    return [[item] + lst for lst in nested_list]
def subseqs(s):
    """Assuming that S is a list, return a nested list of all subsequences
    of S (a list of lists). The subsequences can appear in any order.

    >>> seqs = subseqs([1, 2, 3])
    >>> sorted(seqs)
    [[], [1], [1, 2], [1, 2, 3], [1, 3], [2], [2, 3], [3]]
    >>> subseqs([])
    [[]]
    """
    "*** YOUR CODE HERE ***"
    if not s:
        return [[]]
    else:
        subset = subseqs(s[1:])
        return insert_into_all(s[0], subset) + subset

Use Ok to test your code:

python3 ok -q subseqs

Lists Practice

Q8: Merge

Write a function merge that takes 2 sorted lists lst1 and lst2, and returns a new list that contains all the elements in the two lists in sorted order.

def merge(lst1, lst2):
    """Merges two sorted lists.

    >>> merge([1, 3, 5], [2, 4, 6])
    [1, 2, 3, 4, 5, 6]
    >>> merge([], [2, 4, 6])
    [2, 4, 6]
    >>> merge([1, 2, 3], [])
    [1, 2, 3]
    >>> merge([5, 7], [2, 4, 6])
    [2, 4, 5, 6, 7]
    """
    "*** YOUR CODE HERE ***"
    # recursive
    if not lst1 or not lst2:
        return lst1 + lst2
    elif lst1[0] < lst2[0]:
        return [lst1[0]] + merge(lst1[1:], lst2)
    else:
        return [lst2[0]] + merge(lst1, lst2[1:])

# Iterative solution 
def merge_iter(lst1, lst2):
    """Merges two sorted lists.

    >>> merge_iter([1, 3, 5], [2, 4, 6])
    [1, 2, 3, 4, 5, 6]
    >>> merge_iter([], [2, 4, 6])
    [2, 4, 6]
    >>> merge_iter([1, 2, 3], [])
    [1, 2, 3]
    >>> merge_iter([5, 7], [2, 4, 6])
    [2, 4, 5, 6, 7]
    """
    new = []
    while lst1 and lst2:
        if lst1[0] < lst2[0]:
            new += [lst1[0]]
            lst1 = lst1[1:]
        else:
            new += [lst2[0]]
            lst2 = lst2[1:]
    if lst1:
        return new + lst1
    else:
        return new + lst2

Use Ok to test your code:

python3 ok -q merge

Q9: Mergesort

Mergesort is a type of sorting algorithm. It follows a naturally recursive procedure:

Break the input list into equally-sized halves
Recursively sort both halves
Merge the sorted halves.

Using your merge function from the previous question, implement mergesort.

Challenge: Implement mergesort itself iteratively, without using recursion.

def mergesort(seq):
    """Mergesort algorithm.

    >>> mergesort([4, 2, 5, 2, 1])
    [1, 2, 2, 4, 5]
    >>> mergesort([])     # sorting an empty list
    []
    >>> mergesort([1])   # sorting a one-element list
    [1]
    """
    "*** YOUR CODE HERE ***"
    # recursive solution
    if len(seq) < 2:
        return seq
    mid = len(seq) // 2
    return merge(mergesort(seq[:mid]), mergesort(seq[mid:]))

# Iterative solution
def mergesort_iter(seq):
    """Mergesort algorithm.

    >>> mergesort_iter([4, 2, 5, 2, 1])
    [1, 2, 2, 4, 5]
    >>> mergesort_iter([])     # sorting an empty list
    []
    >>> mergesort_iter([1])   # sorting a one-element list
    [1]
    """
    if not seq:
        return []
    queue = [[elem] for elem in seq]
    while len(queue) > 1:
        first, second = queue[0], queue[1]
        queue = queue[2:] + [merge(first, second)]
    return queue[0]

Use Ok to test your code:

python3 ok -q mergesort