CS61A Lab 9: Mutable Lists and Dictionaries

Week 5A, 2012

Debugging practice

Whenever Python encouters a bug in your code, it will print a list of error messages. This printed statement is known as a traceback. A traceback might look scary to the untrained eye, but it is actually a very useful debugging tool!

A sample traceback looks like this:

Traceback (most recent call last):
  File "bug1.py", line 22, in func
      g = bug()
  File "bug2.py", line 3, in bug
      buggy = buggy + 1
NameError: name 'buggy' is not defined

A traceback displays all the function calls that led up to the error. Notice that the message tells you that the "most recent call" is displayed "last", at the bottom of the message.

You may have noticed that the traceback is organized in "couplets." Each couplet is composed of two lines: the location; and the actual code

The location is the first line in the couplet:

  File "file name", line number, in function name

This line provides you with three bits of information:

file name: the name of the file where this function call is located
line number: the line number in the file where the error is located
function name: the name of the function that houses the error

From this line alone, you should be able to locate the line where the error occurs, in case you need reference it in more detail.

The second line in the couplet is the code line:

    line of code here

This line allows you to see the piece of code that triggered the error just by looking at the traceback. That way, you can quickly reference the code and guess why the error occured.

The very last line in a traceback is the error type:

    Error type: Error message

The error type is usually named to be descriptive (such as SyntaxError or IndexError). The error message is a more detailed (but still brief) description of the error.

Now that you know how to read a traceback, you are well-equipped to handle any bugs that come your way. The following are some common bugs that occur frequently (many of you might have seen them before). Try to debug the code!

You can copy the starter code (for the debugging and list portions of the lab) into your current directory to produce the traceback.

cp ~cs61a/lib/lab/lab09/lab9.py .

Feel free to do so, but copying and pasting from a webpage doesn't always yield the results you would expect!

def one(mississippi):
    """
    >>> one(1)
    1
    >>> one(2)
    4
    >>> one(3)
    6
    """
    if mississippi == 1:
        return misssissippi
    elif mississippi == 2:
        return 2 + mississippi
    elif mississippi = 3:
        return 3 + mississippi

def two(tup):
    """
    >>> two((1, 2, 3, 4))
    ((((1, 2), 1), (2, 3), 2), ((3, 4), 1))
    """
    first = (tup[0], tup[1])
    second = (tup[1], tup[2])
    third = (tup[2], tup[3])
    return (((first, tup[0]), second, tup[1]), (third, (tup[0]))))

def three(word):
    """
    >>> three('hello world!')
    'hello  w o r l d '
    """
    i = 0
    while len(word) > i:
        new = new + word[i]
        if len(new) > len(word) // 2:
            new = new + ' '
    return new

def four(seq):
    """
    >>> four((1, 2, 3, 4, 5))
    [1, 4, 9, 16, 25]
    """
    newseq = []
    for i in seq:
        newseq[i] = seq[i]**2
    return newseq

def one(mississippi):
    if mississippi == 1:
        return misssissippi           # extra 's' in mississippi
    elif mississippi == 2:
        return 2 + mississippi
    elif mississippi = 3:              # should be ==, not =
        return 3 + mississippi

def two(tup):
    first = (tup[0], tup[1])
    second = (tup[1], tup[2])
    third = (tup[2], tup[3])
    return (((first, tup[0]), second, tup[1]), (third, (tup[0]))))
                                      # extra parenthesis at end ^

def three(word):
    i = 0
    while len(word) > i:              # forgot to increment i
        new = new + word[i]           # forgot to initialize new
        if len(new) > len(word) // 2:
            new = new + ' '
    return new

def four(seq):
    newseq = []
    for i in seq:                     # i is an element in seq, not the index
        newseq[i] = seq[i]**2         # should be appending to newseq
    return newseq

Finally, you should almost always run the doctests we provide you from the terminal, instead of copying and pasting the tests into an interpreter. This avoids typing mistakes, is a lot faster, and achieves the same result.

Intro to lists

Previously, we had dealt with tuples, which are immutable sequences. Python has builtin Lists that are mutable. This means you can modify lists without creating entirely new ones. Lists have state, unlike tuples.

Just like with tuples, you can use slicing notation with lists. In addition, not only can retrieve a slice from a list, you can also assign to a slice of a list. This is possible because lists are mutable.

Question 1.

What does Python print? Think about these before typing it into an interpreter!

>>> lst = [1, 2, 3, 4, 5, 6]
>>> lst[4] = 1
>>> lst
_________
>>> lst[2:4] = [9, 8]
>>> lst
_________
>>> lst[3] = ['hi', 'bye']
>>> lst
_________
>>> lst[3:] = ['jom', 'magrotker']
>>> lst
_________
>>> lst[1:3] = [2, 3, 4, 5, 6, 7, 8]
>>> lst
_________
>>> lst == lst[:]
_________
>>> lst is lst[:]
_________
>>> a = lst[:]
>>> a[0] = 'oogly'
>>> lst
_________
>>> lst = [1, 2, 3, 4]
>>> b = ['foo', 'bar']
>>> lst[0] = b
>>> lst
_________
>>> b[1] = 'ply'
>>> lst
_________
>>> b = ['farply', 'garply']
>>> lst
_________
>>> lst[0] = lst
>>> lst
_________
>>> lst[0][0][0][0][0]
_________

1)  [1, 2, 3, 4, 1, 6]
2)  [1, 2, 9, 8, 1, 6]
3)  [1, 2, 9, ['hi', 'bye'], 1, 6]
4)  [1, 2, 9, 'jom', 'magrotker']
5)  [1, 2, 3, 4, 5, 6, 7, 8, 'jom', 'magrotker']
6)  True
7)  False
8)  [1, 2, 3, 4, 5, 6, 7, 8, 'jom', 'magrotker']
9)  [['foo', 'bar'], 2, 3, 4]
10) [['foo', 'ply'], 2, 3, 4]
11) [['foo', 'ply'], 2, 3, 4]
12) [[...], 2, 3, 4]
13) [[...], 2, 3, 4]

List methods

Python has a list class that contains many useful methods. Using the builtin dir() function will show you all of them, like so:

dir(list)

Some of the most common methods include append(), extend(), and pop().

>>> l = [3, 5, 6]
>>> l.append(10) # adds an element to the end
>>> l
[3, 5, 6, 10]
>>> l.extend([-1, -6]) # concatenates another list to the end
>>> l
[3, 5, 6, 10, -1, -6]
>>> l.pop() # removes and returns the last element
-6
>>> l
[3, 5, 6, 10, -1]
>>> l.pop(2) # removes and returns the element at the index given
6
>>> l
[3, 5, 10, -1]

Try to solve the following list problems with mutation. This means that each function should mutate the original list. In other words:

>>> original_list = [5, -1, 29, 0]
>>> function(original_list) # doesn't return anything
>>> original_list
# mutated list here

Prioritize solving these problems with iteration, but for extra practice, also solve them using recursion. Remember: these functions should NOT return anything. This is to emphasize that these functions should utilize mutability.

Question 2.

Write a function that reverses the given list.

def reverse(lst):
    """Reverses lst using mutation.
    >>> original_list = [5, -1, 29, 0]
    >>> reverse(original_list)
    >>> original_list
    [0, 29, -1, 5]
    """

Iterative---
def reverse(lst):
    midpoint = len(lst) // 2
    last = len(lst) - 1
    for i in range(midpoint):
        lst[i], lst[last - i] = lst[last - i], lst[i]
                
Recursive1---
def reverse(lst):
    if len(lst) > 1:
        temp = lst.pop()
        reverse(lst)
        lst.insert(0, temp)

Recursive2---
def reverse(lst):
    midpoint = len(lst) // 2
    last = len(lst) - 1
    def helper(i):
        if i == midpoint:
            return
        lst[i], lst[last - i] = lst[last - i], lst[i]
        helper(i + 1)
    helper(0)

Question 3.

Write a function that maps a function on the given list.

def map(fn, lst):
    """Maps fn onto lst using mutation.
    >>> original_list = [5, -1, 2, 0]
    >>> map(lambda x: x * x, original_list)
    >>> original_list
    [25, 1, 4, 0]
    """

Iterative---
def map(fn, lst):
    for i in range(len(lst)):
        lst[i] = fn(lst[i])
        
Recursive---
def map(fn, lst):
    if lst: # True when lst != []
        temp = lst.pop(0)
        map(fn, lst)
        lst.insert(0, fn(temp))

Question 4.

Write a function that filters a list, only keeping elements that satisfy the predicate.

def filter(pred, lst):
    """Filters lst with pred using mutation.
    >>> original_list = [5, -1, 2, 0]
    >>> filter(lambda x: x % 2 == 0, original_list)
    >>> original_list
    [2, 0]
    """

Iterative---
def filter(pred, lst):
    i = len(lst) - 1
    while i >= 0:
        if not pred(lst[i]):
            lst.pop(i)
        i -= 1
        
Recursive---
def filter(pred, lst):
    if lst: 
        temp = lst.pop(0)
        filter(pred, lst)
        if pred(temp):
            lst.insert(0, temp)

List Comprehensions

Just as with tuples, you can use generator expressions with lists. For example,

>>> [i**2 for i in (1, 2, 3, 4) if i%2 == 0]
[4, 16]

is equivalent to

>>> lst = []
>>> for i in (1, 2, 3, 4):
...     if i % 2 == 0:
...         lst.append(i**2)
>>> lst
[4, 16]

List comprehensions allow you to apply map and filter at the same time, in very compact syntax. The general syntax for a list comprehension is

[expression for element in sequence if conditional]

The syntax is designed to read like English: "Compute the expression for each element in the sequence if the conditional is true."

Question 5.

Implement a function coords, which takes a funciton, a sequence, and an upper and lower bound on output of the function. coords then returns a list of x, y coordinate pairs (tuples) such that:

Each pair contains (x, fn(x))
The x coordinates can only be elements in the sequence
Only pairs whose y coordinate is within the upper and lower bounds are included

See the doctest if you are still confused.

One other thing: your answer can only be one line long. You should make use of list comprehensions!

def coords(fn, seq, lower, upper):
    """
    >>> seq = (-4, -2, 0, 1, 3)
    >>> fn = lambda x: x**2
    >>> coords(fn, seq, 1, 9)
    [(-2, 4), (1, 1), (3, 9)]
    """ 
    return _______________

def coords(fn, seq, lower, upper):
    return [(x, fn(x)) for x in seq if lower <= fn(x) <= upper]

Question 6.

(Reinforcement - More Challenging) Recall the capitalize function from Homework 5: given a list of strings (which are words), capitalize each string if the preceding string ends in a period ('.'). Implement the function, but this time use list comprehensions.

Once again, your answer should be one line long, and should utilize list comprehensions (do NOT use the code in the homework 5 solution).

def capitalize(L):
    """
    >>> L = "hi there. my name is jom. what is your name?".split()
    >>> capitalize(L)
    ['Hi', 'there.', 'My', 'name', 'is', 'jom.', 'What', 'is', 'your', 'name?']
    >>> " ".join(capitalize(L))
    'Hi there. My name is jom. What is your name?'
    """    
    return ________________

def capitalize(L):
    return [word.capitalize() if word is L[0] or L[i-1].endswith('.') else word for i, word in enumerate(L)]

Dictionaries and Shakespeare

Next, let's talk about dictionaries. Builtin Python dictionaries are simply unordered sets of key-value pairs. To create a dictionary, use the following syntax:

>>> names = {'Tom': 'Magrino', 'Jon': 'Kotker', 'Neil': ('Patrick', 'Harris')}

The curly braces denote the key-value pairs in your dictionary. Each key-value pair is separated by a comma, and for each pair, the key appears to the left of the colon and the value appears to the right of the colon. You can retrieve values from your dictionary by using the key:

>>> names['Tom']
'Magrino'
>>> names['Jon']
'Kotker'
>>> names['Neil']
('Patrick', 'Harris')

To modify the entry for an existing key in the dictionary, use the following syntax. Adding a new key follows the identical syntax!

>>> names['Tom'] = 'Hardy'
>>> names['Tom']
'Hardy'
>>> names['Neil'] = names['Jon']
>>> names['Neil']
'Kotker'

Question 7.

Now that you know how dictionaries work, we can move on to approximating the entire works of Shakespeare! We're going to use a bigram language model. Here's the idea: we start with some word - we'll use "The" as an example. Then we look through all of the texts of Shakespeare and for every instance of "The" we record the word that follows "The" and add it to a list. Then we randomly choose a word from this list, say "cat", and repeat the process. This eventually will terminate in a period (".") and we will have generated a Shakespearean sentence!

The object that we'll be looking things up in is called a "successor table," although it's really just a dictionary. The keys in this dictionary are words, and the values are lists of successors to those words.

(A copy of the framework code is locaed in ~cs61a/lib/shakespeare.py)

Here's an incomplete definition of the build_successors_table function. The input is a list of words (corresponding to a text), and the output is a successors table. (By default, the first word is a successor to '.') See this example:

def build_successors_table(tokens):
    table = {}
    prev = '.'
    for word in tokens:
        if prev in table:
            "*** YOUR CODE HERE ***"
        else:
            "*** YOUR CODE HERE ***"
        prev = word
    return table

>>> text = 'The', 'cat', 'and', 'the', 'dog', 'both', 'ate', 'and', 'ran', 'outside', '.'
>>> table = build_sucessors_table(text)
>>> table
{'and': ['the', 'ran'], 'both': ['ate'], 'ate': ['and'], 'ran': ['outside'], 'dog': ['both'], 'cat': ['and'], 'outside': ['.'], 'The': ['cat'], '.': ['The'], 'the': ['dog']}

def build_successors_table(tokens):
    """Return a dictionary: keys are words; values are tuples of successors."""
    table = {}
    prev = '.'
    for word in tokens:
        if word in table:
            table[prev] += [word]
        else:
            table[prev] = [word]
        prev = word
    return table

Question 8.

Let's generate some sentences! Suppose we're given a starting word. We can look up this word in our table to find its list of successors, and then randomly select a word from this list to be the next word in the sentence. Then we just repeat until we reach some ending punctuation. (Note: to randomly select from a list, first make sure you import the Python random library with import random and then use the expression random.choice(my_list)). Here's a definition of construct_sent to get you started.

def construct_sent(word, table):
    import random
    result = ''
    while word not in ['.', '!', '?']:
        "*** YOUR CODE HERE ***"
    return result + word

def construct_sent(word, table):
    """Prints a random sentence starting with word, sampling from table"""
    import random
    result = ' '
    while word not in ['.', '!', '?']:
        successor_wds = table[word]
        next_wd = random.choice(successor_wds)
        result = result + ' ' + word
        word = next_wd
    return result + word

Question 9.

Great! Now all that's left is to run our functions with some actual code. The following snippet will return a list containing the words in all the works of Shakespeare. WARNING: do NOT try to print the return result of this function):

def shakespeare_tokens(path='shakespeare.txt', url='http://inst.eecs.berkeley.edu/~cs61a/lib/shakespeare.txt'):
    """Return the words of Shakespeare's plays as a list"""
    import os
    from urllib.request import urlopen
    if os.path.exists(path):
        return open(path, encoding='ascii').read().split()
    else:
        shakespeare = urlopen(url)
        return shakespeare.read().decode(encoding='ascii').split()[:2000]

Next, we probably want an easy way to refer to our list of tokens and our successors able. Let's make the following assignments:

>>> tokens = shakespeare_tokens()
>>> table = build_successors_table(tokens)

Finally, let's define an easy to call utility function and use it to generate Shakespeare sentences:

>>> def sent()
...     return construct_sent('The', table)
>>> sent()
" The plebeians have done us must be news-cramm'd"
>>> sent()
" The bird of Tunis , or two white and plucker down with better ; that's God's sake"

Huzzah!