Lab 7: Recursive Objects

lab07.zip

Due at 11:59pm on Friday, 03/15/2019.

Starter Files

Download lab07.zip. Inside the archive, you will find starter files for the questions in this lab, along with a copy of the Ok autograder.

Submission

By the end of this lab, you should have submitted the lab with python3 ok --submit. You may submit more than once before the deadline; only the final submission will be graded. Check that you have successfully submitted your code on okpy.org.

To receive credit for this lab, you must complete Questions 2 through 6 in lab07.py and submit through OK.
Questions 7 through 8 are extra practice. They can be found in the lab07_extra.py file. It is recommended that you complete these problems on your own time.

Topics

Repr and Str

Created by Xingyu Jin on 03/09/2019

Introduction

In the previous lecture, we are introduced to two ways of getting a string representation of an object: str(...) and repr(...).

As in this note, we will explore the difference of these two methods, how to implement them, and why we need them.

Before we jump into analyzing the detailed difference of the two methods, let's take a look at some examples that we commonly see in python, which use the two methods above.

>>> a = 61
>>> print(a)
61
>>> print(str(a))
61
>>> print(repr(a))
61
>>> s = 'Hello CS 61A!'
>>> print(s)
Hello CS 61A!
>>> print(str(s))
Hello CS 61A!
>>> print(repr(s))
'Hello CS 61A!'

As we can see in the example above, the return values of repr() and str() are identical for an integer, but there's a difference between the return values for a string. In the following section, we will illustrate the difference between the methods with more details.

An Overview Table

	`str()`	`repr()`
Nick-Name	"informal" representation	"official" representation
Goal	Be readable	Be unambiguous
Usage	Get a quick & human-friendly overview of complicated objects	Get the most comprehensive information
Common example	When we call `print(...)`	When we directly return and display a variable in the terminal
`eval()`	Not executable by `eval()`	Can be called as an input to `eval()`. For most object types, `eval(repr(object)) == object`
When to implement	Implement it when readability is important	Should always implement `__repr__`

Example With Doggies! :)

As we can see in the table above, we want to use str() when readability is really important and use repr() when we want to achieve a complete information about an object. To get a better understanding of the two method, it is really important to try them out with solid examples.

Here as an example, let's say we are defining the following Dog class:

class Dog:
    def __init__(self, color, weight):
        self.color = color
        self.weight = weight

puppy = Dog('white', 10)

In the code section above, we created an object of Dog and stored it in a variable puppy. As we may know from the input, we know this puppy is white and weighs 10kg. However, when we try to print or directly display in terminal, Python will give us a quite confusing representation of the cute puppy:

>>> print(puppy)
<__main__.Dog object at 0x7fe36e9472e8>
>>> puppy
<__main__.Dog object at 0x7fe36e9472e8>

By default all you get is a string containing the class name and the ID of the object instance (which is the object’s memory address - don't worry about the details).

As human beings, the representation above is really hard to understand and does not give us much useful information. However, it is what/how the object we created actually stored in the computer, and we can easily tell at at a glance whether or not two things are the same object. So this is better than nothing, but it’s also not very useful for our purpose.

When we are trying to print the object, we want to know some of its attributes (color/weight). Therefore, it is common that we may want to do the following printing statement instead:

>>> print(puppy.color, puppy.weight)
white 10

The underlying idea here is the right and useful one. However, instead of doing the print statement above over and over again, we can take use of conventions and built-in mechanisms Python uses to handle how objects are represented as strings: __str__(self).

How to generalize the idea above?

As we noticed above, we do not want to call the complicated print method over and over to get a more readable representation. Rather, we want to gain useful and readable information once we call print(puppy). In order to do so, we’re going to add a __str__(self) method to the Dog class we defined earlier:

class Dog:
    def __init__(self, color, weight):
        self.color = color
        self.weight = weight

    def __str__(self):
        return 'This is a " + self.color + " dog weighing " + self.weight + " kgs.'

puppy = Dog('white', 10)

>>> print(puppy)
This is a white dog weighing 10 kgs.

Yeah! After defining the __str__(self) method in our Dog class, now when we can print() on a Dog object, it will give us more human-friendly informations. However, will this change the situation where we directly call the variable and display it in terminal?

>>> puppy
>>> <__main__.Dog at 0x7fe36e947da0>

As we see, adding __str__(self) method does not change the representation of calling the object directly. That's because in the previous section, we mentioned that calling a variable directly in terminal will actually evoke the __repr__(self) method. Since we did not give Dog a __repr__(self) method, it will still run the default __repr__(self) method, which display its class name and ID.

Here is a simple and interesting experiement that we can use to get a acutal feeling of when __str__(self) or __repr__(self) will be used.

class Dog:
    def __init__(self, color, weight):
        self.color = color
        self.weight = weight
    def __str__(self):
        return 'Using __str__'
    def __repr__(self):
        return 'Using __repr__'

puppy = Dog('white', 10)

Now, we can try different experiments on this. Feel free to add your one experiement cases for better understanding!

>>> print(puppy)
Using __str__
>>> puppy
Using __repr__
>>> new_s = 'I have a ' + str(puppy)
>>> print(new_s)
I have a Using __str__
>>> new_s
'I have a Using __str__'
>>> new_s = 'I have a ' + puppy
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: must be str, not Dog

The experiments above experiment confirmed most of the conclusions we made in the first part of this notebook.

Interestingly, containers like lists and dicts always use the result of __repr__ to represent the objects they contain. Even if you call str() on the container itself:
>>> puppy1 = Dog('red', 10)
>>> puppy2 = Dog('yellow', 20)
>>> puppy3 = Dog('blue', 30)
>>>
>>> puppyList = [puppy1, puppy2, puppy3]
>>>
>>> puppyList
[Using __repr__, Using __repr__, Using __repr__]
>>> str(puppyList)
'[Using __repr__, Using __repr__, Using __repr__]'

Also, we can manually choose between the __str__(self)__ representation and __repr__(self) representation by simply using the built-in str(...) and repr(...) functions. Since this looks much cleaner than the class method and will output the same result, it is preferable to calling self.__str__() or self.__repr__().

>>> puppy.__str__()
'Using __str__'
>>> str(puppy)
'Using __str__'
>>> puppy.__repr__()
'Using __repr__'
>>> repr(puppy)
'Using __repr__'

Goal of Representation

`str()`

We usually use str() for creating outputs for us/users to read. For example, when we call print(...) on any input, it will invoke the class method __str__(self) and display the string representation of the object, which is typically human-friendly. In other word, the __str__(self) method should be readable.

`repr()`

On the other hand, we usually use repr() to compute the official representation of an object. Usually, such representation will contains all information about the object and is mainly used for debugging and development (mainly for developers). The repr() build-in function will invoke the class method __repr__(self). Typically, when we try to directly display a variable and type it in terminal, the terminal will display us with a repr() representation of the object. In other word, the __repr__(self) method should be unambiguous.

Example

Now, let's take a look at another example with a built-in python class datetime. ⁠

>>> import datetime
>>> today = datetime.datetime(2019, 3, 19, 20, 10)
>>> print(str(today))
2019-03-19 20:10:00
>>> print(repr(today))   
datetime.datetime(2019, 3, 19, 20, 10)

As we can see in the example above, the result of calling str(...) is meant to return a concise textual representation for human, something you’d feel comfortable reading or displaying to a user.

On the other hand, the result of calling repr(...) is more intended to use for developers' purposes and thus it needs to be as clear and explicit as possible on illustrating what the object actually is. As we can see, it not only contains the class name, but also contains which module/package it is in.

Summary

Now let's bring up everything together. In this notebook, we explored the difference between str(...) and repr(...). Notably, str(...) aims at making object more readable, and generate output for users. On the other hand, repr(...) aims at representing the object in an unambiguous way, and generate output for developers.

Linked Lists

We've learned that a Python list is one way to store sequential values. Another type of list is a linked list. A Python list stores all of its elements in a single object, and each element can be accessed by using its index. A linked list, on the other hand, is a recursive object that only stores two things: its first value and a reference to the rest of the list, which is another linked list.

We can implement a class, Link, that represents a linked list object. Each instance of Link has two instance attributes, first and rest.

class Link:
    """A linked list.

    >>> s = Link(1)
    >>> s.first
    1
    >>> s.rest is Link.empty
    True
    >>> s = Link(2, Link(3, Link(4)))
    >>> s.first = 5
    >>> s.rest.first = 6
    >>> s.rest.rest = Link.empty
    >>> s                                    # Displays the contents of repr(s)
    Link(5, Link(6))
    >>> s.rest = Link(7, Link(Link(8, Link(9))))
    >>> s
    Link(5, Link(7, Link(Link(8, Link(9)))))
    >>> print(s)                             # Prints str(s)
    <5 7 <8 9>>
    """
    empty = ()

    def __init__(self, first, rest=empty):
        assert rest is Link.empty or isinstance(rest, Link)
        self.first = first
        self.rest = rest

    def __repr__(self):
        if self.rest is not Link.empty:
            rest_repr = ', ' + repr(self.rest)
        else:
            rest_repr = ''
        return 'Link(' + repr(self.first) + rest_repr + ')'

    def __str__(self):
        string = '<'
        while self.rest is not Link.empty:
            string += str(self.first) + ' '
            self = self.rest
        return string + str(self.first) + '>'

A valid linked list can be one of the following:

An empty linked list (Link.empty)
A Link object containing the first value of the linked list and a reference to the rest of the linked list

What makes a linked list recursive is that the rest attribute of a single Link instance is another linked list! In the big picture, each Link instance stores a single value of the list. When multiple Links are linked together through each instance's rest attribute, an entire sequence is formed.

Note: This definition means that the rest attribute of any Link instance must be either Link.empty or another Link instance! This is enforced in Link.__init__, which raises an AssertionError if the value passed in for rest is neither of these things.

To check if a linked list is empty, compare it against the class attribute Link.empty. For example, the function below prints out whether or not the link it is handed is empty:

def test_empty(link):
    if link is Link.empty:
        print('This linked list is empty!')
    else:
        print('This linked list is not empty!')

Motivation: Why linked lists

Since you are already familiar with Python's built-in lists, you might be wondering why we are teaching you another list representation. There are historical reasons, along with practical reasons. Later in the course, you'll be programming in Scheme, which is a programming language that uses linked lists for almost everything.

For now, let's compare linked lists and Python lists by looking at two common sequence operations: inserting an item and indexing.

Python's built-in list is like a sequence of containers with indices on them:

arraylist

Linked lists are a list of items pointing to their neighbors. Notice that there's no explicit index for each item.

linkedlist

Suppose we want to add an item at the head of the list.

With Python's built-in list, if you want to put an item into the container labeled with index 0, you must move all the items in the list into its neighbor containers to make room for the first item;

arraylist

With a linked list, you tell Python that the neighbor of the new item is the old beginning of the list.

arraylist

We can compare the speed of this operation by timing how long it takes to insert a large number of items to the beginning of both types of lists. Enter the following command in your terminal to test this:

python3 timing.py insert 100000

Now, let's take a look at indexing. Say we want the item at index 3 from a list.

In the built-in list, you can use Python list indexing, e.g. lst[3], to easily get the item at index 3.
In the linked list, you need to start at the first item and repeatedly follow the rest attribute, e.g. link.rest.rest.first. How does this scale if the index you were trying to access was very large?

To test this, enter the following command in your terminal

python3 timing.py index 10000

This program compares the speed of randomly accessing 10,000 items from both a linked list and a built-in Python list (each with length 10,000).

What conclusions can you draw from these tests? Can you think of situations where you would want to use one type of list over another? In this class, we aren't too worried about performance. However, in future computer science courses, you'll learn how to make performance tradeoffs in your programs by choosing your data structures carefully.

Trees (Again)

Recall that a tree is a recursive abstract data type that has a label (the value stored in the root of the tree) and branches (a list of trees directly underneath the root).

We saw one way to implement the tree ADT -- using constructor and selector functions that treat trees as lists. Another, more formal, way to implement the tree ADT is with a class. Here is part of the class definition for Tree, which can be found in lab07.py:

class Tree:
    """
    >>> t = Tree(3, [Tree(2, [Tree(5)]), Tree(4)])
    >>> t.label
    3
    >>> t.branches[0].label
    2
    >>> t.branches[1].is_leaf()
    True
    """
    def __init__(self, label, branches=[]):
        for b in branches:
            assert isinstance(b, Tree)
        self.label = label
        self.branches = list(branches)

    def is_leaf(self):
        return not self.branches

Even though this is a new implementation, everything we know about the tree ADT remains true. That means that solving problems involving trees as objects uses the same techniques that we developed when first studying the tree ADT (e.g. we can still use recursion on the branches!). The main difference, aside from syntax, is that tree objects are mutable.

Here is a summary of the differences between the tree ADT implemented using functions and lists vs. implemented using a class:

-	Tree constructor and selector functions	Tree class
Constructing a tree	To construct a tree given a `label` and a list of `branches`, we call `tree(label, branches)`	To construct a tree object given a `label` and a list of `branches`, we call `Tree(label, branches)` (which calls the `Tree.__init__` method)
Label and branches	To get the label or branches of a tree `t`, we call `label(t)` or `branches(t)` respectively	To get the label or branches of a tree `t`, we access the instance attributes `t.label` or `t.branches` respectively
Mutability	The tree ADT is immutable because we cannot assign values to call expressions	The `label` and `branches` attributes of a `Tree` instance can be reassigned, mutating the tree
Checking if a tree is a leaf	To check whether a tree `t` is a leaf, we call the convenience function `is_leaf(t)`	To check whether a tree `t` is a leaf, we call the bound method `t.is_leaf()`. This method can only be called on `Tree` objects.

Check-Off

Q1: Junk

What happens in the following code? Sign up for checkoffs in your lab if you'd like to get credit for this week's checkoff.

>>> class VendingMachine:
...    v = 0
...    def __init__(self):
...        self.soda = JunkDrink(self)
...
>>> class JunkDrink:
...    v = 0
...    def __init__(self, machine):
...        self.machine = machine
...        self.machine.v = self.machine.v + 1
...        machine.v = machine.v + 1
...        self.v = self.v + 1
...

>>> a = VendingMachine()
>>> a.v
______
2
>>> JunkDrink.v
______
0
>>> a.soda.v
______
1
>>> x = VendingMachine.__init__(a)
>>> x
______
None
>>> a.v
______
4
>>> JunkDrink.v
______
0
>>> a.soda.v
______
1

Required Questions

What Would Python Display?

Q2: WWPD: Linked Lists

Read over the Link class in lab07.py. Make sure you understand the doctests.

Use Ok to test your knowledge with the following "What Would Python Display?" questions:
python3 ok -q link -u
Enter Function if you believe the answer is <function ...>, Error if it errors, and Nothing if nothing is displayed.

If you get stuck, try drawing out the box-and-pointer diagram for the linked list on a piece of paper or loading the Link class into the interpreter with python3 -i lab07.py.

>>> from lab07 import *
>>> link = Link(1000)
>>> link.first
______
1000
>>> link.rest is Link.empty
______
True
>>> link = Link(1000, 2000)
______
AssertionError
>>> link = Link(1000, Link())
______
TypeError

>>> from lab07 import *
>>> link = Link(1, Link(2, Link(3)))
>>> link.first
______
1
>>> link.rest.first
______
2
>>> link.rest.rest.rest is Link.empty
______
True
>>> link.first = 9001
>>> link.first
______
9001
>>> link.rest = link.rest.rest
>>> link.rest.first
______
3
>>> link = Link(1)
>>> link.rest = link
>>> link.rest.rest.rest.rest.first
______
1
>>> link = Link(2, Link(3, Link(4)))
>>> link2 = Link(1, link)
>>> link2.first
______
1
>>> link2.rest.first
______
2

>>> from lab07 import *
>>> link = Link(5, Link(6, Link(7)))
>>> link                  # Look at the __repr__ method of Link
______
Link(5, Link(6, Link(7)))
>>> print(link)          # Look at the __str__ method of Link
______
<5 6 7>

Q3: WWPD: repr and str

Use Ok to test your knowledge with the following "What Would Python Display?" questions:
python3 ok -q repr_str -u
If you get stuck, try running the examples on the interpreter.

>>> print("hi")
______
hi
>>> "hi"
______
'hi'
>>> print(repr("hi"))
______
'hi'
>>> repr("hi")
______
"'hi'"

>>> class A:
...     def __init__(self, x):
...         self.x = x
...     def __repr__(self):
...         return self.x
>>> class B(A):
...     def __str__(self):
...         return self.x + self.x
>>> A("hi")
______
hi
>>> print(A("hi"))
______
hi
>>> B("hi")
______
hi
>>> print(B("hi"))
______
hihi

>>> class C:
...     def __str__(self):
...         print('hi')
...         return 'hihi'
...     def __repr__(self):
...         print('hihihi')
...         return 'hihihihi'
>>> C()
______
hihihi
hihihihi
>>> print(C())
______
hi
hihi
>>> q = str(C())
______
hi
>>> q
______
'hihi'
>>> r = repr(C())
______
hihihi
>>> r
______
'hihihihi'

Coding Practice

Q4: Link to List

Write a function link_to_list that takes in a linked list and returns the sequence as a Python list. You may assume that the input list is shallow; none of the elements is another linked list.

Try to find both an iterative and recursive solution for this problem!

def link_to_list(link):
    """Takes a linked list and returns a Python list with the same elements.

    >>> link = Link(1, Link(2, Link(3, Link(4))))
    >>> link_to_list(link)
    [1, 2, 3, 4]
    >>> link_to_list(Link.empty)
    []
    """
    "*** YOUR CODE HERE ***"
    # Recursive solution
    if link is Link.empty:
        return []
    return [link.first] + link_to_list(link.rest)

# Iterative solution
def link_to_list(link):
    result = []
    while link is not Link.empty:
        result.append(link.first)
        link = link.rest
    return result

    # Video Walkthrough: https://youtu.be/hdO9Ry8d5FU?t=25m6s

Use Ok to test your code:

python3 ok -q link_to_list

Q5: Store Digits

Write a function store_digits that takes in an integer n and returns a linked list where each element of the list is a digit of n.

def store_digits(n):
    """Stores the digits of a positive number n in a linked list.

    >>> s = store_digits(1)
    >>> s
    Link(1)
    >>> store_digits(2345)
    Link(2, Link(3, Link(4, Link(5))))
    >>> store_digits(876)
    Link(8, Link(7, Link(6)))
    """
    "*** YOUR CODE HERE ***"
    result = Link.empty
    while n > 0:
        result = Link(n % 10, result)
        n //= 10
    return result

Use Ok to test your code:

python3 ok -q store_digits

Q6: Cumulative Sum

Write a function cumulative_sum that mutates the Tree t so that each node's label becomes the sum of all labels in the subtree rooted at the node.

def cumulative_sum(t):
    """Mutates t so that each node's label becomes the sum of all labels in
    the corresponding subtree rooted at t.

    >>> t = Tree(1, [Tree(3, [Tree(5)]), Tree(7)])
    >>> cumulative_sum(t)
    >>> t
    Tree(16, [Tree(8, [Tree(5)]), Tree(7)])
    """
    "*** YOUR CODE HERE ***"
    for b in t.branches:
        cumulative_sum(b)
    t.label = sum([b.label for b in t.branches]) + t.label

Use Ok to test your code:

python3 ok -q cumulative_sum

Optional Questions

The following questions are for extra practice -- they can be found in the the lab07_extra.py file. It is recommended that you complete these problems on your own time.

Q7: Cycles

The Link class can represent lists with cycles. That is, a list may contain itself as a sublist.

>>> s = Link(1, Link(2, Link(3)))
>>> s.rest.rest.rest = s
>>> s.rest.rest.rest.rest.rest.first
3

Implement has_cycle,that returns whether its argument, a Link instance, contains a cycle.

Hint: Iterate through the linked list and try keeping track of which Link objects you've already seen.

def has_cycle(link):
    """Return whether link contains a cycle.

    >>> s = Link(1, Link(2, Link(3)))
    >>> s.rest.rest.rest = s
    >>> has_cycle(s)
    True
    >>> t = Link(1, Link(2, Link(3)))
    >>> has_cycle(t)
    False
    >>> u = Link(2, Link(2, Link(2)))
    >>> has_cycle(u)
    False
    """
    "*** YOUR CODE HERE ***"
    links = []
    while link is not Link.empty:
        if link in links:
            return True
        links.append(link)
        link = link.rest
    return False

Use Ok to test your code:

python3 ok -q has_cycle

As an extra challenge, implement has_cycle_constant with only constant space. (If you followed the hint above, you will use linear space.) The solution is short (less than 20 lines of code), but requires a clever idea. Try to discover the solution yourself before asking around:

def has_cycle_constant(link):
    """Return whether link contains a cycle.

    >>> s = Link(1, Link(2, Link(3)))
    >>> s.rest.rest.rest = s
    >>> has_cycle_constant(s)
    True
    >>> t = Link(1, Link(2, Link(3)))
    >>> has_cycle_constant(t)
    False
    """
    "*** YOUR CODE HERE ***"
    if link is Link.empty:
        return False
    slow, fast = link, link.rest
    while fast is not Link.empty:
        if fast.rest == Link.empty:
            return False
        elif fast is slow or fast.rest is slow:
            return True
        else:
            slow, fast = slow.rest, fast.rest.rest
    return False

Use Ok to test your code:

python3 ok -q has_cycle_constant

Q8: Reverse Other

Write a function reverse_other that mutates the tree such that labels on every other (odd-depth) level are reversed. For example, Tree(1,[Tree(2, [Tree(4)]), Tree(3)]) becomes Tree(1,[Tree(3, [Tree(4)]), Tree(2)]). Notice that the nodes themselves are not reversed; only the labels are.

def reverse_other(t):
    """Mutates the tree such that nodes on every other (odd-depth) level
    have the labels of their branches all reversed.

    >>> t = Tree(1, [Tree(2), Tree(3), Tree(4)])
    >>> reverse_other(t)
    >>> t
    Tree(1, [Tree(4), Tree(3), Tree(2)])
    >>> t = Tree(1, [Tree(2, [Tree(3, [Tree(4), Tree(5)]), Tree(6, [Tree(7)])]), Tree(8)])
    >>> reverse_other(t)
    >>> t
    Tree(1, [Tree(8, [Tree(3, [Tree(5), Tree(4)]), Tree(6, [Tree(7)])]), Tree(2)])
    """
    "*** YOUR CODE HERE ***"
    def reverse_helper(t, need_reverse):
        if t.is_leaf():
            return
        new_labs = [child.label for child in t.branches][::-1]
        for i in range(len(t.branches)):
            child = t.branches[i]
            reverse_helper(child, not need_reverse)
            if need_reverse:
                child.label = new_labs[i]
    reverse_helper(t, True)

Use Ok to test your code:

python3 ok -q reverse_other