Mutation in Python

Ka-Ping Yee, 2005-12-12

In Python, all variables refer to things. When you write x = 3, you are making x refer to 3. You could depict this by drawing an arrow from the name x to the value 3, like this:

Figure 1. After x = 3, the name x refers to the value 3.

Every time you use a variable name, you're telling Python to follow that arrow. For example, to evaluate the expression x + 2, Python follows the arrow from the x to the 3, then adds 3 to 2. Simple enough, right?

When you assign the value of one variable to another variable, they both refer to the same thing. Suppose a refers to the list [3, 4, 5], and then you say b = a:

Figure 2. After b = a, the name b refers to the same list as a does.

There's still only one list; it's just referred to by two different names.

Why does this matter? Well, look at what happens if you now say a.append(1). An extra element is added to the end of the list.

Figure 3. The contents of the list are changed.

Both a and b still refer to the same list, which is now [3, 4, 5, 1].
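
The situation in Figures 2 and 3 can be reproduced in a few lines. This is only a minimal sketch of the steps described above; the starting list is the one used in the figures.

```python
# Two names, one list: a change made through one name shows up through the other.
a = [3, 4, 5]
b = a            # b now refers to the same list that a refers to

a.append(1)      # mutate the list through the name a

print(a)         # [3, 4, 5, 1]
print(b)         # [3, 4, 5, 1]
```

No second list is ever created here, which is why both names show the new element.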

Tuples

The tuple is another type of collection that's similar to a list. Tuples are written much like lists, but using parentheses instead of square brackets. An empty tuple is written (). A tuple with just one element has to be written with a comma after the element. So (1,) is a one-element tuple; the comma distinguishes it from (1), which is just the number 1. Tuples are also sequences, and they can be indexed and sliced just like lists.
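
The parenthesis-and-comma rules can be checked with a short sketch. The variable names are illustrative, and the built-in type() is used here only to show what each expression produces.

```python
empty = ()        # the empty tuple
single = (1,)     # a one-element tuple: the trailing comma is required
number = (1)      # just the integer 1 wrapped in parentheses

print(type(empty).__name__)    # tuple
print(type(single).__name__)   # tuple
print(type(number).__name__)   # int
print(len(single))             # 1
```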

>>> (1, 4, 7, 6, 8)
(1, 4, 7, 6, 8)
>>> t = (1, 4, 7, 6, 8)
>>> len(t)
5
>>> t[3]
6
>>> t[:2]
(1, 4)
>>>

However, you cannot assign to an element of a tuple.

>>> t[3] = 5
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: 'tuple' object does not support item assignment
>>>

And tuples do not have methods that modify them, like lists do.

>>> t.append(5)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
AttributeError: 'tuple' object has no attribute 'append'
>>>

That may seem strange. Why bother to have another type of sequence that is strictly less capable than a list? Well, it turns out that sometimes inabilities can be as useful as abilities (perhaps you have encountered this principle in other situations).

Mutation

A mutable object is an object whose contents can change. So far, the only mutable type you have used is the list. When you pass a mutable object into a function, the function can change the object's contents too.

>>> a = [1, 2, 3]
>>> a.append(4)
>>> a
[1, 2, 3, 4]
>>> def spam(x):
...     x.append(5)
...
>>> a
[1, 2, 3, 4]
>>> spam(a)
>>> a
[1, 2, 3, 4, 5]
>>> def edit(x):
...     x[2] = 'fweee!'
...
>>> a
[1, 2, 3, 4, 5]
>>> edit(a)
>>> a
[1, 2, 'fweee!', 4, 5]
>>>

All the other things you've seen so far (numbers, strings, and tuples) are immutable: their contents cannot be changed. Keep in mind that's not the same as reassigning a variable. Of course, you can always reassign a variable to something else. If x contains 3, you can compute x + 1 to get a new number, 4, and assign that to x. But that doesn't change the value of 3. Similarly, the line b = b + 'f' in the following example takes two strings, 'foo' and 'f', and puts them together to make a new string, 'foof'.

>>> a = 3
>>> b = a
>>> b = b + 1
>>> a
3
>>> b
4
>>> a = 'foo'
>>> b = a
>>> b = b + 'f'
>>> a
'foo'
>>> b
'foof'
>>>

You could do the same thing with a tuple or a list:

>>> a = (1, 2, 3)
>>> b = a
>>> b = b + (4,)
>>> a
(1, 2, 3)
>>> b
(1, 2, 3, 4)
>>>
>>> a = [1, 2, 3]
>>> b = a
>>> b = b + [4]
>>> a
[1, 2, 3]
>>> b
[1, 2, 3, 4]
>>>

But you can also append a 4 to the end of a list using a method:

>>> a = [1, 2, 3]
>>> b = a
>>> b.append(4)
>>> a
[1, 2, 3, 4]
>>> b
[1, 2, 3, 4]
>>>

This last case shows that a and b refer to the same list. Calling b.append(4) tells the list to incorporate 4 as a new element, so the change is visible through both names. By contrast, saying b = b + [4] adds two lists together, resulting in a new list, which is then assigned to b; a still refers to the original list. Every time you use square brackets to describe a list, that creates a new list object. And when you add two lists together, that also creates a new list object.

These figures compare the two cases:

Figure 4. In the first figure, adding two lists creates a new list and does not mutate anything. In the second figure, appending to a list mutates the list.

Another way to append 4 to the list in b without affecting a is to make a copy of the list first. Whenever you slice a list, you get a new list object. You can get a copy of the entire list by slicing all the way from beginning to end, as in a[:]. (Recall that when you don't specify the starting or ending index for a slice, the slice extends to the beginning or end.)
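
A minimal sketch of the copy-by-slicing approach, using the same names a and b as the earlier examples:

```python
a = [1, 2, 3]
b = a[:]         # slicing from beginning to end builds a new list with the same elements

b.append(4)      # mutates only the copy

print(a)         # [1, 2, 3]
print(b)         # [1, 2, 3, 4]
```

Because b refers to a separate list object, the append is not visible through a.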

The tuple doesn't provide the ability to append() at all. Once a tuple has been assigned to a, the only way to change the value of a is to make a refer to something else by reassigning it.

So using a tuple can have an advantage over a list, depending on your purpose: if you pass it into a function, you know for sure that its contents cannot be changed. The tuple you were holding on to is guaranteed to have the same elements after the function is finished.
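
To illustrate that guarantee, here is a small sketch. The function try_to_edit is hypothetical, and it uses try/except (which may not have been introduced yet) only to keep the script running after the tuple rejects the assignment.

```python
def try_to_edit(seq):
    """Hypothetical function that attempts to overwrite the first element."""
    try:
        seq[0] = 'changed'
    except TypeError:
        # Tuples reject item assignment, so the caller's object is left untouched.
        return False
    return True

items_list = [1, 2, 3]
items_tuple = (1, 2, 3)

print(try_to_edit(items_list), items_list)     # True ['changed', 2, 3]
print(try_to_edit(items_tuple), items_tuple)   # False (1, 2, 3)
```

The list comes back modified, while the tuple holds the same elements it had before the call.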

However, note that if the elements themselves are mutable, the overall value of a tuple may change, even though the tuple still refers to the same elements.

>>> a = (1, 2, [3])
>>> a
(1, 2, [3])
>>> a[2].append(4)
>>> a
(1, 2, [3, 4])
>>>

Here's a quick digression about operators: like C and other languages, Python has assignment operators that combine a simple operation and reassignment. So, for example, x += 1 increments x by 1. In general, a += b is like a = a + b, a *= b is like a = a * b, and so on, except:
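
One difference that matters for this chapter is that, in current Python, += on a list extends the existing list in place instead of building a new one, so any other name for that list sees the change. With an immutable tuple, += creates a new object and reassigns the name. A minimal sketch, with illustrative variable names:

```python
# Case 1: += on a list mutates the existing list object.
a = [1, 2, 3]
b = a
b += [4]
print(a)         # [1, 2, 3, 4]  (the other name sees the change)

# Case 2: the spelled-out form builds a new list and reassigns d only.
c = [1, 2, 3]
d = c
d = d + [4]
print(c)         # [1, 2, 3]
print(d)         # [1, 2, 3, 4]

# Case 3: tuples are immutable, so += creates a new tuple and reassigns u.
t = (1, 2, 3)
u = t
u += (4,)
print(t)         # (1, 2, 3)
print(u)         # (1, 2, 3, 4)
```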

Equality and Identity

Earlier we talked about modules and functions having namespaces. A namespace is a set of names where each name refers to an object. These references are represented by the arrows in the diagrams above.

After we assign b = a as in the preceding examples, a and b refer to the same object (they are two pointers to the same thing). This is called aliasing. Thereafter, as long as neither variable is reassigned, anything we do with a will have an effect visible in b, and vice versa. In fact, the two names become interchangeable; the program would be unaffected if we replaced one with the other (again, as long as neither one is reassigned).

Have a look at this.

>>> a = [1, 2, 3]
>>> b = [1, 2, 3]
>>> c = a
>>>
>>> a
[1, 2, 3]
>>> b
[1, 2, 3]
>>> c
[1, 2, 3]
>>> a == b
True
>>> a == c
True
>>>

All right. Now suppose we do a.append(4) at this point.

What will happen to the values of a, b, and c? What will a == b evaluate to? How about a == c? Predict the results you will get and write them down. Then try this out in a Python interpreter and compare the results to your predictions before proceeding.