University of California, Berkeley
EECS Department - Computer Science Division
CS3 Lecture 24
Pairs, Lists, and Box-and-Pointer Notation
Guest Lecturer: Anjna Mehta
(many thanks to Chris Hirsch for these excellent
notes! -Dan)
Overview of today's lecture
Review
Recursion Potpourri
- We saw some more wonderful examples of recursion in different
contexts.
Representing the Real World
Overview
- Most computer programs are "about" something.
- They are used to model and describe the real world.
- Yet the programs themselves do not manipulate actual objects;
they can only manipulate representations of such things.
- There are two classes of data objects: simple and compound.
Data Objects
- Real world objects, ideas, and the way they behave
Data Types
- A means of representing and implementing data objects
Simple
- those data objects that are atomic
- having no component parts
- indivisible
- primitive
- For example:
- numbers
- booleans
- months
- ranks, suits
Compound
- those data objects that are molecular
- having component parts
- divisible
- complex
- For example,
- words (> letters)
- sentences (> words)
- lists (> pairs)
- dates (> days, months, years)
- databases (> employee-names, job-titles, dates-appointed,
salaries)
- layouts (> stacks > cards > ranks, suits)
Abstraction
- The behavior of our computer programs should not be dictated
by our choice of implementation.
- Behavior should be implementation independent.
- Implementation choices should be governed by matters of efficiency
(solving time and memory limitations).
- Abstraction is a way to separate behavior from implementation.
- It is a means of respecting the distinction between data
objects and data types.
- Constructors: tools for forming compound data objects
- Selectors: tools for selecting component parts from
compound data objects
- Interaction between the levels of behavior and implementation
should be mediated only by constructors and selectors
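The separation between behavior and implementation can be sketched with the "dates" example from the list of compound objects above. This is a minimal illustration, not code from the lecture: the names make-date, date-month, date-day, date-year, same-year?, and new-years-day are all hypothetical, and the empty list is written as '() so the sketch runs in Schemes that do not predefine nil.

```scheme
;; Hypothetical date abstraction (all names here are illustrative).

;; Constructor: glue three component parts into one compound object.
(define (make-date month day year)
  (cons month (cons day (cons year '()))))

;; Selectors: pull the component parts back out.
(define (date-month date) (car date))
(define (date-day date) (car (cdr date)))
(define (date-year date) (car (cdr (cdr date))))

;; Behavior-level code touches dates only through the constructor
;; and selectors, never through car and cdr directly.
(define (same-year? d1 d2)
  (= (date-year d1) (date-year d2)))

(define new-years-day (make-date 'january 1 2000))

(date-month new-years-day)   ; ==> january
(date-year new-years-day)    ; ==> 2000
```

If make-date were later changed to store its parts differently, only the three selectors would need to change; same-year? would keep working as written.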
Dotted Pairs (Cons Pairs)
Overview
- Representing compound data objects requires a means of gluing
together the component parts.
- Scheme's most basic "glue" is the dotted pair.
- A pair is simply the gluing together of two things to form
one bigger thing.
- Pairs can also be hooked together to form more pairs.
- The constructor for pairs is cons.
- The selectors are car and cdr.
: (define cons-1-2 (cons 1 2))
==> cons-1-2
: cons-1-2
==> (1 . 2) ;; How dotted pairs are printed
;; when the second arg is not a list.
;; Don't worry about the dot, you won't
;; be required to know this detail.
: (car cons-1-2)
==> 1
: (cdr cons-1-2)
==> 2
- You may notice that the second argument to cons
is not a list.
- Aside from this particular example (cons 1 2),
you will never see the second arg be anything but a list (and
thus will never have to worry about the printed representation of
dotted pairs as shown above).
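As an optional illustration of pairs being hooked together to form more pairs, here is a small sketch in which the car of one pair is itself a pair. The name nested is hypothetical, and the dotted output is shown only for completeness.

```scheme
;; A pair whose car is another pair (the name nested is illustrative).
(define nested (cons (cons 1 2) 3))

(car nested)          ; ==> (1 . 2)   the inner pair
(car (car nested))    ; ==> 1
(cdr (car nested))    ; ==> 2
(cdr nested)          ; ==> 3
(pair? nested)        ; ==> #t
(pair? (cdr nested))  ; ==> #f   3 is a simple object, not a pair
```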
Lists
Overview
- Pairs can also be used to build sequences, ordered collections
of data objects.
- We can easily represent sequences as a chain of pairs.
- The car of each pair is the corresponding item in
the chain, while the cdr points to the next pair.
- The cdr of the final pair signals the end of the
sequence (represented in box-and-pointer notation as a diagonal
line, and in programs as the value of the variable nil).
- The value of nil is a sequence of no elements, i.e.
the null list: ()
- We use the same constructors and selectors for lists as we
do with pairs (cons, car, cdr), even though this is
really a data abstraction violation.
: (define list-1-2-3 (cons 1 (cons 2 (cons 3 nil))))
==> list-1-2-3
: list-1-2-3
==> (1 2 3)
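The chain-of-pairs idea can be explored by walking down list-1-2-3 with car and cdr. In this sketch, count-items is a hypothetical helper (not from the lecture), and '() stands in for nil so the code runs in Schemes that do not predefine nil.

```scheme
(define list-1-2-3 (cons 1 (cons 2 (cons 3 '()))))

;; Hypothetical helper: follow the cdr chain, counting one item
;; per pair, until the empty list marks the end of the sequence.
(define (count-items seq)
  (if (null? seq)
      0
      (+ 1 (count-items (cdr seq)))))

(car list-1-2-3)                  ; ==> 1
(cdr list-1-2-3)                  ; ==> (2 3)
(car (cdr (cdr list-1-2-3)))      ; ==> 3
(cdr (cdr (cdr list-1-2-3)))      ; ==> ()
(count-items list-1-2-3)          ; ==> 3
(equal? list-1-2-3 (list 1 2 3))  ; ==> #t
```

The last line shows that the built-in list procedure builds the same structure as the nested calls to cons.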
Box-and-Pointer Notation
Overview
- Box-and-pointer notation is a means of drawing pairs and lists.
- Each object (whether simple or compound) is shown as a pointer
to a box.
- A simple object is just represented as itself.
- A pair is represented as a double box (a rectangle), with
the left part containing a pointer to the car of the
pair and the right part containing a pointer to the cdr.
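- Box-and-pointer representation for the pair cons-1-2:
(cons 1 2)
cons-1-2 --> [ o | o ]
               |   |
               v   v
               1   2
- Box-and-pointer representation for the list list-1-2-3:
(cons 1 (cons 2 (cons 3 nil)))
list-1-2-3 --> [ o | o-]--> [ o | o-]--> [ o | / ]
                 |            |            |
                 v            v            v
                 1            2            3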
Rules
- An arrow cannot point to half of a pair. If an arrowhead
touches a pair, it is pointing to the entire pair. It
does not matter where the arrowhead touches the pair.
- Let's look at a common error students make:
(define one (car list-1-2-3))
- The arrow for one should point to the thing that the
car of list-1-2-3 points to, not to the left
half of the list-1-2-3 rectangle.
- The direction of pointers (up, down, left, right, diagonal,
etc.) does not matter. You may draw them however you want
in order to make the pairs as neat as possible. This is
why the arrowheads are so important for figuring out direction.
- Both are perfectly valid diagrams for list-1-2-3
- There must be a top-level arrow to show where the structure
you are representing begins.
- A diagram with no top-level arrow is very BAD because we don't
know which is the first pair!
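The first rule can be checked at the interpreter. In this sketch, one is the variable defined in the rule above, rest-of-list is a hypothetical name, and '() stands in for nil. A pointer always refers to a whole object: either the thing in the car, or the entire next pair.

```scheme
(define list-1-2-3 (cons 1 (cons 2 (cons 3 '()))))
(define one (car list-1-2-3))
(define rest-of-list (cdr list-1-2-3))

one                                  ; ==> 1
(pair? one)                          ; ==> #f  one is the number, not half a pair
rest-of-list                         ; ==> (2 3)
(pair? rest-of-list)                 ; ==> #t  the entire second pair
(eq? rest-of-list (cdr list-1-2-3))  ; ==> #t  the very same pair, not a copy
```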
Where to Begin
- Given a complicated list, first work out the backbone.
- To do so, begin by determining how many elements are in the
list (let's call that number x).
- Then draw an x-pair backbone, x pairs with the cdr
of one pointing to the next one (the final cdr is null).
- Next, work out the individual pairs from the inside out
(just like the evaluator does).
: (cons (cons 1 nil) (cons 2 nil))

: (cons (cons 1 (cons 2 nil)) nil)
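Before drawing the two expressions above, it can help to check how many elements each list has, since that number is the length of the backbone. A minimal sketch, with the hypothetical names two-item and one-item, and with '() in place of nil:

```scheme
(define two-item (cons (cons 1 '()) (cons 2 '())))
(define one-item (cons (cons 1 (cons 2 '())) '()))

two-item           ; ==> ((1) 2)
(length two-item)  ; ==> 2   a two-pair backbone
(car two-item)     ; ==> (1) the first element is itself a list

one-item           ; ==> ((1 2))
(length one-item)  ; ==> 1   a one-pair backbone
(car one-item)     ; ==> (1 2)
```

The same three numbers and three calls to cons give different shapes: the first list has two elements, while the second has a single element that is itself a two-element list.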

Summary
- Often our computer programs are attempts to characterize
the real world; data objects (both simple and compound) must
then be implemented in terms of data types.
- Abstraction is a way of interacting with computer programs
such that ideas about data objects and their behavior are kept
distinct from the way in which they are implemented.
- Pairs provide a universal building block from which we can
construct all sorts of data structures.
- Lists are a more powerful data type that allows for the representation
of ordered data objects; lists are implemented using pairs.
- Box-and-pointer notation is a convenient means for representing
both pairs and lists.