Concurrency

In the real world, things don't happen one at a time. Even on the computer, you probably have several tabs open right now, with music playing in the background, and maybe some other homework open as you read this.

Computers deal with this by dividing their time among the different processes on the system. Naturally there are some issues to deal with—if you can stop in the middle of any procedure, you need to make sure you can pick up where you left off! And if you have a computer with multiple processors or multiple cores (which is quite likely these days), your computer can really be running two processes' instructions at the same time, instead of just faking it by switching back and forth.

This is a long review sheet, so here are the terms you ought to recognize, in advance:

The example of a concurrency problem I gave in section was the story about Wheat Thins, where you and your roommate both see that you are out, and both end up going to the store.

Another example concerns collaborative content, such as source code in a version-tracking repository, or more simply, two people editing the same Wikipedia page. If one of them sends their changes before the other is done, the second person might overwrite what was there, rather than taking it into account. Problem!

In this class, we talk about three levels of protection tools: test-and-set!, mutexes, and serializers. Let's use each of them to try to solve the Wikipedia problem.

We already know we can't use test-and-set! to modify a Wikipedia page—it only sets things to #t. Let's be a little more formal there. Here's what test-and-set! does:

(car cell) before the call (car cell) after the call Return value
#f #t #f
#t #t #t

You can see after calling test-and-set!, the variable is guaranteed to be #t. (Do you see why?) The return value is the old value of (car cell)

; from SICP, p. 312
(define (test-and-set! cell)
  (if (car cell)
    #t
    (begin (set-car! cell #t)
           #f)))

The book's implementation uses a pair to hold the single value to avoid making test-and-set! a special form; the cdr is ignored. (This isn't really important, but in case anyone was curious.)

The last question used the word "incorrect". Let's take a moment to define what correct means: a correct result from running several things in parallel is a result you could get by running them in some order in serial (one after another).

(parallel-execute (lambda () (set! x 0))
                  (lambda () (set! x 1)) )

Both 1 and 0 are "correct" values for x!

Let's apply test-and-set! to our problem: how about each page on Wikipedia has a flag (either #t or #f) that indicates whether or not someone is editing? Then an edit might look something like this:

(define (edit page)
  (if (test-and-set! (get-cell page))
      (begin (wait-a-bit)  ; don't worry how this works
             (edit page) ) ; try again
      (begin (display-page page)
             (set-contents! page (read))
             (clear! (get-cell page)) )))

If someone else was already editing, we wait a bit and see if they've finished. Otherwise, if the page is clear, we've set the editing flag ourself, and can edit the page safely. We do have to remember to clear the editing flag when we're done.

This "wait until available" pattern turns out to be a convenient one: let's build another useful ADT for it. A mutex (or lock) is a data type that is either held or free. If someone holds the mutex, no one else can acquire it until the holder releases it.

Since we are guaranteed that only one process can acquire the mutex at once, we can rewrite our Wikipedia-editing procedure to be a little more straightforward.

(define (edit page)
  ((get-mutex page) 'acquire) ; will wait until we actually hold the mutex
  (display-page page)
  (set-contents! page (read))
  ((get-mutex page) 'release) )

Much easier to see what's going on this way!

But of course, we still have the serialization code inside the methed. Every operation on this page will use the same mutex. That's why we have one more level of abstraction: serializers. While mutexes protect a critical section of code, serializers protect entire procedures.

How do we use serializers? A serializer is a procedure that takes a procedure and returns a protected version of that procedure. Any protected procedures created using the same serializer are guaranteed not to run at the same time!

If we're using serializers to implement our code, things get really simple.

(define (edit page)
  (((get-serializer page)
    (lambda ()
      (display-page page)
      (set-contents! page (read)) ))))

That is the correct number of parens: we're getting the serializer, then using it to protect our actual work, then calling the protected procedure that the serializer just created.

Even better, we can reuse this serializer for any other kinds of edits to the page, like deleting it, or moving it to another section of the wiki. We can even save protected versions of procedures, and only invoke the serializer once:

(define (make-editor page)
  ((get-serializer page)
   (lambda ()
     (display-page page)
     (set-contents! page (read)) )))
     
(define edit-homepage (make-editor homepage))

edit-homepage is now a procedure of no arguments that edits the Wikipedia homepage in a safe way; it waits for anyone else to finish editing first. There is no chance of you seeing an old version of the page; the page shown by display-page is guaranteed to be up-to-date—which is what we were after all along.

As a final note, it's worth pointing out that Wikipedia doesn't actually do this! Instead, they keep track of people's changes, and if multiple edits happen, they try to merge them. If the merge doesn't work out, hopefully one of the editors is notified to complete the merge by hand. But there are real systems that work like this one, where only one person can access some data at a time.

Whew! That was quite a lot of material. Hopefully that helped some; if not, take a look again at the lecture notes, the book, and perhaps some other TA's notes. And, as always, feel free to send me e-mail, or post to the newsgroup!

Back to Jordy's home page | Course home page