CS162 Lecture 6: Monday, February 7, 2005
Topic: Synchronization with Condition Variables. Unix implementation. Monitors. Semaphore implementation, disabling interrupts. Deadlock, deadlock prevention.

Announcements:
UPE, the CS honors society, has its first general meeting on Wednesday 2/9/05 at 7pm in 310 Soda. People who are interested can check upe.cs.berkeley.edu for eligibility. (I checked the website, and it says: "Typically, the top 1/3rd of undergraduates in the Computer Science major are invited to join UPE.") There will be food served, and people can find information about research opportunities. UPE also offers tutoring and sells CS t-shirts for $12.
There are some background readings about threads in the reader (the paper by Birrell), which might be helpful for Nachos phase 1. If you have questions on Nachos, ask the TAs.

+ Process Synchronization with Condition Variables
  + Processes or threads can cooperate using wait and signal, along with condition variables.
  + The operation x.wait means that the process invoking it waits until some other process invokes x.signal.
  + The x.signal operation resumes exactly one suspended process. If no process is suspended, then x.signal has no effect.
  + x.signal and x.wait are used to control synchronization within monitors, a special type of critical region. Only one process can be executing in a monitor at a time.
  + There is one binary semaphore associated with each monitor; mutual exclusion is implicit: P on entry to any routine, V on exit. (Always read through examples to understand this. I found that section 7.7 in our textbook helps understanding, especially the example on the dining-philosophers problem in fig 7.26.)
  + Monitors are a higher-level concept than P and V. They are easier and safer to use.
  + Monitors are a synchronization mechanism combining three features:
    + Shared data.
    + Operations on the data.
    + Synchronization, scheduling.
    They are especially convenient for synchronization involving lots of state.
  + Monitors need more facilities than just mutual exclusion. We need some way to wait.
    + Busy-wait inside the monitor?
    + Put the process to sleep inside the monitor? Answer: the way monitors usually work is that at most one process can run inside the monitor at a time. When one goes to sleep, that unlocks the monitor and another process can execute within it.
  + Condition variables: things to wait on.
    + Wait(condition): release the monitor lock, put the process to sleep. When the process wakes up again, re-acquire the monitor lock immediately. (Still only one process executing at a time.)
    + Signal(condition): wake up one process waiting on the condition variable (FIFO, depending on the design). If nobody is waiting, do nothing (no history).
    + Broadcast(condition): wake up all processes waiting on the condition variable. If nobody is waiting, do nothing. (If you are waiting for a condition, you should always test the condition again after being woken up, because between the time the wakeup was posted and the time you resume execution, the condition may no longer hold.)
  + There are several different variations on the wait/signal mechanism. They vary in terms of who gets the monitor lock after a signal. Our scheme is called "Mesa semantics":
    + On signal, the signaller keeps the monitor lock.
    + The awakened process waits for the monitor lock with no special priority (a new process could get in before it). This means that the thing you were waiting for could have come and gone: you must check again and be prepared to sleep again if someone else took it.
  + Readers and writers problem with monitors: each synchronization operation gets encapsulated in a monitored procedure: checkRead, checkWrite, doneRead, doneWrite. Use conditions OKToRead, OKToWrite.
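The Mesa-style wait/signal pattern above can be sketched with Python's threading.Condition, which also has Mesa semantics. This is a minimal sketch, not lecture code; the producer/consumer names are illustrative. Note the `while` loop around `wait()`: under Mesa semantics the waiter must re-test the condition after waking, since another process may have slipped in first.

```python
import threading

lock = threading.Lock()
ready = threading.Condition(lock)   # condition variable tied to the "monitor" lock
items = []                          # shared state protected by the monitor lock

def consumer(results):
    with lock:                      # enter the monitor
        while not items:            # Mesa semantics: re-test after every wakeup
            ready.wait()            # releases the lock while asleep, re-acquires on wakeup
        results.append(items.pop(0))

def producer(value):
    with lock:
        items.append(value)
        ready.notify()              # wake one waiter; no effect if nobody is waiting

results = []
t = threading.Thread(target=consumer, args=(results,))
t.start()
producer(42)
t.join()
print(results)                      # [42]
```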
    + This is all part of one monitor. (AW = Active Writers, WW = Waiting Writers, WR = Waiting Readers, AR = Active Readers.)

        checkRead() {
            while ((AW+WW) > 0) {     // while, not if: Mesa semantics require re-testing
                WR = WR+1;
                wait(OKToRead);
                WR = WR-1;
            }
            AR = AR+1;
            READ
        }

        doneRead() {
            AR = AR-1;
            if (AR==0 && WW>0) signal(OKToWrite);
        }

        checkWrite() {
            while ((AW+AR) > 0) {
                WW = WW+1;
                wait(OKToWrite);
                WW = WW-1;
            }
            AW = AW+1;
            WRITE
        }

        doneWrite() {
            AW = AW-1;
            if (WW > 0) signal(OKToWrite);
            else broadcast(OKToRead);
        }

  + Producers and Consumers (from Hoare): (This example is in the reader. The producer is the one that appends things to the buffer, and the consumer is the one that removes things from the buffer. The whole thing is the monitor. There are two procedures, but only one process runs at a time.)

        bounded buffer: monitor
        begin
            buffer: array 0..N-1 of portion;
            lastpointer: 0..N-1;
            count: 0..N;
            nonempty, nonfull: condition;

            procedure append(x: portion);
            begin
                if count==N then nonfull.wait;
                buffer[lastpointer] = x;
                lastpointer = (lastpointer + 1) mod N;
                count = count+1;
                nonempty.signal
            end append;

            procedure remove(result x: portion);
            begin
                if count==0 then nonempty.wait;
                x = buffer[(lastpointer - count) mod N];
                count = count-1;
                nonfull.signal
            end remove;

            count = 0; lastpointer = 0;
        end bounded buffer

  + Disk Head Scheduler (from Hoare): (If you find this difficult to understand because we haven't learned about disks yet, just think of scheduling an elevator.) (Don't focus on the details; just see how the synchronization works.)
    + procedure request - called before issuing a command to move the head to dest (get on the elevator).
    + procedure release - called after the cylinder is finished.
      (get off the elevator)
    + headpos - current location of the head
    + sweep - up or down - direction of head movement
    + busy - whether the disk is busy

        diskhead: monitor
        begin
            headpos: cylinder;
            direction: (up, down);
            busy: Boolean;
            upsweep, downsweep: condition;

            procedure request(dest: cylinder);
            begin
                if busy then
                    if (headpos < dest) or ((headpos == dest) and (direction == up))
                        then upsweep.wait(dest)
                        else downsweep.wait(dest)
                else begin busy = true; headpos = dest end
            end request;

            procedure release;
            begin
                busy = false;
                if direction == up then
                    if upsweep.queue then upsweep.signal
                    else begin direction = down; downsweep.signal end
                else
                    if downsweep.queue then downsweep.signal
                    else begin direction = up; upsweep.signal end
            end release;

            headpos = 0; direction = up; busy = false;
        end diskhead

  + Summary:
    + Monitors are not present in very many languages, but are extremely useful.
    + Semaphores use a single structure for both exclusion and scheduling; monitors use different structures for each.
    + Monitors enforce a style of programming where complex synchronization code doesn't get mixed with other code: it is separated out and put in monitors.
    + A mechanism similar to wait/signal is used internally in Unix for scheduling OS processes.

+ Unix: (from the 4.4 BSD book) (We are not responsible for this on exams, but it is helpful information.)
  + Unix uses generalized semaphores. They are created in sets. Several operations on semaphores can be done simultaneously, and increments and decrements can be by values greater than 1. The kernel does these operations atomically.
  + Associated with each semaphore is a queue of processes suspended on that semaphore.
  + The semop system call takes a list of semaphore ops (sem-op), each defined on a semaphore in the set, and does them one at a time.
    + If sem-op is positive, the kernel increments the value of the semaphore and awakens all processes waiting for the value of the semaphore to increase.
    + If sem-op is zero, the kernel checks the semaphore value. If it is 0, the kernel continues with the list.
      Otherwise, it blocks the process and has it wait on this semaphore.
    + If sem-op is negative, and its absolute value is less than or equal to the semaphore value, the kernel adds sem-op (a negative number) to the semaphore value. If the result is 0, the kernel awakens all processes waiting for the value of the semaphore to equal 0.
    + If sem-op is negative, and its absolute value is greater than the semaphore value, the kernel suspends the process until the value of the semaphore increases.
  + Unix also uses signals. Processes may send each other signals, which are software interrupts. A signal is processed when a process wakes up, or when it returns from a system call. There are about 20 defined signals. (E.g. interrupt, quit, illegal instruction, trace trap, floating point exception, kill, bus error, segmentation violation, write on a pipe with no readers, alarm clock, death of a child, power failure, etc.)

Topic: Semaphore Implementation
(I found that section 7.5 in our textbook helps understanding, especially 7.5.2.)

+ Unfortunately, no existing hardware implements P&V directly. P&V are complex and implicitly require scheduling. This is:
  + (a) Too complicated to put in hardware.
  + (b) Too inflexible - we don't want scheduling in hardware.
  + (c) Too hard to make atomic - it would be dozens or hundreds of instructions.
  + (d) Too long - too long to disable interrupts, etc.
+ We need a simple way of doing mutual exclusion in order to implement P's and V's. We could use atomic reads and writes, as in the "too much milk" problem, but these are very clumsy. Still okay, and they can be used in the absence of anything better - note that the too-much-milk solution does produce a critical section, which is what we need. But so far, it only works for 2 processes - what about N?
+ Uniprocessor solution: disable interrupts.
  + Remember that the only way the dispatcher regains control is through interrupts or through an explicit process request.
  + Note that disabling interrupts is a supervisor call.
  + Can we really disable ALL interrupts? There are usually some that can't be disabled (such as power fail).
  + A user can't disable interrupts - it must ask the system to do it. This makes every P & V a system call.
+ Which process should we remove from the queue and wake up? That is determined by the scheduling algorithm.
+ What do we do in a multiprocessor to implement P's and V's? We can't just turn off interrupts to get low-level mutual exclusion.
  + Turn off all other processors?
  + Use atomic read and write, as in "too much milk"? The atomic read / atomic write solution is too complicated.
+ Most machines provide some sort of atomic read-modify-write instruction: read the existing value and store a new value back, in one atomic operation.
+ We will use this atomic operation to create a busy-waiting implementation of P and V.
+ E.g. atomically increment a value in memory and then load it; atomically decrement a value in memory.
  + The operations are:
    + {increment the value in memory, load the incremented value},
    + {decrement the value in memory}.
  + For busy waiting, we can do the following:

        init:  A=0
        loop:  {increment A in memory, load A}
               if A != 1, then {decrement A in memory}, go to loop
               critical section here
               {decrement memory location A}

  + Note that this solution is susceptible to indefinite postponement - i.e. with N processes (N large). In the simple case N=3, the value oscillates between 1, 2 and 3. If one process leaves, it can oscillate between 0, 1 and 2. With the "after you" alteration, it can oscillate between 1 and 2, and never terminate.
+ E.g. Swap. (This method works, but is a little more complicated than necessary. We don't actually need to swap the local variable into the lock, since we always store the same thing into the lock. The slightly simpler version is the next example, test and set.)
  + The operation is swap(local(i), lock) - interchange the values of two variables. The lock is locked if it is "true".
  + local(i) is a local variable for process i; lock is shared among all processes (a global variable).
  + Busy waiting loop:

        init: lock=false
        local(i)=true
        repeat swap(local(i),lock) until local(i)==false;
        critical section here
        lock=false;

+ E.g. Test and set (IBM solution). Set the value to true, but return the OLD value. Use an ordinary write to set it back to false. The lock is locked if it is "true".
  + i.e. Tset(local(i), lock): {local(i)=lock; lock=true} (for process i)
  + Busy waiting loop:

        init: lock=false
        repeat Tset(local(i),lock) until local(i)==false;
        critical section
        lock=false;

+ Read-modify-writes may be implemented directly in memory hardware (e.g. IBM S/360), or in the processor by refusing to release the memory bus (PDP-11).
  + It has to be done in some central place, so it's not possible for somebody to sneak around the back.
+ Using test and set for mutual exclusion:
  + It's like a binary semaphore in reverse, except that it doesn't include waiting. 1 means someone else is already using it; 0 means it's OK to proceed.
  + The definition of test and set prevents two processes from getting a 0->1 transition simultaneously.
  + Test and set is tricky to use, since you can't get at it from HLLs (high-level languages). Typically, you use a routine written in assembler.
+ Using test and set to implement semaphores: for each semaphore, keep a test-and-set integer (the lock) in addition to the semaphore integer and the queue of waiting processes. (This is for a multiprocessor system; S is the semaphore, S.Lock is its lock.)

      P(S): disable interrupts    -- must disable interrupts for this processor; we still want
                                  -- an uninterrupted sequence of operations on this processor
                                  -- - see below
            local(i)=true
            repeat Tset(local(i),S.Lock) until local(i)==false
                                  -- must spin on the lock, since we are waiting for another
                                  -- processor; an alternate attempt is below
            if S>0 then {S=S-1; S.Lock=false; enable interrupts; return}
            add process to S.Q
            S.Lock=false
            enable interrupts
            call dispatcher

      V(S): disable interrupts
            local(i)=true
            repeat Tset(local(i),S.Lock) until local(i)==false
            if S.Q is empty then S=S+1
            else {remove a process from S.Q; wake it up}
            S.Lock=false
            enable interrupts

+ Why do we still have to disable interrupts in addition to using test and set? (Actually, it still works even if we don't disable interrupts, but it is really inefficient.)
  + Suppose that a process A is in the middle of either P(S) or V(S), and is interrupted. The dispatcher then picks process B to run. If B attempts to do either P(S) or V(S), it will find the lock set and spin indefinitely - until process A is rescheduled and is allowed to finish and reset the lock.
  + Note that the "spin-lock" for P&V is not a problem - it only takes about 10 instructions - not a long wait - much less than the time to do a dispatch.
+ Note that we can always get a critical section using the "too much milk" solutions from earlier. That critical section can be used to implement P and V also.
+ Important point: implement some mechanism once, very carefully. Then always write programs that use that mechanism. Layering is very important.

Topic: Deadlock
(Reading chapter 8 of the textbook will be helpful; I found that 8.2, 8.4 and 8.5 are especially useful.)

+ Until now you have heard about processes. From now on you'll also hear about resources: the things operated upon (i.e. used) by processes. Resources range from CPU time to disk space to channel I/O time.
+ Resources fall into two classes:
  + Preemptible: processor, I/O channel, main memory. We can take the resource away, use it for something else, then give it back later.
  + Non-preemptible: once given, it can't be reused until the process gives it back. Examples are file space, terminal, printer, semaphores.
+ This distinction is a little arbitrary, since anything is preemptible if it can be saved and restored.
  It generally measures the difficulty of preemption.
+ The OS makes two related kinds of decisions about resources:
  + Allocation: who gets and keeps what, for non-preemptible resources. Given a set of requests for resources, which processes should be given which resources in order to make the most efficient use of the resources?
  + Scheduling: for preemptible resources, how long can they keep it? When more resources are requested than can be granted immediately, in which order should they be serviced? Examples are processor scheduling (one processor, many processes) and memory scheduling in virtual memory systems.
+ The system typically has resources. A process's loop is typically: (a) request resource, (b) use resource, (c) release resource.
+ Deadlock: analogy to a traffic jam.
  [Figure: a gridlocked traffic intersection; the triangles in the original drawing are cars blocking each other in all four directions.]
  "Nobody can move because everybody has to get out of the way before anyone else can move." Therefore, this is a deadlock.
+ Deadlock example: semaphores.
  + Two processes: one does P(x) followed by P(y); the other does the reverse.
  + In this case, the resources are the semaphores x and y.
+ Deadlock is one area where there is a strong theory, but it's almost completely ignored in practice. Reason: solutions are expensive and/or require predicting the future.
+ Deadlock: a situation where each of a set of processes is waiting for something from other processes in the set. Since all are waiting, none can provide any of the things being waited for.
+ This definition is relatively simple minded. We could have:
  + (a) Processes waiting for resources of the same type - e.g. there are 4 tape drives.
      Each process needs 3. Each holds 2, and waits for another.
  + (b) More than 2 processes. E.g. A has x, B has y, C has z; A needs y, B needs z, and C needs x.
  + (c) Processes waiting for more pieces of a resource - e.g. each needs a certain amount more memory, and holds memory in the meantime.
+ In general, there are four conditions for deadlock. (We consider only situations with a single instance of each resource. These conditions are NOT completely distinct, but it is helpful to list them as four separate ones.)
  + Mutual exclusion: two or more resources cannot be shared. The requesting process has to wait. E.g. tape drives, printers.
  + Hold and wait: the process making a request holds the resources that it has, and waits for the new ones. This also implies multiple independent requests: processes don't ask for resources all at once.
  + No preemption: once allocated, a resource cannot be taken away.
  + There is a circularity in the graph of who has what and who wants what. Let Ri be a resource. The graph has an arrow from Ri to Rj if there is a process holding Ri and requesting Rj. (We can call this a hold-and-wait graph or resource request graph.) (This condition actually implies the hold and wait condition.)

             A                              A          B
        (x) ---> (y)         OR        (x) ---> (y) ---> (z)
         ^        |                     ^                 |
         +---B----+                     +--------C--------+

    The resource request graph. A, B and C are processes, while x, y and z are resources. What makes deadlock happen is that there is a cycle in the graph.
+ Approaches to the deadlock problem fall into two general categories:
  + Prevention: organize the system so that it is impossible for deadlock ever to occur. This may lead to less efficient resource utilization in order to guarantee no deadlocks. (You can prevent deadlock by using uniprogramming, but that is less efficient.)
  + Cure: determine when the system is deadlocked and then take drastic action. This requires the termination of one or more processes in order to release their resources. Usually this isn't practical.
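Detecting the circular-wait condition above is just cycle detection in the hold-and-wait graph. A minimal sketch (the dictionary encoding of the graph is an illustrative assumption, not from the lecture):

```python
def has_cycle(graph):
    """Detect a cycle in a hold-and-wait graph given as {resource: [resources it points to]}."""
    visited, on_stack = set(), set()

    def visit(node):
        if node in on_stack:          # back edge: circular wait found
            return True
        if node in visited:
            return False
        visited.add(node)
        on_stack.add(node)
        if any(visit(n) for n in graph.get(node, [])):
            return True
        on_stack.remove(node)
        return False

    return any(visit(n) for n in graph)

# The A/B/C example above: A holds x and wants y, B holds y and wants z,
# C holds z and wants x, so the graph has the cycle x -> y -> z -> x.
deadlocked = has_cycle({'x': ['y'], 'y': ['z'], 'z': ['x']})
print(deadlocked)                     # True
```

This is the "cure" side of the problem: a system can run such a check over who holds and requests what, then break the cycle by terminating one of the processes on it.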
+ Deadlock prevention: we must find a way to eliminate one of the four necessary conditions for deadlock:
  + Create enough resources so that there's always plenty for all.
    + It may not be possible to make enough of some resources, e.g. printer, tape drive, etc.
  + Make all resources shareable.
    + Some resources are not shareable - e.g. printer, tape drive.
  + Don't permit mutual exclusion.
    + Not feasible in general.
    + Virtualize non-shared resources (e.g. printer, card reader - spooling). (This does work.)
    + Use only uniprogramming.
  + Don't allow waiting and holding.
    + The process must crash if the resource is not immediately available. The phone company does this - if your call doesn't go through, it is dropped.
    + The process must request everything at once. (This requires sufficient knowledge - is this feasible?)
      + It will probably end up requesting too much, just to be safe. E.g. too much memory, or both input and output devices all at once, even though they are not used at the same time.
      + Starvation is possible - a process may wait indefinitely long until everything it needs is free.
      + It may not be possible - e.g. the system has 2 tape drives, and a process needs 2 for input and 2 for output. It must complete input and then request the output drives; it can't ask for all of them at once.
    + The process must release all current resources before requesting any new ones. (Some resources may not be releasable and reacquirable without restarting the process - e.g. printer, tape drive. Sometimes the process may not be restartable - e.g. it has modified a database record, such as the balance in a checking account.)
  + Allow preemption.
    + Some things can be preempted - e.g. memory, CPU (registers can be saved and restored), disk.
  + Make ordered or hierarchical requests. E.g. ask for all R1's, then all R2's, etc. All processes must follow the same ordering scheme. In that case, for there to be an arrow from Ri to Rj, j>i; i.e. there can't be a circular wait. (We can call this the resource request graph.) Of course, for this you have to know in advance what is needed.
    (This works too.)
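The ordered-request rule can be sketched directly with Python locks: give every resource a rank and make every thread acquire in increasing rank, so no arrow from Ri to Rj with j<i (and hence no cycle) can form. This is a minimal sketch; the `transfer` scenario and rank numbering are illustrative assumptions.

```python
import threading

# Give every resource a rank; all threads must acquire locks in increasing rank,
# which rules out the circular wait that causes deadlock.
locks = {1: threading.Lock(), 2: threading.Lock()}

def transfer(needed, log):
    """Acquire the needed locks lowest rank first, do the work, release in reverse."""
    for rank in sorted(needed):               # always lowest rank first
        locks[rank].acquire()
    log.append(tuple(sorted(needed)))         # the "use resources" step
    for rank in sorted(needed, reverse=True):
        locks[rank].release()

log = []
# Even though the two threads *ask* for the locks in opposite orders,
# both actually acquire rank 1 before rank 2, so they cannot deadlock.
t1 = threading.Thread(target=transfer, args=([1, 2], log))
t2 = threading.Thread(target=transfer, args=([2, 1], log))
t1.start(); t2.start(); t1.join(); t2.join()
print(sorted(log))                            # [(1, 2), (1, 2)]
```

Without the sorting, the two threads would be exactly the P(x);P(y) versus P(y);P(x) deadlock example from earlier.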