CS 162 Lecture Notes, Spring 2007
Prof. Alan Jay Smith
Part 1

Topic 0: Introduction

+ Name, office, office hours, office phone, other material on the introductory handout. Post first week office hours. Fill out questionnaire.
+ Course organization:
  + Lecture: discuss concepts, compare existing and proposed solutions.
  + Discussion sections will handle everything to do with programming and homework assignments.
  + Explain course admission policy.
+ This course uses things from many other courses: languages, hardware, data structures, algorithms.
+ "Operating system" is a hard term to define. Also called the "Control", "Monitor", "Executive", or "Supervisor". The discipline arose historically from a set of problems. It's easiest to introduce OSes by discussing their history.
+ History Phase 1 (hardware expensive, humans cheap, computer primitive):
  + In the beginning, the user worked at the console to debug (1948-1957). (Actually, the programs were entered with a paper tape reader, which was loaded with a boot loader keyed in from the console.) One user at a time, no overlap of computation and I/O. The OS first appeared as a subroutine library shared by all users. On the first machines, you added card decks for the stuff you needed.
  + Discuss Maurice Wilkes film.
  + Originally the OS was just some very simple subroutines, to do things like I/O. These subroutines were often linked at run time.
  + Note that the first thing that needs to be automated is I/O - imagine writing I/O code for everything you do! (I/O code would deal with everything at the byte level.)
+ History Phase 2 - machine now more powerful, but still expensive. Some real OS software exists:
  + The first machine I programmed on was an IBM 7040 running IBSYS, using the MAD (Michigan Algorithm Decoder) language. We submitted card decks to the operator, who loaded them into a big card reader behind a glass wall; we could also see the printer. The "job queue" was the stack of decks in the card reader - you could see when your job was about to start. You could also see how far it had run by how much had printed - there was no output spooling.
  + Note the picture of Alfred E. Neuman when an error occurred.
  + Simple batch monitor: get the user away from the computer. OS = program to load and run user jobs, take dumps. Makes better use of hardware, but more difficult to debug.
  + No spooling.
  + One job at a time. No multiprogramming.
  + No protection.
  + FIFO scheduling (order of card decks in the reader).
+ History Phase 3 - first "REAL" operating systems, e.g. OS/360.
  + Designed to provide all of the features felt to be necessary. All things for all people. To run a shared computing utility.
  + Included other mainframe systems - e.g. Kronos, Scope (CDC), Exec8 (Sperry?), GECOS (General Electric --> Honeywell).
  + Typical features:
    + Data channels, interrupts, overlap of I/O and computation. (Define each term.) Buffering and interrupt handling in the OS.
    + Multiprogramming: several users share the system.
    + Real scheduling algorithms. Job classes (by memory size and run time estimate). Small jobs finish fast.
    + Batch processing.
    + Memory protection and relocation. (Define each term.)
    + Spool card reader input and printer output on drum. (Define each term.)
    + OS must manage interactions between concurrent things.
  + OSes finally begin to be an important science. People got interested in OSes because they didn't work (e.g. Multics, OS/360). The first Symposium on Operating Systems Principles was in 1969.
+ History Phase 4 - shortcomings of large batch OSes realized (1965-1975):
  + Waste too much programmer time.
  + Systems too complex.
  + Systems becoming much cheaper per CPU cycle, people becoming more expensive. (Note that the crossover was in the 1970's, in terms of cost. Consider a $2M machine, with 4-year straight-line depreciation. That is $57/hour. With operating costs, it must be at least $100/hour to run it. In 1970, a programmer made about $7/hour. So machine time was 14 times more valuable than programmer time.)
  + Interactive timesharing. Developed in the 60's (CTSS, TSO, Multics, IBM TSS). 1970-1983 for widespread use.
    + Terminals are cheap, so let the user interact with the system. Two advantages:
      + 1. Much better use of people's time.
      + 2. Much faster software development.
    + Fancy filing systems. (You can keep data on the computer.)
    + Problems of response time (not enough CPU power) and thrashing (not enough memory).
+ History Phase 5 - the minicomputer (1970-83):
  + Cheaper machines, used by smaller groups and/or individuals, appeared. PDP-8 (1969) for dedicated use. PDP-11 for some time sharing. VAX 11/780 (1978) for widespread affordable time sharing.
  + Unix developed as a time sharing system for the minicomputer. It provided only a simplified version of the facilities of mainframe systems, in a much smaller, cheaper, less complex piece of code. It gave up much of the efficiency and optimization of mainframe systems.
+ History Phase 6 - personal computing (1983-...):
  + Computers are so cheap that each person has his/her own. Most of the operating system goes back to the technology of the early 1960s (mostly), and advances (with workstations) to the 1970s. (User interfaces are newer.)
  + Systems don't need many of the features of mainframe systems - e.g. fancy scheduling, accounting, etc.
  + Computer Science is a new field, but it isn't that new. Math goes back to Euclid, etc. CS goes back only to the 1940s.
  + PC development - in the late 1970s, Commodore and the Apple II used 8-bit micros (6502, 8080). The IBM PC - around 1980-82 - used the 8086.
+ Characteristics of current commercial OSes:
  + Enormous:
    + 100K's of lines (or >1M lines).
    + Windows NT is 20M lines.
    + Windows 2000 and Windows XP are about 40M lines.
    + 1000-10000 man-years (++).
  + Complex:
    + Asynchronous.
    + Hardware idiosyncrasies.
    + Conflicting needs of different users.
    + Performance is crucial.
  + Poorly understood:
    + The system outlives any of its builders.
    + Never fully debugged. (OS/360 was released each time with 1000 bugs.) In the interim, they issued "PTFs" - program temporary fixes.
    + Behavior is hard to predict; tuning is often done by guessing. (But great for research!)
  + Often unreliable.

Topic 1: Introduction to Processes; Dispatching

+ Operating systems have two general functions:
  + Coordinator: allow several things to work together in efficient and fair ways.
    + Resource allocation and management.
    + Sharing and protection.
    + This emphasizes things like scheduling algorithms, multiprogramming, I/O scheduling, deadlock prevention, etc.
  + "Extended machine": standard library: provide standard facilities that everyone needs. Provide a "nice interface". The bare machine only implements the "POO" (principles of operation). The extended machine has a file system, I/O, multiple processes with scheduling, virtual memory, etc.

Discussion of Coordination

Much of this course will deal with the coordination aspect: making many things work well together. This appears in several different areas.
+ Concurrency. The OS allows several users to be working at the same time, as if each had a private personal machine. Or, one user can be doing many things at the same time. To keep track of everything, the notion of a process was invented (to be defined soon).
+ I/O devices. We don't want the CPU to sit idle while an I/O device is working (especially something slow like a terminal). So, interrupts were invented:
  + Each I/O device has a little processor inside it so it can run independently.
  + The CPU issues commands to I/O devices, then goes off to do other things.
  + When the I/O device is finished, it issues an interrupt to the CPU: the CPU stops whatever it was doing, and goes to a special place to process the I/O device request.
  + Interrupts complicate matters inside the operating system. How can the OS make sure that an interrupt doesn't destroy the work that was interrupted? What happens when many interrupts occur at about the same time?
+ Memory. Each process needs to use some main memory. The OS coordinates their usage so they can share it. It swaps information back and forth between disk and main memory so the system can run even if the total memory needed by all processes is greater than the size of the memory.
+ Files. Each user owns a collection of files. The OS coordinates how space is used for files so that they can all fit on the same disk. It protects files from access by the wrong user.
+ Networks. Allow groups of workstations to work together.
+ Coordination is difficult because the needs of users conflict: some processes need large amounts of memory and small amounts of CPU time, some need lots of CPU time but little memory, etc. The OS must try to make all of them run reasonably well. How can we organize and conceptualize coordination so that sharing is done in a satisfactory way?
+ One way to simplify sharing and coordination is to present the following illusions:
  + A single processor is made to look like many separate processors, one per user.
  + A single memory is made to look like many separate memories, one per process, each much larger than the real memory.
  + Give every user the illusion that he has his own, unshared machine, with his own file system, I/O devices, etc.
  + This is much easier than making the sharing and coordination visible.
  + Sometimes the illusions don't work.
  + This is an example of decomposing a problem. Rather than make the sharing visible, give each user the illusion of unshared resources.
+ Important concept: decomposition. Given a hard problem, chop it up into several simpler problems that can be solved separately. (Recursion is a variation on this theme.)
+ With many things happening at once in a system, we need some way of separating them all out cleanly. That's a process.
+ What is a process?
  + "An execution stream in the context of a particular process state."
  + A more intuitive, but less precise, definition is just a running piece of code along with almost all the [local] things that the code can affect or be affected by.
    + Actually, we don't refer to "everything", but only to "local things" - this does not include other processes with which it may be communicating, etc.
  + Process state is [everything] that can affect, or be affected by, the process: it includes code, particular data values, open files, etc.
  + An execution stream is a sequence of instructions performed by a process.
+ Is a process the same as a program?
  + No, it's both more and less.
  + A program is the statements that a user writes, or a command he/she invokes.
  + More - a program is only part of the state; several processes may be derived from the same program. If I type "ls", something different happens than if you type it.
  + Less - one program may use several processes; e.g. cc runs other things behind your back.
  + A good analogy is the difference between a script and a play.
+ Some systems allow only one process (mostly primitive (personal) computers). They are called uniprogramming systems (not uniprocessing; that means only one processor). This makes it easier to write some parts of the OS, but many other things are hard to do. E.g. compile a program in the background while you edit another file; answer your phone and take messages while you're busy hacking. It is very difficult to do anything network-related under uniprogramming.
+ Most systems allow more than one process. They are called multiprogramming systems.
+ Note that the concept of process (which we will discuss further) seems to capture the concept we need for organizing coordination and sharing. Obviously, sharing and coordination are important mostly when there are multiple processes.
+ First, for each process it is necessary to know certain information. (That information varies from system to system.) When the process is not running, we keep that information in the process control block. We must save anything that could get destroyed in the meantime. For each process, the process control block holds:
  + Program counter.
  + Processor status word (condition codes, etc.) (i.e. system registers) (includes user/system status).
  + General purpose registers.
  + Floating-point registers.
  + Process ID.
  + All of memory?
    + In practice, we have a pointer to the memory info, rather than storing all of it in the process control block.
  + Scheduling information.
  + Accounting and other miscellaneous information.
  + Pointers to other information, such as lists of open files, file descriptors, etc.
+ The system organizes the process control blocks into the process table.
  + Process table: the collection of all process control blocks for all processes. In Unix the process table is a fixed-size array. In MVS, (I think) it is kept as a linked list.
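+ To make this concrete, here is a minimal C sketch of what a process control block might hold. The type and field names here are invented for illustration; a real PCB (e.g. the Unix "proc" structure) carries much more, and the details vary from system to system:

    /* Sketch of a process control block.  Field names and sizes are
     * hypothetical; register counts are machine dependent. */

    #include <stdint.h>

    #define NREGS  16                /* general purpose registers      */
    #define NFREGS 16                /* floating point registers       */

    enum proc_state { RUNNING, READY, BLOCKED };

    struct pcb {
        uint64_t pc;                 /* program counter                */
        uint64_t psw;                /* processor status word          */
        uint64_t regs[NREGS];        /* general purpose registers      */
        double   fregs[NFREGS];      /* floating point registers       */
        int      pid;                /* process ID                     */
        struct memmap *mem;          /* pointer to memory info, not    */
                                     /* the memory itself              */
        enum proc_state state;       /* scheduling state               */
        int      priority;           /* scheduling information         */
        long     cpu_ticks;          /* accounting information         */
        struct file *ofiles;         /* pointer to open file list      */
        struct pcb  *next;           /* link for process table/queues  */
    };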
+ Recall, we would like to give the image of one CPU per process.
+ Obviously, a single CPU can only be running one process at any one instant. We achieve the illusion of multiple CPUs by switching between various processes at short intervals.
  + (Remember the difference between human time and machine time. If you get a response from the machine in <.5 seconds, that is effectively instantaneous. On the other hand, a 2500 MIPS machine (2500 million instructions per second) can run 1,250,000,000 instructions in that time. If your request only required 1,000,000 instructions, then 1250 requests could have been processed in that period.)
+ What is necessary if sharing the CPU is to work? The OS must make sure that processes don't interfere with each other. This means:
  + Making sure each gets a chance to run (they don't wait indefinitely).
  + Making sure they don't modify each other's state (protection).
  + Making sure that they synchronize properly with each other.
+ Dispatcher/Scheduler: the portion of the OS that decides and initiates which process will run at any given time. (This is one of the "inner-most" parts of the OS.)
  + (Actually, the scheduler usually decides, and the dispatcher initiates the process. Like strategy vs. tactics.)
+ Typically, the dispatcher functions in the following loop (a C sketch appears at the end of this discussion):
  + Run a process for a while.
  + Save its state.
  + Load the state of another process.
  + Run it ...
+ How does the dispatcher decide which process to run next? There are several issues here:
  + Note that we can only schedule jobs that are "ready" (runnable).
  + We would like to have some way to select the "best" process to run to get the "best" performance. This is the "scheduling" problem, to be discussed later.
  + We would like to select a process quickly, according to some criteria (to be discussed later) - a data structure issue. E.g. we could search the process table linearly, but this might be slow. We could have a linear queue, and service jobs in a round-robin fashion.
+ The CPU can only be doing one thing at a time: if a user process is executing, the dispatcher isn't: the OS has lost control.
+ How does the OS regain control of the processor?
  + One approach: sleeping beauty (hope the process is Prince Charming and will wake you up).
  + Another approach: alarm clock (make sure you get woken up). What are alarm clocks? Interrupts.
  + Some action, either by the process itself, or by something external to the process, must cause the OS and/or the dispatcher (which is part of the OS) to take control. These events are typically part of the set of events called "EIT".
  + To regain control of the CPU, we generally rely on a timer interrupt.
+ An interrupt is an example of a class of operations called EIT - Exceptions, Interrupts and Traps.
  + Traps are synchronous events in the machine, such as a page fault, divide by zero, SVC, physical memory error, out of bounds addressing, hardware error, illegal instruction, etc. Some traps require that the instruction abort; others can be handled after the instruction completes (e.g. the trace trap).
  + Interrupts are asynchronous events. Typically I/O interrupts - the I/O device signals that it is done. Includes the timer.
  + The term exception refers to both interrupts and traps.
  + External events are usually called interrupts. They all cause a state switch into the OS.
    + Why? Because handling interrupts usually requires access to very sensitive parts of the machine state, such as control registers. If the user program could touch that machine state, there would be no protection. The user could destroy others' files, etc.
    + This means that user processes cannot take I/O interrupts as they occur, although they can write handlers to which some interrupts can be given (e.g. floating point).
+ To regain control of the CPU, we generally rely on a timer interrupt. Three types of timer:
  + Periodic - e.g. 60 times/second. (High overhead, low resolution.)
  + Time of day - interrupt when timer = TOD clock.
  + Elapsed time (interval timer) - interrupt when the timer decrements to zero.
+ How do we switch contexts between the user and the OS? We must be careful not to mess up process state while saving and restoring it.
+ Saving state: it's tricky because the OS needs some state to execute the state saving and restoring code. All machines provide some special hardware support for saving and restoring state:
  + Consider the problem of reloading the PSW (mode bit, address space pointer, protection info) and the program counter - no matter which order they're reloaded, the results will not be correct. We must reload both at once, using special hardware support.
  + PDP-11: the hardware doesn't know much about processes; it just moves the PC and PS to/from the stack. The OS then transfers to/from the PCB, and handles the rest of the state itself (this must be done carefully, using hand-coded assembler, to avoid overwriting state while saving it).
  + Intel 432: the hardware does all state saving and restoring into the process control block, and even dispatching.
  + CLIPPER: separate sets of registers for user and supervisor. The supervisor saves the user registers with no trouble - the supervisor is allowed to refer to the user registers, but not the converse.
  + In general, the machine hardware saves the essentials of the state - including the PC and PSW. It then directs through a vectored trap/interrupt table to the right routine.
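+ Putting the dispatcher loop and state saving together, here is a highly simplified C sketch. All names here (dispatch, select_next, swtch, set_timer, TIME_SLICE) are hypothetical, not any particular system's code, and swtch would have to be hand-coded assembler for exactly the reasons just discussed - C itself needs registers to run:

    /* Sketch of the dispatcher loop.  The helper routines are
     * prototypes only; their bodies are machine dependent. */

    #define TIME_SLICE 10          /* quantum, in timer ticks          */

    struct pcb;                    /* as in the PCB sketch above       */

    struct pcb *select_next(void); /* scheduling policy: pick a ready  */
                                   /* process (discussed later)        */
    void swtch(struct pcb *prev, struct pcb *next); /* asm: save prev's */
                                   /* registers, reload next's         */
    void set_timer(int ticks);     /* arm the interval timer           */

    struct pcb *current;           /* the process now holding the CPU  */

    /* Entered from the timer interrupt or a blocking system call,
     * after the hardware has already saved the PC and PSW for us. */
    void dispatch(void)
    {
        struct pcb *prev = current;
        struct pcb *next = select_next();
        if (next != prev) {
            current = next;
            set_timer(TIME_SLICE); /* alarm clock: get control back    */
            swtch(prev, next);     /* returns later, as prev, when     */
                                   /* prev is next dispatched          */
        }
    }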
+ Suppose we want a new process. One approach is to create a process from scratch:
  + Creating a process from scratch:
    + Create an (empty) call stack.
    + Create and initialize the process control block (or reuse an existing one).
    + Load code and data into memory.
    + Make the process known to the dispatcher - put it on some list of processes.
+ Another is to make a copy of an existing process. In Unix and Toy this is called forking a process. (The word "fork" is used to indicate a bifurcation.)
  + Forking: we want to make a copy of an existing process. (Warning: this may not be quite the same as fork from the book, or fork from a programming language fork-join pair.) For example, the shell makes a copy of itself for each command. One copy waits around; the other goes off and executes the command.
  + Three steps to Unix fork:
    + (1) Allocate and initialize a new PROC structure for the child process. (Give it a new process ID.)
    + (2) Duplicate the context of the parent process (including the user structure and virtual memory resources) for the child.
      + The child process gets open files, signal state, the scheduling parameters (e.g. "nice"), disk quota info.
    + (3) Schedule the child process to run.
+ What's missing?
  + The two processes are exactly the same, so this isn't very interesting. We don't usually want to do the same thing twice.
  + The usual way a fork is done in Unix is that after the process is "copied", some other program is copied into that process's memory, and executed (exec). (See the example below.)
+ We aren't that interested in having two copies of that program.
  + An improvement is to use VFORK, which doesn't copy the original process before overlaying the new one. This saves a copy operation.
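+ Here is the standard Unix fork-then-exec pattern in C, roughly as a shell would use it for a command like "ls -l". fork, execlp, and waitpid are the real POSIX calls; the program around them is just an illustration:

    /* Fork-then-exec: parent copies itself, child overlays a new
     * program, parent waits for the command to finish. */

    #include <stdio.h>
    #include <stdlib.h>
    #include <unistd.h>
    #include <sys/wait.h>

    int main(void)
    {
        pid_t pid = fork();          /* duplicate this process         */
        if (pid < 0) {
            perror("fork");          /* no new PROC structure          */
            exit(1);
        }
        if (pid == 0) {              /* child: identical state...      */
            execlp("ls", "ls", "-l", (char *)0); /* ...until overlaid  */
            perror("execlp");        /* reached only if exec failed    */
            _exit(1);
        }
        waitpid(pid, NULL, 0);       /* parent: wait around            */
        printf("child %d finished\n", (int)pid);
        return 0;
    }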
  + See figure.

Topic: CPU Scheduling

+ The OS makes two related kinds of decisions about resources:
  + Allocation: who gets what. Given a set of requests for resources, which processes should be given which resources in order to make the most efficient use of the resources, while avoiding deadlock? The implication is that the resources aren't easily preemptible.
  + Scheduling: providing preemptible resources - usually CPU and memory. When more resources are requested than can be granted immediately, in which order should they be serviced? Examples are processor scheduling (one processor, many processes) and memory scheduling in virtual memory systems. The implication is that the resource is preemptible.
+ Resource #1: the processor.
+ Processes may be in any one of three general scheduling states:
  + Running.
  + Ready. That is, waiting for CPU time. The scheduler and dispatcher determine transitions between this and the running state.
  + Blocked. Waiting for some other event: disk I/O, message, semaphore, user input, etc.
  + See figure.
+ For the purposes of understanding scheduling, we want to create a more sophisticated model of how jobs and processes are handled:
  + See figure.
+ Job - all of the computations associated with a single "submission" - e.g. one command from your terminal. That command may result in several processes - e.g. a pipe, etc. It may invoke a whole command file. The job is the unit of scheduling.
  + Note - in queueing theory, one uses the term "job". When talking about synchronization and deadlock, one talks about "processes". We really mean the same thing. In both cases, this is the unit of scheduling.
+ Hold queue - holds jobs, usually batch, until the system is prepared to give them some service. Used to avoid overloading the machine. Jobs can be held here to avoid deadlock. Resources are allocated here.
+ Runnable queue - processes which are considered for scheduling, but which are not in memory yet.
+ In-memory queue - processes which are (mostly) in memory; these can be scheduled very cheaply.
+ Functions of the job scheduler and dispatcher:
  + Keep track of job status (which queue, elapsed time, priority, etc.)
  + Choose which jobs will "run". (We'll discuss this further.)
  + Allocate necessary resources (memory, I/O devices, etc.)
  + Deallocate resources when necessary.
+ The terms scheduler and dispatcher are often used interchangeably. Sometimes a distinction is made, as follows:
  + The (job) scheduler usually determines which jobs go from the hold queue to the runnable queue, and from the runnable queue to the in-memory queue.
  + The dispatcher usually determines which process in the in-memory queue gets to actually run.
  + Note - most of the time, we won't distinguish between "scheduler" and "dispatcher".
+ Goals for scheduling disciplines:
  + Efficiency of resource utilization. (Keep the CPU and disks busy.)
  + Minimize overhead. (Context swaps.)
  + Minimize response time. (Define response time.)
  + Maximize throughput. (Define throughput.) (Note - throughput in an open system is invariant with the scheduling algorithm - covered later.)
  + Provide a guaranteed fraction of the machine to specific groups.
  + Minimize the number of waiting processes, or interactive users.
  + Get short jobs through quickly.
  + Avoid starvation. (Note - real "starvation" only occurs if rho > 1. We can call this "hunger".)
  + Minimize the variance of response time - i.e. make run time predictable, as a function of the CPU time required.
    + Note that this is different than minimizing the variance of run time when averaged over all jobs.
  + Satisfy external constraints.
  + Meet deadlines.
  + Graceful degradation under overload.
  + Provide fairness??? - NO!! - both round robin and first-come, first-served are fair, but are not similar.
  + Real goal: maximize user satisfaction, or (more realistically) minimize complaints. (Remember that not all complaints are equally valid - who pays the bills?)
  + BUT - we need an objective measure to optimize - we can't run psychological tests - so we use one of the above.
  + Typically we use minimum flow time (also called minimum waiting time) - minimize the time from the submission of the job (command) to completion. f(i) is the flow time of the i'th job. Minimize ave(f(i)).
  + A secondary goal is to minimize the variance of f(i)/s(i), where f(i) is the flow time of the i'th job, and s(i) is the service time of the i'th job.
+ Typical external constraints:
  + Fixed priorities - determined administratively, or by purchase (i.e. pay for a certain priority) - e.g. Livermore.
  + Fixed fraction of the machine - e.g. Livermore.
  + Realtime - e.g. shoot down a missile, control an industrial process. (Deadlines.)
+ Open vs. closed systems:
  + An open system is one in which the arrival rate of customers to the system is unaffected by the number in the system (in the queue).
  + A closed system is one in which the total number of customers (in the system, and in the external world) is constant, so the more that are in the system, the fewer that can be arriving.
+ Throughput - the number of jobs completed per second.
  + Throughput in an open system is invariant with regard to the scheduling algorithm (if rho < 1).
    + Explain (if rho < 1).
  + Throughput in a closed system varies with the scheduler.
  + See figure.
+ User characteristics:
  + The longer that a request should take, the longer the user is willing to wait. I.e. an "ls" or "jobs" command should get an instant response. For a compile or troff, the user is willing to wait a while. For a command that might take 10 times as long, the user is usually willing to wait more than 10 times as long.
  + Users like predictability.
    + (I.e. if the user knows it will take a while, he'll think about something else, or go to lunch, etc.)
  + A user at a terminal is very impatient.
+ For the discussion of scheduling, we will use the simple model of figure 1 - the scheduling problem is to decide which of the ready processes actually gets to run. The complexity of the full model of figure 2 is too much.
+ What is the simplest possible scheduling algorithm?
  + FIFO (also called FCFS): run until finished.
    + In the simplest case this means uniprogramming.
    + Usually, "finished" means "blocked". One process can use the CPU while another waits on a semaphore. Go to the back of the run queue when ready.
  + Advantages:
    + Lowest variance of overall waiting time (f(i)), "fair", simple, low overhead (no switching).
  + Problem: one process can monopolize the CPU. Short jobs wait for long jobs. Minimal resource sharing - if we gave priority to jobs that needed other resources, they would finish with the CPU quickly and do something else.
    + Provides a high variance of f(i)/s(i), and a low variance of f(i).
+ What are jobs like? Are there really short jobs and long jobs?
+ Job characteristics:
  + Job run times are highly skewed. I.e. there are lots of short jobs and a few long jobs. (Show distribution curve.) Let r(t) be the run time distribution.
  + The expected (!) time that a job has remaining to run is an increasing function of the time it has already run.
    + The expected time to the next I/O is an increasing function of the time since the last I/O.
    + The expected time to a page fault is an increasing function of the time since the last page fault.
    + The expected time to a trap or interrupt is an increasing function of the time since the last trap or interrupt.
  + E(t) = integral from t to infinity of [(x - t) r(x) / (1 - R(t))] dx,
    where r(t) is the run time distribution, R(t) is its cumulative distribution, and E(t) is the expected time to completion, given run time t so far.
  + (Example - someone always tells you that something will be done "tomorrow" - after a while you believe it will be quite a while.)
+ Solution: limit the maximum amount of time that a process can run without a context switch. This time is called a time slice.
+ Round robin: run a process for one time slice, then move it to the back of the queue. Each process gets an equal share of the CPU. Most systems use some variant of this. (A toy simulation appears at the end of this round-robin discussion.) What happens if the time slice isn't chosen carefully?
  + Too long: one process can monopolize the CPU. (In the limit, this becomes FCFS.)
  + Too small: too much time wasted in context swaps.
    + Overhead: swapping CPU state and the cache.
  + Advantages:
    + Allows short jobs to get through quickly, "fair", reasonably simple, low variance of f(i)/s(i). No starvation.
  + Disadvantages:
    + Relatively high overhead; not as good as some other schemes.
  + Originally, Unix had 1 sec. time slices. Too long. Most timesharing systems today use time slices of around .1 to .5 seconds.
+ Round robin can be used to implement priorities:
  + We can give different priority processes different time slices.
+ Even round robin can produce bad results occasionally.
  + Example of ten processes, each requiring exactly 100 time slices: they each take about 1000 time slices to finish, whereas under FIFO they would average 500 time slices to finish. It is fair, but uniformly inefficient.
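+ The toy round-robin simulation promised above: three jobs with made-up service times (3, 30, and 6 units) all arrive at time 0 and share the CPU in 2-unit slices, with context-switch overhead ignored. The short jobs finish long before the long one:

    /* Toy round-robin simulation.  Workload and slice length are
     * invented; scanning the jobs cyclically is equivalent to a
     * RR queue when all jobs arrive together. */

    #include <stdio.h>

    int main(void)
    {
        int remaining[] = {3, 30, 6};  /* service time of each job     */
        int finish[3];                 /* flow time (arrival is 0)     */
        int n = 3, done = 0, clock = 0, slice = 2;

        while (done < n) {
            for (int i = 0; i < n; i++) {      /* one RR cycle         */
                if (remaining[i] <= 0) continue;
                int run = remaining[i] < slice ? remaining[i] : slice;
                clock += run;                  /* job i holds the CPU  */
                remaining[i] -= run;
                if (remaining[i] == 0) {       /* job leaves system    */
                    finish[i] = clock;
                    done++;
                }
            }
        }
        for (int i = 0; i < n; i++)    /* prints 7, 39, 15: the short  */
            printf("job %d flow time %d\n", i, finish[i]); /* jobs win */
        return 0;
    }

  Under FIFO the same jobs would finish at 3, 33, and 39; round robin trades a little mean flow time here for getting the 3- and 6-unit jobs out quickly.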
+ We're using as our goal minimizing flow time. That can be shown to be equivalent to minimizing the number of users in the system. This is due to a formula called Little's Formula, which says:

    N = lambda*F

  where
  + N is the mean (average) number of users in the system,
  + F is the mean (average) flow (waiting) time, and
  + lambda is the arrival rate of users (1/lambda = mean time between arrivals).
  (The first guy to prove this formula correct was J.D.C. Little.) (Also written L = lambda*W.)
  + Note - we are always assuming rho < 1; i.e. the system is not overloaded.
+ This implies that to minimize the flow time, we can do something that minimizes the number of users in the system.
+ The obvious solution is to use Shortest Job First (SJF) or Shortest Processing Time (SPT) scheduling - run (to completion) the job which has the shortest processing time. This is the fastest way to get users out of the system, and reduce N. (A small worked comparison with FIFO appears below.)
  + Advantages:
    + Optimal (minimum average flow time) if there is no preemption and no knowledge of future arrivals. (I.e. it minimizes mean flow time.)
  + Disadvantages:
    + Requires knowledge of the future.
    + Could do much better if we used preemption - with SJF, new arrivals wait, even if they are short.
    + High variance of flow time - long jobs wait.
+ What is the very best we can do? SRPT: Shortest Remaining Processing Time - with preemption. It differs from SJF because we can preempt a job when a shorter one arrives.
  + Advantages:
    + This minimizes the average flow time (response time, waiting time).
  + Disadvantages:
    + Requires future knowledge.
    + Overhead due to preemptions (but only one per job).
    + High variance of flow time; long jobs wait.
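+ The promised worked comparison: with all jobs present at time 0 and run to completion, sorting by service time (SJF) gives a much lower mean flow time than arrival order (FIFO), because fewer jobs sit in the system on average (Little's Formula again). The three service times are made up:

    /* FIFO vs. SJF mean flow time, jobs all arriving at time 0. */

    #include <stdio.h>
    #include <stdlib.h>

    static double mean_flow(const int *svc, int n)
    {
        double clock = 0, total = 0;
        for (int i = 0; i < n; i++) {
            clock += svc[i];    /* job finishes when its service ends */
            total += clock;     /* flow time = finish - arrival (0)   */
        }
        return total / n;
    }

    static int by_size(const void *a, const void *b)
    {
        return *(const int *)a - *(const int *)b;
    }

    int main(void)
    {
        int fifo[] = {30, 3, 6};               /* arrival (FIFO) order */
        int sjf[]  = {30, 3, 6};
        int n = 3;

        qsort(sjf, n, sizeof sjf[0], by_size); /* SJF: shortest first  */
        printf("FIFO mean flow time: %.1f\n", mean_flow(fifo, n)); /* 34.0 */
        printf("SJF  mean flow time: %.1f\n", mean_flow(sjf, n));  /* 17.0 */
        return 0;
    }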
+ Unfortunately, SRPT requires knowledge of the future. Instead, we can use past performance to predict future performance.
  + Remember - if a process has already taken a long time, it will probably take a long time more.
+ This suggests an algorithm called Shortest Elapsed Time (SET). Run the job that has run the shortest amount of time so far - this can be approximated by time slicing between equally long-run processes.
  + Advantages:
    + Permits short jobs to get out fast.
    + Variance of f(i)/s(i) reasonably low(?).
  + Disadvantages:
    + High overhead; very bad if the run time distribution is not highly skewed (i.e. if most jobs have the same run time); high variance - long jobs wait - high variance of f(i).
+ Foreground/background (FB) scheduling - have two queues. The first queue (foreground) has priority over the background. (See figure 3.)
  + Any scheduling algorithm can be used for each of the queues, but typically there is a time slice. Jobs that have run for a while are placed back in either the foreground or background queue.
  + Assignment to foreground or background can be decided in various ways: e.g. [interactive / batch], [foreground (Unix) / background (Unix)], [I/O intensive / computation intensive], [newly arrived / long running], [non-deadline / deadline].
  + Allows good performance - we can assign to foreground or background based on performance considerations. It can also be used to express priorities.
  + Foreground/background is a very good way to get CPU-I/O overlap - put I/O intensive processes in the foreground.
+ Multilevel Foreground/Background (MLFB) - like foreground/background, but with several queues, not just two. (See figure.)
  + Jobs can be assigned to a level based on any of the above criteria - e.g. run times, priorities, use of I/O, etc.
+ Exponential queue - a special type of MLFB queue. Also called multilevel feedback queues. (A C sketch of the rule follows the BSD discussion below.)
  + Give a newly runnable process a high priority and a very short time slice. If the process uses up the time slice without blocking, then decrease its priority by 1 and double its time slice for next time.
    + The reason for increasing the quantum is to decrease overhead: less switching overhead, and less overhead to reload into memory. I.e. once a job is promoted to the in-memory queue, let it run for a while, to amortize the overhead.
  + The CTSS system (MIT, early 1960's) was the first to use exponential queues; it used 4 queues.
  + You can further adjust this algorithm (as described below in "fair-share scheduling") by allowing a process to gain merit (move up in priority) the longer it waits. Something like this is used in TSO. This is the general approach to avoiding starvation - low priorities move up with waiting time.
+ Adaptive scheduling algorithms - ones which change the priority of a job depending on its run time behavior - e.g. FB, MLFB, SET, exponential queues, etc.
+ Another scheduling algorithm - sometimes called "fair-share scheduling" (similar to what's implemented in VAX Unix):
  + Keep a history of recent CPU usage for each process.
  + Give the highest priority to the process that has used the least CPU time recently. Highly interactive jobs, like vi, will use little CPU time and get high priority. CPU-bound jobs, like compiles, will get lower priority.
  + Could have a formula like: priority = wait_time - 10*cpu_use.
  + TSO uses something like this.
  + Can adjust priorities by changing "billing factors" for processes. E.g. to make a high-priority process, only use half its recent CPU usage in computing the history.
  + A fair-share scheduler can be implemented in an MLFB or exponential queue system by promoting jobs as they wait, and demoting them as they use CPU time.
+ BSD Unix scheduling:
  + Uses multilevel feedback queues. There are 128 priority levels, but to minimize overhead, processes are grouped into groups of 4 levels, so it is implemented with 32 queues.
  + The system runs the job at the front of the highest priority occupied queue. A job is run for a quantum. When a job exhausts its quantum, it is put on the back of the same queue. The time quantum for 4.3BSD is 0.1 seconds - found to be the longest that didn't produce "jerky" response. A higher priority process is only started when the quantum of the currently running process ends (except that a sleeping process that wakes up can preempt the running process after a system call). DEC Unix is about the same.
  + User priority = PUSER + PCPU + PNICE (approximately), where:
    + PUSER - based on the type of user - e.g. OS processes are higher priority.
    + PCPU - a weighted load factor that increases as this process uses CPU time. (Uses a decay filter.)
    + PNICE - a number that can be used to "nice" (reduce in priority) a job.
  + Priority levels 0-49 are reserved for system processes; 50-127 are for user processes.
+ Hypothesis: DEC Unix uses a very large quantum for workstations to make them unusable for timesharing.
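+ The promised sketch of the exponential-queue rule in C: a process that uses its whole slice drops one priority level and gets twice the slice next time, while one that blocks early keeps its high priority. The level count, initial slice, and all names are invented for illustration, not any real system's parameters:

    /* Exponential queue (multilevel feedback) demotion rule. */

    #include <stdio.h>

    #define LEVELS 4

    struct job {
        int level;          /* 0 = highest priority queue             */
        int slice;          /* current time slice, in ticks           */
    };

    /* Called when a job stops running; used_full_slice says whether
     * it exhausted its quantum or blocked early (interactive). */
    void requeue(struct job *j, int used_full_slice)
    {
        if (used_full_slice && j->level < LEVELS - 1) {
            j->level += 1;  /* demote: it looks CPU bound...          */
            j->slice *= 2;  /* ...but amortize overhead with a longer  */
                            /* quantum at the lower level              */
        }
        /* else: it blocked early; keep its high priority */
    }

    int main(void)
    {
        struct job j = {0, 1};  /* newly runnable: top queue, tiny slice */
        for (int run = 0; run < 5; run++) {
            printf("run %d: level %d, slice %d\n", run, j.level, j.slice);
            requeue(&j, 1);     /* pretend it always uses its slice   */
        }
        return 0;               /* levels 0,1,2,3,3; slices 1,2,4,8,8 */
    }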
+ Scheduling countermeasures:
  + Most successful scheduling algorithms assume that there are lots of short jobs and a few very long jobs; they give priority to short jobs.
  + A general approach to "beating" scheduling algorithms is to make long jobs look like lots of short jobs. (E.g. one big compile --> many small compiles; one big troff --> many small troffs.)
  + Do spurious I/O (to /dev/null) to get promoted to a higher queue.
  + This is considered antisocial - don't do it!
+ Summary:
  + In principle, scheduling algorithms can be arbitrary, since the system should produce the same results in any event.
  + However, the algorithms have strong effects on the system's overhead, throughput, and response time.
  + The best schemes are adaptive and preemptive. To do the absolute best, we'd have to be able to predict the future.
  + The best scheduling algorithms tend to give the highest priority to the processes that need the least!

Discrete Event Simulation

+ Event list.
+ LOOP: get event, update stats, update state, update event list, continue.
+ M/G/1 example (sketched below).
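+ A minimal discrete-event simulation in C of one M/G/1 instance - here M/D/1: Poisson arrivals, fixed service time - following the loop above. All parameters are arbitrary; as a plausibility check, with lambda = 0.5 and service = 1 the Pollaczek-Khinchine formula predicts a mean flow time of 1.5, which the program should approximate:

    /* Discrete-event simulation of an M/D/1 queue.  The "event list"
     * here is just the two pending events: next arrival and next
     * departure.  Compile with -lm. */

    #include <stdio.h>
    #include <stdlib.h>
    #include <math.h>

    #define QCAP 65536              /* queue capacity (power of two);  */
                                    /* ample at rho = 0.5              */

    static double exp_rand(double rate)   /* exponential spacing of a  */
    {                                     /* Poisson arrival stream    */
        double u = (rand() + 1.0) / ((double)RAND_MAX + 2.0);
        return -log(u) / rate;
    }

    int main(void)
    {
        double lambda = 0.5, service = 1.0;  /* rho = 0.5 < 1          */
        long   target = 100000, done = 0;

        double clock = 0.0, total_flow = 0.0;
        double next_arrival = exp_rand(lambda);
        double next_departure = INFINITY;    /* server starts idle     */
        double arr_time[QCAP];               /* FIFO queue of arrivals */
        unsigned head = 0, tail = 0;

        while (done < target) {
            if (next_arrival <= next_departure) {   /* event: arrival  */
                clock = next_arrival;
                arr_time[tail++ & (QCAP - 1)] = clock;
                if (next_departure == INFINITY)     /* server was idle */
                    next_departure = clock + service;
                next_arrival = clock + exp_rand(lambda);
            } else {                                /* event: departure */
                clock = next_departure;
                total_flow += clock - arr_time[head++ & (QCAP - 1)];
                done++;
                next_departure = (head == tail) ? INFINITY
                                                : clock + service;
            }
        }
        printf("mean flow time: %f\n", total_flow / done);
        return 0;
    }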