////////////////// CS 162 - Lecture 3 //////////////////
//////////////////////////////////////////////////////////////////////////////////////////
CPU Scheduling:
//////////////////////////////////////////////////////////////////////////////////////////

CPU scheduling's purpose is to satisfy the needs of the user by efficiently allotting computer resources to different jobs. This often means that CPU scheduling must:

-Minimize flow time - the time between job arrival and completion.

-Give it some predictability - Users expect long jobs to take longer to service (compiling code), while short jobs should be serviced in a very short amount of time (ls) and appear almost instantaneous (<50ms). Basically, the longer a request should take, the longer the user is willing to wait. For example, people won't get angry waiting for DVDShrink to encode video files, whereas if Firefox 1.0 hangs while opening a PDF, heads may fly.

-Satisfy external constraints (meet certain priorities set by the user, such as giving each user x% of the machine, or meeting deadlines like finishing trajectory calculations for a missile before it launches).

-But this does not necessarily mean that CPU scheduling must provide "fairness." Round Robin (RR) and First Come First Serve (FCFS) are both "fair" strategies, but they are not similar. RR ensures that no job faces starvation (never gets CPU time), but at the same time it causes long jobs to take even longer to finish. Similarly, FCFS serves each job as it comes in, but shorter jobs stuck behind a long job are forced to wait. (In FCFS, imagine that the CPU is compiling some code and you type "ls" into the command line. Having "ls" not complete until the compile is done will not make the user happy.)
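The compile-vs-ls point can be made concrete with a toy flow-time comparison. This is a sketch, not part of the lecture: it assumes all jobs arrive at time 0 and that context switches cost nothing, and the job lengths are made up.

```python
# Toy comparison of mean flow time under FCFS and round robin (RR).
# Simplifying assumptions (not from the lecture): all jobs arrive at
# time 0 and context switches are free.

def fcfs_flow_times(run_times):
    """Flow time (arrival to completion) of each job, served in order."""
    flows, clock = [], 0
    for t in run_times:
        clock += t           # job runs to completion before the next starts
        flows.append(clock)
    return flows

def rr_flow_times(run_times, quantum=1):
    """Flow time of each job under round robin with the given time slice."""
    remaining = list(run_times)
    flows = [0] * len(run_times)
    pending = list(range(len(run_times)))  # indices still in the queue
    clock = 0
    while pending:
        still_running = []
        for i in pending:                  # one slice per job per pass
            clock += min(quantum, remaining[i])
            remaining[i] -= min(quantum, remaining[i])
            if remaining[i] == 0:
                flows[i] = clock           # job just finished
            else:
                still_running.append(i)    # back of the queue
        pending = still_running
    return flows

jobs = [100, 1, 1, 1]   # one long "compile" and three short "ls"-like jobs
print(fcfs_flow_times(jobs))  # the short jobs all wait behind the long one
print(rr_flow_times(jobs))    # the short jobs finish within a few slices
```

Under FCFS every short job's flow time exceeds 100; under RR the short jobs finish within the first few slices, at the cost of the long job finishing slightly later.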
[See Scheduling Algorithms for more info on RR and FCFS]

//////////////////////////////////////////////////////////////////////////////////////////
Open Systems Versus Closed Systems
//////////////////////////////////////////////////////////////////////////////////////////

  infinite -> system -> out          Fig 1: open system

  finite world -> system --+
        ^                  |
        +------------------+         Fig 2: closed system

Open System - In an open system the arrival rate is unaffected by the number of jobs that are in the system (the jobs that are in the queue). There are simply an infinite number of jobs that may arrive, and the queue can expand to an infinite size.

Closed System - In a closed system the total number of customers (in the system and in the external world) is constant, so the more there are in the system, the fewer can be arriving. This model is closer to reality.

Throughput - the number of jobs completed per second. The faster jobs get done, the faster new jobs can arrive. This means that in a closed system throughput affects the arrival rate, and the arrival rate affects throughput. Also note that scheduling only matters in a closed system, since that is where the arrival rate and the throughput affect each other.

/// p ///
p = utilization = (arrival rate)/(service rate)

If the arrival rate is always greater than the service rate, then the queue will grow without bound. When p < 1, we can handle all the jobs as they come in, and the throughput simply matches the arrival rate.

//////////////////////// Examples ///////////////////////

While the notion that the arrival rate affects throughput is counter-intuitive, think about it with this example:

[For Closed Systems] Imagine a commute during rush hour. During this commute there are a finite number of drivers who wish to cross the Bay Bridge. If there is no traffic, the drivers can zip along at 65mph.
There aren't many cars on the bridge, which allows the cars to cross the bridge faster (higher throughput) and thus allows more new cars to get onto the bridge (a higher arrival rate). On the other hand, imagine that the Bay Bridge is jammed bumper to bumper. There are more cars on the bridge, but their speed has trickled down to 10mph (lower throughput). Since traffic has slowed, fewer new cars can get onto the bridge (a lower arrival rate).

[How scheduling affects throughput] It is also easy to imagine this in checkout lines at the supermarket. The fast lane of 15 items or fewer may handle fewer items than a normal line, but it also allows each customer to finish faster (higher throughput), which lets more people enter the line (a higher arrival rate). The normal lines, on the other hand, handle more items, but each customer takes longer (lower throughput) and fewer people enter (a lower arrival rate, since a customer doesn't want to wait in a long line).

Imagine then that the supermarket got rid of the express lanes, and that the line is comprised of a few people with 200 items and the rest with 10 items or fewer. If we used FCFS scheduling, then throughput could be slowed if a customer with many items ends up ahead of customers with fewer items. If throughput slows, then the arrival rate slows. If we instead used RR scheduling in that same line, where the cashier rings up 10 items at a time, this would speed up throughput. People with 10 items or fewer get through the line, and people with many items get 10 rung up and are sent to the back of the line with the rest of their groceries. Throughput increases, which increases the arrival rate of new customers, but this would undoubtedly make the customers who have many items very angry. This example should also point out that CPU scheduling does not necessarily need to be "fair," and that "fair" means different things to different users.
For the customers with many items FCFS is fairer, and for the people with fewer items RR is fairer. Just think of what people would do if the post office implemented RR rather than FCFS. There would be even more loud and angry people at the back of the line all the time.

//////////////////////////////////////////////////////////////////////////////////////////
Job Characteristics
//////////////////////////////////////////////////////////////////////////////////////////

-Job run times are highly skewed (lots of short jobs and a few long jobs). Users in general spend a good amount of time reading text, browsing through directories, or just thinking about what to do next, rather than constantly hammering the computer to compute things.

-The expected time a job has remaining to run is an increasing function of the time it has already run. The more we know about a job, the better we can predict its run time and the better we can schedule it. But the only information our scheduler (for today) has to estimate remaining run time is the amount of time the job has already run. So, if a job has already run for a good amount of time, it is more and more likely that it isn't a short job - if it were a short job, it would have completed by then. Thus, the more time a job has run, the more probable it is that the job will take even longer to finish.

This is a bit like finding out whether a person is a procrastinator. Give someone a homework assignment. If they don't finish the homework in the first week, it might take 2 weeks for them to complete. If it isn't done after those two weeks, it might take another month. If there is no deadline to turn it in, the homework may never be completed at all.
This concept is captured by the expected remaining run time:

  E(t) = integral from t to infinity of [ (x - t) r(x) / (1 - R(t)) ] dx

where:
  r(t) = probability density function of run time
  R(t) = cumulative distribution function of run time

Similarly, we can also expect the following:
-The expected time to the next I/O operation is an increasing function of the time since the last I/O, e.g. if the user is typing, it is very likely that the user will hit another keyboard key.
-The expected time to a page fault is an increasing function of the time since the last page fault.
-The expected time to a trap is an increasing function of the time since the last trap.

//////////////////////////////////////////////////////////////////////////////////////////
Scheduling Algorithms
//////////////////////////////////////////////////////////////////////////////////////////

[First Come First Serve (FCFS)]
First Come First Serve (FCFS) does what the name implies: the scheduler takes jobs as they come in and completes them one by one.
-The simplest case of this is uniprogramming, which essentially means that the CPU is limited to doing one thing at a time.

Advantages: We schedule a job only once, it has the lowest variance of overall flow time, it's "fair," simple, and it has low overhead (no switching).

Disadvantages: We must wait until a job is done, regardless of the time it takes to finish. A process can monopolize the CPU; there is minimal resource sharing (the CPU could be doing other things at the same time, but it doesn't); and it has a high variance of flow time over service time, f(i)/s(i), because short jobs stuck behind long ones wait far out of proportion to their own length.

[Round Robin (RR)]
A way to improve on FCFS is to limit the maximum amount of time that a process can run without a context switch. We call this a time slice. When a job has exceeded the maximum run time, it is sent to the back of the queue. A simple way to implement RR is to run each process for one time slice before moving the job to the back of the queue. Each process gets an equal share of the CPU.
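The queue mechanics just described can be sketched in a few lines. This is a toy model (job names and run times are invented), and it counts preemptions as a rough proxy for the context-switch overhead discussed next.

```python
from collections import deque

# Sketch of the RR mechanism described above: run the job at the head of
# the queue for at most one time slice, then move it to the back if it
# hasn't finished. Preemptions approximate context-switch overhead.
# (Job names and run times are made up for illustration.)

def round_robin(jobs, quantum):
    """jobs: list of (name, run_time). Returns (completion order, preemptions)."""
    queue = deque(jobs)
    order, preemptions = [], 0
    while queue:
        name, remaining = queue.popleft()
        remaining -= min(quantum, remaining)   # run for one slice
        if remaining > 0:
            queue.append((name, remaining))    # back of the queue
            preemptions += 1
        else:
            order.append(name)                 # finished within its slice
    return order, preemptions

jobs = [("compile", 10), ("ls", 1), ("cat", 2)]
print(round_robin(jobs, quantum=1))   # short jobs finish early, many preemptions
print(round_robin(jobs, quantum=10))  # degenerates to FCFS, zero preemptions
```

With a slice of 1 the short jobs complete first at the cost of many preemptions; with a slice as long as the longest job, RR behaves exactly like FCFS.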
Most systems use some variant of this, but the size of the time slice (or time quantum) can play a very important role.

-If it's too long, a process can monopolize the CPU (which becomes very similar to FCFS). If the time quantum is too large, the user can detect the delay when the CPU runs a long job before running a short one. (Originally Unix had 1-second time slices, which were too long; most time-sharing systems today use slices around 0.1 to 0.5 seconds.)
-If we make the time quantum too small, we run into overhead issues, which also slows down the completion of jobs.
-Round Robin can be modified to implement priorities, where processes of different priority get different time slices.
-Round Robin depends on skewed job lengths; if all jobs are similar in length, then there is no advantage in using it (FIFO would do a better job on uniform jobs, since it has low overhead).

Advantages: Allows short jobs to get through pretty quickly, "fair," reasonably simple, low variance of f(i)/s(i). No starvation (all jobs will eventually get the CPU).

Disadvantages: Long jobs take even longer to complete. Relatively high overhead; not as good as some other schemes.

[Shortest Job First (SJF) / Shortest Processing Time (SPT)]
Jobs tend to be skewed, to the point where there are a lot of little jobs and a few long jobs. Our goal is minimizing flow time, which is equivalent to minimizing the number of users in the system. Little's formula says:

  N = lambda * F

where:
  N      = mean number of users in the system
  F      = mean flow time
  lambda = arrival rate of users (1/lambda is the mean time between arrivals)

(The same law is often written L = lambda * W.)

Here we are always assuming p < 1 (the queue is not infinite, the system is not overloaded, and we can eventually complete the jobs that come in). This implies that to minimize the flow time, we can do something that minimizes the number of users in the system.
So, if we knew all the run times, the easiest way to get rid of jobs (and thus users) is to complete the shortest ones first. SPT scheduling is a way to reduce N.

Advantages: Optimal (minimum average flow time) among schedulers with no preemption and no knowledge of future arrivals.

Disadvantages: Could do better if we used preemption - with SJF, new arrivals wait, even if they are short. High variance of flow time - long jobs wait.

How we can improve: use preemption.

SRPT: Shortest Remaining Processing Time - Differs from SJF because new arrivals don't always have to wait. If we're running a long job, we can let a short job go through, then go back to the long job.

Advantages: This minimizes the average flow time (response time, waiting time).

Disadvantages: Requires future knowledge; overhead due to preemptions (but only one per job); high variance of flow time - long jobs wait.

The hardest part about this is that SRPT requires knowledge of the future. Instead, we can use past performance to predict future performance. Here we can take advantage of the fact that if a process has already taken a long time, it will probably take a long time more. This suggests an algorithm called Shortest Elapsed Time (SET), which runs the job that has the least run time so far. We can do this by time slicing between jobs with equal elapsed time.

Advantages: Permits short jobs to get out fast. Variance of f(i)/s(i) is reasonably low.

Disadvantages: High overhead; very bad if the run-time distribution is not highly skewed; high variance - long jobs wait.

[Foreground/Background (FB)]
FB scheduling is similar to RR, except that FB implements 2 queues, where the foreground queue has priority over the background queue. Jobs that have run for a while can be moved between the foreground and the background.

Foreground - Usually we place I/O-bound jobs and new jobs here (remember, the less run time a job has had, the more likely it is a short job.
If a job has run for enough time, it can be placed in the background queue.)

Background - Where CPU-intensive jobs are placed.

As one can see, FB processes the short jobs first for better throughput, while ensuring the longer jobs don't face starvation. This combination should provide the user with good performance.

[Multilevel Foreground/Background (MLFB)]
Multilevel Foreground/Background (MLFB) is similar to FB scheduling except that instead of having only 2 queues of differing priority, MLFB has several. This is very similar to supermarket checkout lines with 10-, 15-, or 30-item limits that give customers who meet those restrictions higher priority than those in the regular lines. Jobs can be assigned to a level based on many criteria: run time, priority, use of I/O, etc.

[Exponential Queue - Multilevel Feedback Queue]
Taking this one step further, we can give a new process a high priority and a very short time slice. If the job hasn't completed by the end of its slice, we decrease its priority and increase the time slice it is given. The advantage of this is that it reduces switching overhead, since longer jobs probably also require more resources. We can also adjust this algorithm to avoid starvation by raising the priority of low-priority jobs according to the time a job has already waited.

FB, MLFB, and the multilevel feedback queue are all examples of adaptive scheduling algorithms - algorithms that change the priority of a job depending on its run-time behavior.
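The demote-and-double scheme above can be sketched as a small multilevel feedback queue. This is a toy model, not the lecture's exact algorithm: the number of levels, the base quantum, and the doubling rule are illustrative choices, and the anti-starvation priority boost is omitted for brevity.

```python
from collections import deque

# Sketch of an exponential (multilevel feedback) queue: new jobs start at
# the highest priority with a short slice; each time a job uses up its
# slice without finishing, it drops one level and its slice doubles.
# (Level count, base quantum, and job data are arbitrary illustrations.)

def mlfq(jobs, levels=3, base_quantum=1):
    """jobs: list of (name, run_time). Returns completion order."""
    queues = [deque() for _ in range(levels)]
    for job in jobs:
        queues[0].append(job)            # new jobs get top priority
    order = []
    while any(queues):
        level = next(i for i, q in enumerate(queues) if q)  # highest nonempty
        name, remaining = queues[level].popleft()
        quantum = base_quantum * (2 ** level)               # slice doubles per level
        remaining -= min(quantum, remaining)
        if remaining == 0:
            order.append(name)
        else:
            # demote: the job has run a while, so it is probably long
            queues[min(level + 1, levels - 1)].append((name, remaining))
    return order

print(mlfq([("compile", 20), ("ls", 1), ("edit", 3)]))
```

Short jobs finish while still at high priority; the long "compile" sinks to the bottom level, where its larger slices keep the switching overhead down.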