Agusno Lie, HW8 cs162-bw SID # 17375648
Lecture notes for Wednesday, January 26, 2005.

Announcements:
- CSUA is holding an Intermediate UNIX help session on Wednesday, January 26 in 306 Soda at 7 pm.
- The CS162 Reader will have two parts. The first part, containing the Nachos program, is already available. The second part contains the readings and will be available on Thursday, January 27. Buy the readers at Copy Central on Hearst near the North Gate.
- Please buy the textbook before it disappears from the bookstore.
- Read the handouts before asking anything about the course.
- Please sign up for lecture notes if you haven't.
- Section 103 will be moved to 5-6 pm in 320 Soda, at least for the next 2 weeks.

Lecture

Goals for scheduling disciplines (what is the thing that should be optimized?):
- Efficiency of resource utilization: keep the CPU and disks as busy as possible.
- Minimizing overhead, which means minimizing the number of context switches.
- Minimizing average response time. Response time: the time between when you ask the system to do something and when it responds back to you.
- Maximizing throughput. Throughput: jobs per second. How much a scheduling algorithm affects it will be defined more precisely later.
- Under certain circumstances only, providing a guaranteed fraction of the machine to specific users or sets of users. In some situations, certain groups may pay for a fraction of the computer system and expect to get that fraction for their own use.
- Minimizing the number of waiting processes or interactive users. (The two are not necessarily the same.)
- Getting short jobs to finish quickly.
- Avoiding starvation. Starvation occurs when a process gets very little service, waiting again and again while other processes get done. Rho is the load on the system, a property of the system. If rho > 1, the load is greater than what the system can accomplish, which means some jobs are never going to get serviced. For rho < 1, all jobs will get serviced but might take a really long time (humorously, we can call that hunger). The case rho = 1 is unstable; in practice, things are never exactly equal, so rho = 1 is not common.
- Minimizing the variance of response time. This means we want to make run time predictable. This will be discussed more later.
- Satisfying whatever external constraints we have, such as who is paying for it, or the importance of the job.
- Meeting deadlines.
- Graceful degradation under overload. An example of this is the metering lights on freeway on-ramps that control vehicle flow to avoid gridlock.
- Fairness (this was the one thing the professor was looking for at the end of Monday's (01/24) lecture). The problem with fairness is that it is a psychological concept, which the computer doesn't have. FIFO and RR are both fair, yet they have nothing to do with each other; that's why we can't optimize fairness directly.

The real goal of a scheduling algorithm is a psychological one: to make the user happy. This is the ultimate goal of any scheduling algorithm, but it isn't what we're trying to optimize (this is not a psychology class, unfortunately). What we usually do, instead of giving questionnaires to users, is apply two criteria (a small worked example follows the definitions below):
- Minimum flow time. Flow time: the time for a job to flow through the system, from when it arrives until it finishes service. We want to minimize the average flow time over all jobs (minimize the average f(i)).
- Minimize the variance of flow time over service time (minimize the variance of f(i)/s(i)).

Service time: the amount of processing that a job requires.
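Here is a minimal sketch of the two criteria (in Python; the job data below are made up, just to show the arithmetic):

    # Each job is (arrival_time, finish_time, service_time) -- hypothetical values.
    jobs = [(0, 4, 3), (1, 5, 1), (2, 12, 8)]

    flows = [finish - arrival for (arrival, finish, _) in jobs]   # f(i)
    ratios = [f / s for f, (_, _, s) in zip(flows, jobs)]         # f(i)/s(i)

    mean_flow = sum(flows) / len(flows)
    mean_ratio = sum(ratios) / len(ratios)
    var_ratio = sum((r - mean_ratio) ** 2 for r in ratios) / len(ratios)

    print("average f(i):", mean_flow)           # criterion 1: minimize this
    print("variance of f(i)/s(i):", var_ratio)  # criterion 2: minimize this

A scheduling algorithm changes the finish times (and hence the f(i)); the service times s(i) are properties of the jobs themselves.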
Generally speaking, we should try to make flow time and service time proportional: short jobs should go through quickly, and long jobs should be expected to take longer. This also reflects the psychological fact that people submitting a lot of work expect to wait a long time, and expect a short wait for a little work.

Q: Is flow time quantized?
A: Not necessarily, because service time may not be. The switching might be quantized, but that doesn't mean a job has to finish at the end of a quantum.

Typical external constraints:
- Fixed priorities:
  * Set administratively.
  * Set by purchase (e.g., toll lanes). We can pay n times the going CPU rate and expect n times the priority. For example, overnight jobs should have low priority while immediate jobs should have high priority.
- Fixed fraction of the machine. If a department has paid for 25% of a computing installation, it can expect 25% of the CPU cycles.
- Real-time constraints. Not the clock in the CPU, but the clock on the wall: deadlines in the real world. For example, shooting down a missile before it reaches the base.

Q: How do we measure?
A: We measure after the jobs are done, by dividing flow time by service time. We generally don't know these quantities in advance because they depend on the state of the system, though there will be cases later where we discuss knowing service time in advance. We're trying to minimize these functions, and flow time depends on the scheduling algorithm too.

Q: Do we need to know f(i) in advance, in real time?
A: No. We're trying to optimize these values. We don't necessarily know f(i) or s(i) at the beginning; we're trying to figure out a good scheduling algorithm that minimizes them.

The thing about scheduling is that we can analyze some simple cases. As soon as we start to work with a real computer system, there are many more aspects that can affect the I/O times. What we are discussing right now is a very simplified version of the real thing. Studying scheduling algorithms is like studying step functions: the really simple ones are fine, but the really hard ones are impossible to analyze.

Open vs. Closed Systems

Open System:
      lambda            µ
      ------> System ------>

Closed System:
       ------> System ------>
      |                      |
       <------ World <------

µ = service rate; 1/µ = mean service time
lambda = arrival rate; 1/lambda = mean interarrival time
rho = lambda / µ

In an Open System, the arrival rate is not affected by anything going on in the system. In a Closed System (which might sound abstract), the arrival rate is affected by the system. If everyone in the world is waiting for a response, then the arrival rate in a Closed System is zero. In a Closed System, the total number of customers in the system plus the external world is constant, so the more there are in the system, the fewer can be arriving.

Since we are talking about scheduling, we know scheduling affects the system. If we change the scheduling in the system, how does it affect the throughput? In an Open System it does not affect throughput at all: if rho < 1, every job that arrives will finish. Scheduling affects the queueing time and how long it takes a job to finish, but it does not affect the throughput. In a Closed System it does affect throughput, because the longer a job spends in the System, the less time its user spends in the World generating new arrivals, which lowers the arrival rate. A bad scheduling algorithm in a Closed System hurts throughput. Be careful about which case you're in. (A small sketch of this difference follows.)
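To make the distinction concrete, here is a minimal sketch with hypothetical numbers; the closed-system formula takes time_in_system to be the average time a job spends in the System, including queueing:

    def closed_system_throughput(n_users, think_time, time_in_system):
        # Each user cycles: think in the World, then wait in the System.
        # One cycle takes (think_time + time_in_system) and completes one job.
        return n_users / (think_time + time_in_system)

    def open_system_throughput(arrival_rate, service_rate):
        # With rho = lambda/mu < 1, every arriving job eventually finishes,
        # so throughput equals the arrival rate no matter how we schedule.
        rho = arrival_rate / service_rate
        return arrival_rate if rho < 1 else service_rate

    # Bad scheduling that doubles the time jobs spend in the System
    # lowers a Closed System's throughput...
    print(closed_system_throughput(10, think_time=5.0, time_in_system=1.0))  # ~1.67
    print(closed_system_throughput(10, think_time=5.0, time_in_system=2.0))  # ~1.43
    # ...but an Open System's throughput stays at lambda either way.
    print(open_system_throughput(arrival_rate=0.5, service_rate=1.0))        # 0.5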
Q: Does overhead affect throughput?
A: Not really. As long as rho < 1, it doesn't affect throughput. Overhead affects flow time, not throughput.

More about rho: rho is the offered load, and it represents how much is being demanded of the system. If it is less than one, the system will manage to finish all the jobs; if it is greater than one, more work arrives than the system is able to accomplish.

Q: Does the scheduling algorithm affect µ?
A: Assuming a system with no overhead, it doesn't. As soon as we have overhead, we lose the ability to analyze the system mathematically.

Q: An example of an Open System?
A: The DMV, though it is not 100% correct. In practice, the number of people in line is only a tiny fraction of the whole California population and does not affect the number of people arriving at the office. This assumes that no one in line stops queuing because the line isn't moving, and that no one changes their mind upon entering and seeing a long line.

Q: An example of a Closed System?
A: Sitting at our own terminal doing nothing while waiting for a response from the computer, while sharing the system with other people doing the same.

User characteristics:
- Users are willing to wait in proportion to how long a job should take. A 10,000-line program will be expected to take quite a while to compile, while something simple like 'ls' on UNIX should get an immediate response.
- Users like predictability.
- A user waiting at a terminal will get impatient, so there is a breaking point in how long a user will wait.

What is the simplest scheduling algorithm? FIFO (First In First Out), also called FCFS (First Come First Served). Jobs get in line in the order they arrive and are processed in that order. In the simplest case this is uniprogramming.
- Advantages: fair and simple, very low overhead (no task switching), and the lowest variance of overall flow time f(i).
- Disadvantages: one process can monopolize the CPU; short jobs wait for long jobs; there is very little resource sharing; and it gives a high variance of flow time over service time f(i)/s(i) (no matter how big the job is, we get to it at the same time).

Are there really such things as long jobs and short jobs? The question might sound ridiculous, but job run times are in fact highly skewed.

(Figure: run time distribution and expected remaining time curves; see http://www.geocities.com/agusnolie/cs162.jpg)

Based on the left curve, there are lots of short jobs, few long jobs, and even fewer really long jobs. In particular, the expected time to completion increases with the time the job has already run. If a job has already been running a long time, we should expect it to run even longer before finishing. The catch is that we don't know when a job will finish; we only know how long it has run so far. Eventually every job finishes, but the longer a job has already run, the longer we expect it to keep running.

Formula for E(t), the expected remaining time to completion given that the job has already run for time t:

    E(t) = [integral from t to infinity of (x - t) r(x) dx] / (1 - R(t))

where r(x) is the run time distribution (density) and R(x) is the cumulative run time distribution. (A numerical sketch of this formula follows the list below.)

These curves are the basis of what determines a good scheduling algorithm:
- The expected time a job has remaining to run is an increasing function of the time it has already run.
- The expected time to the next I/O is an increasing function of the time since the last I/O.
- The expected time to a page fault is an increasing function of the time since the last page fault.
- The expected time to a trap or interrupt is an increasing function of the time since the last trap or interrupt.
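As a numerical sketch of the E(t) formula (the two-phase hyperexponential below is just one convenient skewed distribution, not from the lecture):

    import math

    # Skewed run time density: many short jobs, a few long ones.
    # r(x) = p*a*exp(-a*x) + (1-p)*b*exp(-b*x), hypothetical parameters.
    p, a, b = 0.9, 1.0, 0.05   # 90% short jobs (mean 1), 10% long jobs (mean 20)

    def r(x):
        return p * a * math.exp(-a * x) + (1 - p) * b * math.exp(-b * x)

    def R(x):   # cumulative run time distribution
        return p * (1 - math.exp(-a * x)) + (1 - p) * (1 - math.exp(-b * x))

    def expected_remaining(t, upper=400.0, dx=0.01):
        # E(t) = [integral_t^infinity (x - t) r(x) dx] / (1 - R(t)),
        # integrated numerically with a simple Riemann sum.
        total, x = 0.0, t
        while x < upper:
            total += (x - t) * r(x) * dx
            x += dx
        return total / (1 - R(t))

    for t in (0.0, 1.0, 5.0, 20.0):
        print(t, round(expected_remaining(t), 2))
    # The expected remaining time climbs from about 2.9 toward 20 as the
    # elapsed time t grows -- exactly the increasing curve described above.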
We don't know how long a job will take to eventually finish, so the idea is to limit how long we process a job and then switch to another job. This limit is called a time slice. In the Round Robin scheduling algorithm, we run a process for one time slice and then move on to the next process waiting in the queue. Most systems use some variant of Round Robin.

Problems with a badly chosen time slice:
- Too long: the algorithm degenerates into FCFS, because one really long job can monopolize the CPU.
- Too short: the overhead kills you, because there is too much switching.

Properties of Round Robin:
- Advantages: short jobs finish quickly because they get a share of the CPU right away, while long jobs stay a while. In a sense it's "fair". It's reasonably simple, the variance of f(i)/s(i) is low, and there's no starvation.
- Disadvantages: relatively high overhead due to the switching, and it's not as good as some of the other algorithms we will see.

Originally, UNIX had 1-second time slices, which is too long. Most timesharing systems today use time slices of around .1 to .5 seconds.

Round Robin works well because run times are highly skewed; if they were not skewed, Round Robin wouldn't work well. Example: 10 jobs of 100 time units each, with a time slice of 10. The average flow time under Round Robin would be about 950, but under FCFS it would be around 550. In this case, FCFS has the higher variance of both f(i) and f(i)/s(i). (A simulation of this example appears at the end of this part.)

Round Robin can also implement priorities, by giving processes of different priorities different time slice lengths.

Q: Is there any way to predict how long a job will take?
A: It depends on how much information you have. We're not doing database problems; in database queries there is very low variation among jobs, so prediction is easier. That doesn't apply here, because we would have to know exact information about the job. Typically on supercomputers the information is quite detailed.

Q: Are there computers that can switch scheduling algorithms at any time?
A: Not to the professor's knowledge. Usually a computer has a piece of code implementing a complicated variant of the algorithms we're covering in this class. It can't be as simple as what we cover here, because many more factors affect scheduling in real life. Remember, we're only talking about simplified systems.

Q: In Round Robin, does the length of the time slice depend on how many jobs are in the queue?
A: For the versions we're discussing right now, it's fixed. The time slice could be based on many factors, such as queue length, job size, or deadlines. Any of these complexities can be added, but basic Round Robin uses a fixed time slice.

Q: How should we choose the time slice in the project?
A: Pick a quantum such that really short jobs get through in one cycle.

Remember, we're trying to minimize flow time, and this is basically the same thing as minimizing the average number of users in the system. The reason is a formula called Little's Formula, proven by J.D.C. Little:

    N = lambda * F   (or L = lambda * W),   assuming rho < 1

If rho is not less than one, the queue length goes to infinity and the system is overloaded.

    N (or L) = average number of users
    F = average flow time
    W = average waiting time, i.e., flow time excluding service time
    lambda = arrival rate

(With F, one counts the users in the whole system; with W, the users waiting in the queue.)

What does this suggest as an approach to scheduling? If we want to minimize flow time, we want to minimize the number of people waiting, and the way to do that is to quickly finish the jobs that can end quickly.
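Here is the promised sketch of the 10-job example (Python, assuming zero context-switch overhead; the simulator itself is illustrative, not from the lecture):

    def fcfs_flows(service_times):
        # All jobs arrive at time 0 and run to completion in arrival order.
        flows, clock = [], 0
        for s in service_times:
            clock += s
            flows.append(clock)
        return flows

    def rr_flows(service_times, quantum):
        # All jobs arrive at time 0; run each for one quantum, cycling.
        remaining = list(service_times)
        flows = [0] * len(remaining)
        clock, done = 0, 0
        while done < len(remaining):
            for i, rem in enumerate(remaining):
                if rem <= 0:
                    continue
                run = min(quantum, rem)
                clock += run
                remaining[i] = rem - run
                if remaining[i] == 0:
                    flows[i] = clock
                    done += 1
        return flows

    jobs = [100] * 10
    print(sum(fcfs_flows(jobs)) / len(jobs))    # 550.0
    print(sum(rr_flows(jobs, 10)) / len(jobs))  # 955.0 (the "about 950" above)

With identical (unskewed) jobs, Round Robin makes everyone finish late, which is why the skew of real run time distributions matters so much to its performance.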
Shortest Job First (SJF) is basically FCFS with a sorted queue: the job with the shortest service time goes to the front. This is the fastest way to get users out of the system and reduce N.
- Advantages: optimal, in the sense of minimizing average flow time, under certain circumstances, specifically no preemption and no knowledge of future arrivals.
- Disadvantages: requires knowledge of the future. We could do much better if we used preemption, because when a very long job is being processed, an arriving short job still has to wait for it to finish. There is also a high variance of flow time, because the long jobs wait.

We can do better with Shortest Remaining Processing Time (SRPT), which uses preemption: a long job can be interrupted to process a shorter job first, to optimize the system.
- Advantages: this minimizes the average flow time.
- Disadvantages: it also requires future knowledge, and there is overhead due to task switching (though the number of preemptions will be no more than the number of jobs). There is still a high variance of flow time, because the long jobs still wait.

Q: What future knowledge do we need?
A: We would like to know how long a job will take to run.

Q: Can we use the graph of expected time vs. elapsed time to predict?
A: It's only an approximation; we still don't know how long the job will actually take.

Q: What is preemption?
A: In general, preemption is when we stop a running process because there is another job we want to process instead.

Because a job with a long elapsed time will probably take even longer to finish, this suggests the Shortest Elapsed Time (SET) algorithm: run the job with the shortest elapsed time so far. It also uses a quantum, so that there is a minimum time a job runs before any switch occurs. It's rather "similar" to Round Robin, with changes to how the switching is decided.
- Advantages: lets short jobs get out rather fast, and the variance of f(i)/s(i) is reasonably low.
- Disadvantages: high overhead if we're not careful about limiting how often we switch; if the run time distribution is not skewed, it performs very badly; and a high variance of f(i). (A small SRPT sketch follows.)
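To make the preemption idea concrete, here is a minimal SRPT sketch (a unit-time simulation with made-up jobs; note that it assumes the service times are known in advance, which is exactly the future knowledge discussed above):

    # Jobs are (arrival_time, service_time) -- hypothetical values.
    jobs = [(0, 7), (2, 4), (4, 1), (5, 4)]

    remaining = [s for (_, s) in jobs]
    finish = [None] * len(jobs)
    clock = 0
    while any(rem > 0 for rem in remaining):
        # Candidates: jobs that have arrived and are unfinished.
        ready = [i for i, (arrive, _) in enumerate(jobs)
                 if arrive <= clock and remaining[i] > 0]
        if not ready:
            clock += 1                                 # idle until an arrival
            continue
        i = min(ready, key=lambda j: remaining[j])     # shortest remaining first
        remaining[i] -= 1                              # run it for one time unit
        clock += 1
        if remaining[i] == 0:
            finish[i] = clock

    flows = [finish[i] - arrive for i, (arrive, _) in enumerate(jobs)]
    print("flow times:", flows)                        # [16, 5, 1, 6]
    print("average:", sum(flows) / len(flows))         # 7.0

For comparison, FCFS on the same four jobs gives an average flow time of 8.75. Notice how SRPT buys the short jobs' quick exits at the expense of the long job's flow time (16), which is precisely the high-variance disadvantage noted above.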