Agusno Lie, HW8 cs162-bw SID # 17375648
Lecture notes for Wednesday, January 26, 2005.

Announcements:
- CSUA is holding an Intermediate UNIX help session on Wednesday, January 26 in 306 Soda at 7 pm.
- The CS162 Reader will have two parts. The first part, containing the Nachos program, is already available. The second part contains the readings and will be available on Thursday, January 27. Buy the readers at Copy Central on Hearst near the North Gate.
- Please buy the textbook before it disappears from the bookstore.
- Read the handouts before asking anything about the course.
- Please sign up for lecture notes if you haven't.
- Section 103 will be moved to 5-6 pm in 320 Soda, at least for the next 2 weeks.

Lecture

Goals for scheduling disciplines (what is the thing that should be optimized?):
- Efficiency of resource utilization: keep the CPU and disks as busy as possible.
- Minimizing overhead, which means minimizing the number of context switches.
- Minimizing average response time. Response time: the time between when you ask the system to do something and when it responds back to you.
- Maximizing throughput. Throughput: jobs per second. How much a scheduling algorithm affects it will be defined more precisely later.
- Under certain circumstances only, providing a guaranteed fraction of the machine to specific users or sets of users. In some situations, certain groups may pay for a fraction of the computer system and expect to get that fraction for their own use.
- Minimizing the number of waiting processes or interactive users. (The two are not necessarily the same.)
- Getting short jobs to finish quickly.
- Avoiding starvation. Starvation occurs when a process gets very little service, waiting again and again while other processes get done. Rho is the load on the system, a property of the system. If rho > 1, the load is greater than what the system can accomplish, which means some jobs are never going to get serviced. For rho < 1, all jobs will get serviced but might take a really long time (humorously, we can call that hunger). The case rho = 1 is unstable; in practice, things are never exactly equal, so rho = 1 is not common.
- Minimizing the variance of response time. This means we want to make run time predictable. This will be discussed more later.
- Satisfying whatever external constraints we have, such as who is paying for it, or the importance of the job.
- Meeting deadlines.
- Graceful degradation under overload. An example of this is the metering lights on freeway on-ramps that control vehicle flow to avoid gridlock.
- Fairness (this was the one thing the professor was looking for at the end of Monday's (01/24) lecture). The problem with fairness is that it is a psychological concept, which the computer doesn't have. FIFO and RR are both fair, yet they have nothing to do with each other; that's why we can't optimize fairness directly.

The real goal of a scheduling algorithm is a psychological one: to make the user happy. This is the ultimate goal of any scheduling algorithm, but it isn't what we're trying to optimize (this is not a psychology class, unfortunately). What we usually do, instead of giving questionnaires to users, is apply two criteria (a small worked example follows the definitions below):
- Minimum flow time. Flow time: the time for a job to flow through the system, from when it arrives until it finishes service. We want to minimize the average flow time over all jobs (minimize the average f(i)).
- Minimize the variance of flow time over service time (minimize the variance of f(i)/s(i)).

Service time: the amount of processing that a job requires.
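Here is a minimal sketch of the two criteria (in Python; the job data below are made up, just to show the arithmetic):

    # Each job is (arrival_time, finish_time, service_time) -- hypothetical values.
    jobs = [(0, 4, 3), (1, 5, 1), (2, 12, 8)]

    flows = [finish - arrival for (arrival, finish, _) in jobs]   # f(i)
    ratios = [f / s for f, (_, _, s) in zip(flows, jobs)]         # f(i)/s(i)

    mean_flow = sum(flows) / len(flows)
    mean_ratio = sum(ratios) / len(ratios)
    var_ratio = sum((r - mean_ratio) ** 2 for r in ratios) / len(ratios)

    print("average f(i):", mean_flow)           # criterion 1: minimize this
    print("variance of f(i)/s(i):", var_ratio)  # criterion 2: minimize this

A scheduling algorithm changes the finish times (and hence the f(i)); the service times s(i) are properties of the jobs themselves.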
Generally speaking, we should try to make flow time and service time proportional: short jobs should go through quickly, and long jobs should be expected to take longer. This also reflects the psychological fact that people submitting a lot of work expect to wait a long time, and expect a short wait for a little work.

Q: Is flow time quantized?
A: Not necessarily, because service time may not be. The switching might be quantized, but that doesn't mean a job has to finish at the end of a quantum.

Typical external constraints:
- Fixed priorities:
  * Set administratively.
  * Set by purchase (e.g., toll lanes). We can pay n times the going CPU rate and expect n times the priority. For example, overnight jobs should have low priority while immediate jobs should have high priority.
- Fixed fraction of the machine. If a department has paid for 25% of a computing installation, it can expect 25% of the CPU cycles.
- Real-time constraints. Not the clock in the CPU, but the clock on the wall: deadlines in the real world. For example, shooting down a missile before it reaches the base.

Q: How do we measure?
A: We measure after the jobs are done, by dividing flow time by service time. We generally don't know these quantities in advance because they depend on the state of the system, though there will be cases later where we discuss knowing service time in advance. We're trying to minimize these functions, and flow time depends on the scheduling algorithm too.

Q: Do we need to know f(i) in advance, in real time?
A: No. We're trying to optimize these values. We don't necessarily know f(i) or s(i) at the beginning; we're trying to figure out a good scheduling algorithm that minimizes them.

The thing about scheduling is that we can analyze some simple cases. As soon as we start to work with a real computer system, there are many more aspects that can affect the I/O times. What we are discussing right now is a very simplified version of the real thing. Studying scheduling algorithms is like studying step functions: the really simple ones are fine, but the really hard ones are impossible to analyze.

Open vs. Closed Systems

Open System:
      lambda            µ
      ------> System ------>

Closed System:
       ------> System ------>
      |                      |
       <------ World <------

µ = service rate; 1/µ = mean service time
lambda = arrival rate; 1/lambda = mean interarrival time
rho = lambda / µ

In an Open System, the arrival rate is not affected by anything going on in the system. In a Closed System (which might sound abstract), the arrival rate is affected by the system. If everyone in the world is waiting for a response, then the arrival rate in a Closed System is zero. In a Closed System, the total number of customers in the system plus the external world is constant, so the more there are in the system, the fewer can be arriving.

Since we are talking about scheduling, we know scheduling affects the system. If we change the scheduling in the system, how does it affect the throughput? In an Open System it does not affect throughput at all: if rho < 1, every job that arrives will finish. Scheduling affects the queueing time and how long it takes a job to finish, but it does not affect the throughput. In a Closed System it does affect throughput, because the longer a job spends in the System, the less time its user spends in the World generating new arrivals, which lowers the arrival rate. A bad scheduling algorithm in a Closed System hurts throughput. Be careful about which case you're in. (A small sketch of this difference follows.)
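To make the distinction concrete, here is a minimal sketch with hypothetical numbers; the closed-system formula takes time_in_system to be the average time a job spends in the System, including queueing:

    def closed_system_throughput(n_users, think_time, time_in_system):
        # Each user cycles: think in the World, then wait in the System.
        # One cycle takes (think_time + time_in_system) and completes one job.
        return n_users / (think_time + time_in_system)

    def open_system_throughput(arrival_rate, service_rate):
        # With rho = lambda/mu < 1, every arriving job eventually finishes,
        # so throughput equals the arrival rate no matter how we schedule.
        rho = arrival_rate / service_rate
        return arrival_rate if rho < 1 else service_rate

    # Bad scheduling that doubles the time jobs spend in the System
    # lowers a Closed System's throughput...
    print(closed_system_throughput(10, think_time=5.0, time_in_system=1.0))  # ~1.67
    print(closed_system_throughput(10, think_time=5.0, time_in_system=2.0))  # ~1.43
    # ...but an Open System's throughput stays at lambda either way.
    print(open_system_throughput(arrival_rate=0.5, service_rate=1.0))        # 0.5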
Q: Does overhead affect throughput?
A: Not really. As long as rho < 1, it doesn't affect throughput. Overhead affects flow time, not throughput.

More about rho: rho is the offered load, and it represents how much is being demanded of the system. If it is less than one, the system will manage to finish all the jobs; if it is greater than one, more work arrives than the system is able to accomplish.

Q: Does the scheduling algorithm affect µ?
A: Assuming a system with no overhead, it doesn't. As soon as we have overhead, we lose the ability to analyze the system mathematically.

Q: An example of an Open System?
A: The DMV, though it is not 100% correct. In practice, the number of people in line is only a tiny fraction of the whole California population and does not affect the number of people arriving at the office. This assumes that no one in line stops queuing because the line isn't moving, and that no one changes their mind upon entering and seeing a long line.

Q: An example of a Closed System?
A: Sitting at our own terminal doing nothing while waiting for a response from the computer, while sharing the system with other people doing the same.

User characteristics:
- Users are willing to wait in proportion to how long a job should take. A 10,000-line program will be expected to take quite a while to compile, while something simple like 'ls' on UNIX should get an immediate response.
- Users like predictability.
- A user waiting at a terminal will get impatient, so there is a breaking point in how long a user will wait.

What is the simplest scheduling algorithm? FIFO (First In First Out), also called FCFS (First Come First Served). Jobs get in line in the order they arrive and are processed in that order. In the simplest case this is uniprogramming.
- Advantages: fair and simple, very low overhead (no task switching), and the lowest variance of overall flow time f(i).
- Disadvantages: one process can monopolize the CPU; short jobs wait for long jobs; there is very little resource sharing; and it gives a high variance of flow time over service time f(i)/s(i) (no matter how big the job is, we get to it at the same time).

Are there really such things as long jobs and short jobs? The question might sound ridiculous, but job run times are in fact highly skewed.

(Figure: run time distribution and expected remaining time curves; see http://www.geocities.com/agusnolie/cs162.jpg)

Based on the left curve, there are lots of short jobs, few long jobs, and even fewer really long jobs. In particular, the expected time to completion increases with the time the job has already run. If a job has already been running a long time, we should expect it to run even longer before finishing. The catch is that we don't know when a job will finish; we only know how long it has run so far. Eventually every job finishes, but the longer a job has already run, the longer we expect it to keep running.

Formula for E(t), the expected remaining time to completion given that the job has already run for time t:

    E(t) = [integral from t to infinity of (x - t) r(x) dx] / (1 - R(t))

where r(x) is the run time distribution (density) and R(x) is the cumulative run time distribution. (A numerical sketch of this formula follows the list below.)

These curves are the basis of what determines a good scheduling algorithm:
- The expected time a job has remaining to run is an increasing function of the time it has already run.
- The expected time to the next I/O is an increasing function of the time since the last I/O.
- The expected time to a page fault is an increasing function of the time since the last page fault.
- The expected time to a trap or interrupt is an increasing function of the time since the last trap or interrupt.
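As a numerical sketch of the E(t) formula (the two-phase hyperexponential below is just one convenient skewed distribution, not from the lecture):

    import math

    # Skewed run time density: many short jobs, a few long ones.
    # r(x) = p*a*exp(-a*x) + (1-p)*b*exp(-b*x), hypothetical parameters.
    p, a, b = 0.9, 1.0, 0.05   # 90% short jobs (mean 1), 10% long jobs (mean 20)

    def r(x):
        return p * a * math.exp(-a * x) + (1 - p) * b * math.exp(-b * x)

    def R(x):   # cumulative run time distribution
        return p * (1 - math.exp(-a * x)) + (1 - p) * (1 - math.exp(-b * x))

    def expected_remaining(t, upper=400.0, dx=0.01):
        # E(t) = [integral_t^infinity (x - t) r(x) dx] / (1 - R(t)),
        # integrated numerically with a simple Riemann sum.
        total, x = 0.0, t
        while x < upper:
            total += (x - t) * r(x) * dx
            x += dx
        return total / (1 - R(t))

    for t in (0.0, 1.0, 5.0, 20.0):
        print(t, round(expected_remaining(t), 2))
    # The expected remaining time climbs from about 2.9 toward 20 as the
    # elapsed time t grows -- exactly the increasing curve described above.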
We don't know how long a job will take to eventually finish, so the idea is to limit how long we process a job and then switch to another job. This limit is called a time slice. In the Round Robin scheduling algorithm, we run a process for one time slice and then move on to the next process waiting in the queue. Most systems use some variant of Round Robin.

Problems with a badly chosen time slice:
- Too long: the algorithm degenerates into FCFS, because one really long job can monopolize the CPU.
- Too short: the overhead kills you, because there is too much switching.

Properties of Round Robin:
- Advantages: short jobs finish quickly because they get a share of the CPU right away, while long jobs stay a while. In a sense it's "fair". It's reasonably simple, the variance of f(i)/s(i) is low, and there's no starvation.
- Disadvantages: relatively high overhead due to the switching, and it's not as good as some of the other algorithms we will see.

Originally, UNIX had 1-second time slices, which is too long. Most timesharing systems today use time slices of around .1 to .5 seconds.

Round Robin works well because run times are highly skewed; if they were not skewed, Round Robin wouldn't work well. Example: 10 jobs of 100 time units each, with a time slice of 10. The average flow time under Round Robin would be about 950, but under FCFS it would be around 550. In this case, FCFS has the higher variance of both f(i) and f(i)/s(i). (A simulation of this example appears at the end of this part.)

Round Robin can also implement priorities, by giving processes of different priorities different time slice lengths.

Q: Is there any way to predict how long a job will take?
A: It depends on how much information you have. We're not doing database problems; in database queries there is very low variation among jobs, so prediction is easier. That doesn't apply here, because we would have to know exact information about the job. Typically on supercomputers the information is quite detailed.

Q: Are there computers that can switch scheduling algorithms at any time?
A: Not to the professor's knowledge. Usually a computer has a piece of code implementing a complicated variant of the algorithms we're covering in this class. It can't be as simple as what we cover here, because many more factors affect scheduling in real life. Remember, we're only talking about simplified systems.

Q: In Round Robin, does the length of the time slice depend on how many jobs are in the queue?
A: For the versions we're discussing right now, it's fixed. The time slice could be based on many factors, such as queue length, job size, or deadlines. Any of these complexities can be added, but basic Round Robin uses a fixed time slice.

Q: How should we choose the time slice in the project?
A: Pick a quantum such that really short jobs get through in one cycle.

Remember, we're trying to minimize flow time, and this is basically the same thing as minimizing the average number of users in the system. The reason is a formula called Little's Formula, proven by J.D.C. Little:

    N = lambda * F   (or L = lambda * W),   assuming rho < 1

If rho is not less than one, the queue length goes to infinity and the system is overloaded.

    N (or L) = average number of users
    F = average flow time
    W = average waiting time, i.e., flow time excluding service time
    lambda = arrival rate

(With F, one counts the users in the whole system; with W, the users waiting in the queue.)

What does this suggest as an approach to scheduling? If we want to minimize flow time, we want to minimize the number of people waiting, and the way to do that is to quickly finish the jobs that can end quickly.
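Here is the promised sketch of the 10-job example (Python, assuming zero context-switch overhead; the simulator itself is illustrative, not from the lecture):

    def fcfs_flows(service_times):
        # All jobs arrive at time 0 and run to completion in arrival order.
        flows, clock = [], 0
        for s in service_times:
            clock += s
            flows.append(clock)
        return flows

    def rr_flows(service_times, quantum):
        # All jobs arrive at time 0; run each for one quantum, cycling.
        remaining = list(service_times)
        flows = [0] * len(remaining)
        clock, done = 0, 0
        while done < len(remaining):
            for i, rem in enumerate(remaining):
                if rem <= 0:
                    continue
                run = min(quantum, rem)
                clock += run
                remaining[i] = rem - run
                if remaining[i] == 0:
                    flows[i] = clock
                    done += 1
        return flows

    jobs = [100] * 10
    print(sum(fcfs_flows(jobs)) / len(jobs))    # 550.0
    print(sum(rr_flows(jobs, 10)) / len(jobs))  # 955.0 (the "about 950" above)

With identical (unskewed) jobs, Round Robin makes everyone finish late, which is why the skew of real run time distributions matters so much to its performance.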
Shortest Job First (SJF) is basically FCFS with a sorted queue: the job with the shortest service time goes to the front. This is the fastest way to get users out of the system and reduce N.
- Advantages: optimal, in the sense of minimizing average flow time, under certain circumstances, specifically no preemption and no knowledge of future arrivals.
- Disadvantages: requires knowledge of the future. We could do much better if we used preemption, because when a very long job is being processed, an arriving short job still has to wait for it to finish. There is also a high variance of flow time, because the long jobs wait.

We can do better with Shortest Remaining Processing Time (SRPT), which uses preemption: a long job can be interrupted to process a shorter job first, to optimize the system.
- Advantages: this minimizes the average flow time.
- Disadvantages: it also requires future knowledge, and there is overhead due to task switching (though the number of preemptions will be no more than the number of jobs). There is still a high variance of flow time, because the long jobs still wait.

Q: What future knowledge do we need?
A: We would like to know how long a job will take to run.

Q: Can we use the graph of expected time vs. elapsed time to predict?
A: It's only an approximation; we still don't know how long the job will actually take.

Q: What is preemption?
A: In general, preemption is when we stop a running process because there is another job we want to process instead.

Because a job with a long elapsed time will probably take even longer to finish, this suggests the Shortest Elapsed Time (SET) algorithm: run the job with the shortest elapsed time so far. It also uses a quantum, so that there is a minimum time a job runs before any switch occurs. It's rather "similar" to Round Robin, with changes to how the switching is decided.
- Advantages: lets short jobs get out rather fast, and the variance of f(i)/s(i) is reasonably low.
- Disadvantages: high overhead if we're not careful about limiting how often we switch; if the run time distribution is not skewed, it performs very badly; and a high variance of f(i). (A small SRPT sketch follows.)
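To make the preemption idea concrete, here is a minimal SRPT sketch (a unit-time simulation with made-up jobs; note that it assumes the service times are known in advance, which is exactly the future knowledge discussed above):

    # Jobs are (arrival_time, service_time) -- hypothetical values.
    jobs = [(0, 7), (2, 4), (4, 1), (5, 4)]

    remaining = [s for (_, s) in jobs]
    finish = [None] * len(jobs)
    clock = 0
    while any(rem > 0 for rem in remaining):
        # Candidates: jobs that have arrived and are unfinished.
        ready = [i for i, (arrive, _) in enumerate(jobs)
                 if arrive <= clock and remaining[i] > 0]
        if not ready:
            clock += 1                                 # idle until an arrival
            continue
        i = min(ready, key=lambda j: remaining[j])     # shortest remaining first
        remaining[i] -= 1                              # run it for one time unit
        clock += 1
        if remaining[i] == 0:
            finish[i] = clock

    flows = [finish[i] - arrive for i, (arrive, _) in enumerate(jobs)]
    print("flow times:", flows)                        # [16, 5, 1, 6]
    print("average:", sum(flows) / len(flows))         # 7.0

For comparison, FCFS on the same four jobs gives an average flow time of 8.75. Notice how SRPT buys the short jobs' quick exits at the expense of the long job's flow time (16), which is precisely the high-variance disadvantage noted above.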