Heng Woon Ong                                        Lecture 03/02/05

PAGING ALGORITHMS

How to evaluate paging algorithms:
- Two approaches (metrics) for evaluating paging algorithms:
  ::- Curve of page faults vs. amount of space used (called the
      "parachor curve"). Fewer faults at a given amount of space is
      preferable.

        faults |
               |\
               | \
               |  \
               |   \_
               |     \____
               |          \_______________
               +--------------------------- space

  ::- Space-time product (STP) vs. amount of space. Want to minimize
      the STP.

        STP    |
               |\                      _/
               | \                   _/
               |  \_              __/
               |    \__        __/
               |       \______/
               +--------------------------- space

  ::- Space-time product (STP): the integral of the amount of space
      used by the program over the time it runs, including the time
      spent on page faults. This is the real space-time product.
  ::- Exact formula: STP = integral(0,E) m(t) dt, where E is the
      ending time of the program and m(t) is the memory used by the
      program at (real) time t.
  ::- In discrete time: STP = sum(i=0 to R) m(i) * (1 + f(i)*PFT), where
      - R is the ending time of the program in discrete time (i.e. the
        number of memory references),
      - i is the i'th memory reference,
      - m(i) is the number of pages in memory at the i'th reference,
      - f(i) is an indicator function: f(i) = 0 if there is no page
        fault at reference i, f(i) = 1 if there is,
      - PFT is the page fault time (in units of memory references).
  ::- Disadvantage of this formulation: PFT is variable, since it is
      technologically dependent.
  ::- Figuring out page faults for the 3 different algorithms, shown
      here with 4 page frames (* denotes a page fault):

      Memory references:  4  3  2  1  4  3  5  4  3  2  1  5

      LRU (columns show the LRU stack, most recently used on top):
         *4 *3 *2 *1  4  3 *5  4  3 *2 *1 *5
             4  3  2  1  4  3  5  4  3  2  1
                4  3  2  1  4  3  5  4  3  2
                   4  3  2  1  1  1  5  4  3

      FIFO (rows are page frames):
         *4  4  4  4  4  4 *5  5  5  5 *1  1
            *3  3  3  3  3  3 *4  4  4  4 *5
               *2  2  2  2  2  2 *3  3  3  3
                  *1  1  1  1  1  1 *2  2  2

      OPT (with a fixed number of frames, OPT is the same as MIN):
      remove the page that will not be referenced for the longest
      time in the future.
         *4  4  4  4  4  4  4  4  4  4 *1  1
            *3  3  3  3  3  3  3  3  3  3  3
               *2  2  2  2  2  2  2  2  2  2
                  *1  1  1 *5  5  5  5  5  5

      Number of page faults per algorithm:

          Memory size |  3 |  4 |
          ------------+----+----+
          LRU         | 10 |  8 |
          FIFO        |  9 | 10 |
          OPT         |  7 |  6 |

      Note that FIFO has more faults with 4 frames than with 3; this
      increase of faults with memory size is Belady's anomaly.

Q: Will the LRU scheme ever have fewer page faults with less memory?
A: No.
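The fault counts in the table above can be checked with a short simulation. This is a sketch, not part of the lecture; `count_faults` is an illustrative name, and the OPT branch cheats by looking at the full future reference string (which is why OPT is an offline baseline, not a realizable policy).

```python
def count_faults(refs, frames, policy):
    """Simulate a replacement policy; return the number of page faults.

    policy is 'LRU', 'FIFO', or 'OPT'. For LRU and FIFO the resident
    list is kept in eviction order, so the victim is always index 0.
    """
    memory = []
    faults = 0
    for i, page in enumerate(refs):
        if page in memory:
            if policy == 'LRU':              # refresh recency on a hit
                memory.remove(page)
                memory.append(page)
            continue
        faults += 1
        if len(memory) == frames:            # memory full: pick a victim
            if policy in ('LRU', 'FIFO'):
                memory.pop(0)                # oldest (by recency or arrival)
            else:                            # OPT: evict the page whose next
                future = refs[i + 1:]        # use is farthest away (or never)
                victim = max(memory,
                             key=lambda p: future.index(p)
                             if p in future else len(future) + 1)
                memory.remove(victim)
        memory.append(page)
    return faults

refs = [4, 3, 2, 1, 4, 3, 5, 4, 3, 2, 1, 5]
for policy in ('LRU', 'FIFO', 'OPT'):
    print(policy, count_faults(refs, 3, policy), count_faults(refs, 4, policy))
```

Running this on the lecture's reference string reproduces the table, including FIFO's jump from 9 faults to 10 when the memory grows from 3 to 4 frames.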
- Stack algorithm: an algorithm which obeys the inclusion property:
  the set of pages in a memory of size N at time t is always a subset
  of the set of pages in a memory of size N+1 at time t. Such an
  algorithm obviously cannot have its miss ratio increase with memory
  size.
- Implementing LRU: need some form of hardware support in order to
  keep track of which pages have been used recently.
  ::- Perfect LRU? Keep a register for each page, and store the
      system clock into that register on each memory reference. To
      replace a page, scan through all of the registers to find the
      one with the oldest clock value.
  ::- Or could use a linked list to maintain an "LRU stack". Note
      that we can see (by inspection) that with LRU, the miss ratio
      will never increase with an increasing number of pages in
      memory.
  ::- Perfect LRU is painful to implement, so most systems use an
      approximation of LRU.
- Use bit (reference bit): a bit in the page table entry (usually
  cached in the TLB) that is set by hardware when the page is
  referenced. It is turned off under OS control.
- Clock algorithm: keep a "use" bit for each page frame; hardware
  sets the bit for the referenced page on every memory reference.
  Have a pointer pointing to the k'th page frame. When a fault
  occurs, look at the use bit of the page being pointed to. If it is
  on, turn it off, increment the pointer, and repeat. If it is off,
  replace the page in that page frame and set use(k)=1.
  (Clock diagram: page frames arranged in a circle with their use
  bits; the hand sweeps past frames whose use bit is on, clearing
  each one, and stops at the first frame whose use bit is off. A page
  survives its first sweep after a reference and is replaced on the
  second sweep if it has not been referenced again.)
  ::- Also called FINUFO: first in, not used, first out.
  ::- Suppose we use LFU (least frequently used) instead. Will that
      be good or bad?
      - BAD! Locality changes, so LFU will be disastrous: pages that
        were heavily used in an old locality keep high counts and
        stay resident, while pages of the new locality keep getting
        evicted.
      - Imagine a buffer pool of memory, but the memory is paged.
- Per-process replacement: a per-process replacement algorithm (also
  called a local page replacement algorithm, or per-job replacement
  algorithm) allocates page frames to individual processes: a page
  fault in one process can only replace one of that process' frames.
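The clock algorithm described above can be sketched as follows. This is an illustrative simulation, not from the lecture: the `Clock` class and its method names are mine, and the use bits that hardware would set on every reference are set here inside `reference()`.

```python
class Clock:
    """Sketch of the clock (FINUFO) replacement algorithm."""

    def __init__(self, nframes):
        self.frames = [None] * nframes   # page resident in each frame
        self.use = [False] * nframes     # per-frame use (reference) bit
        self.hand = 0                    # the clock pointer

    def reference(self, page):
        """Reference a page; return True if this reference faults."""
        if page in self.frames:
            # Hit: hardware would set the use bit for this frame.
            self.use[self.frames.index(page)] = True
            return False
        # Fault: sweep the hand, clearing use bits that are on,
        # until a frame with its use bit off is found.
        while self.use[self.hand]:
            self.use[self.hand] = False
            self.hand = (self.hand + 1) % len(self.frames)
        self.frames[self.hand] = page    # replace the victim
        self.use[self.hand] = True       # set use(k)=1 for the new page
        self.hand = (self.hand + 1) % len(self.frames)
        return True

clock = Clock(3)
faults = sum(clock.reference(p) for p in [4, 3, 2, 1, 4, 3, 5, 4, 3, 2, 1, 5])
print(faults)
```

On the lecture's reference string with 3 frames this approximation happens to fault slightly less often than LRU does, which illustrates that clock tracks recent use only coarsely, via one bit per frame rather than a full ordering.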
  Per-process (local) replacement relieves interference from other
  processes.
- If all pages from all processes are lumped together by the
  replacement algorithm, it is said to be a global replacement
  algorithm. Under this scheme, each process competes with all of the
  other processes for page frames.
  ::- Local algorithm:
      - Protects jobs from others which are badly behaved.
      - Hard to decide how much space to allocate to each process.
      - Allocation may be unreasonable.
  ::- Global algorithm:
      - Permits the memory allocation for a process to shift over time.
      - Permits the memory allocation to adapt to process needs.
      - Permits a badly behaved process to grab too much memory.
Q: What is used nowadays?
A: Usually, these days, a global algorithm is used.
- Thrashing: a situation in which the page fault rate is so high that
  the system spends most of its time either processing a page fault
  or waiting for a page to arrive.
  ::- It means there is too much page-fetch idle time: the processor
      sits idle waiting for pages to arrive.
  ::- Suppose there are many users, and that between them their
      processes are making frequent references to 50 pages, but
      memory has only 40 page frames.
  ::- Each time one page is brought in, another page, whose contents
      will soon be referenced, is thrown out.
  ::- The system will spend all of its time reading and writing
      pages. It will be working very hard but not getting anything
      done.
      - The programs' progress will make it look as if the access
        time of memory is as slow as disk, rather than the disk being
        as fast as memory.
  ::- Thrashing occurs because the system doesn't know when it has
      taken on more work than it can handle. LRU mechanisms order
      pages in terms of last access, but don't give absolute numbers
      indicating which pages must not be thrown out.
- What do humans do when thrashing? If flunking all courses at
  midterm time, drop one.
  ::- Solutions to thrashing:
      - If a single process is too large for memory, there is nothing
        the OS can do.
        (Buy more memory.)
      - If the problem arises because of the sum of several processes:
        - Figure out how much memory each process needs. Change
          scheduling priorities to run processes in groups whose
          memory needs can be satisfied.
        - Shed load.
        - Change the paging algorithm.
- Working Set Paging:
  ::- Working set = "the set of pages that a process is working with,
      and which must thus be resident if the process is to avoid
      thrashing."
  ::- Formally, "exactly that set of pages used in the preceding T
      virtual time units" (T is usually given in units of memory
      references).
  ::- Choose T, the working set parameter. At any given time, all
      pages referenced by a process in its last T units of execution
      are considered to comprise its working set.
  ::- The working set paging algorithm keeps in memory exactly those
      pages used in the preceding T time units.
      - Minimum values for T are about 10,000 to 100,000 memory
        references.
  ::- A process will never be executed unless its working set is
      resident in main memory. Pages outside the working set may be
      discarded at any time.
      - Note that this requires a reservoir of unassigned page frames.
  ::- Working set paging requires that the sum of the sizes of the
      working sets of the jobs eligible to run (which we will call
      the balance set) be less than or equal to the amount of space
      available. We previously referred to the balance set as the
      jobs in the in-memory queue.
      - What happens if the balance set changes too frequently?
        + Still get thrashing.
  ::- How is working set paging different from LRU?
      - The number of page frames allocated to a process' working set
        is variable.
  ::- How do we implement working set? Can it be done exactly?
      - Take advantage of the use bits.
      - The OS maintains an idle-time value for each page: the amount
        of CPU time received by the process since the last access to
        the page.
      - Every once in a while, scan all pages of a process. For each
        page with the use bit on, clear the page's idle time. For
        each page with the use bit off, add the process' CPU time
        (since the last scan) to its idle time. Turn all use bits off
        during the scan. A page whose idle time reaches T has not
        been used in the last T time units and so is no longer in the
        working set.
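The periodic scan just described can be sketched as follows. This is a sketch under stated assumptions: the `Page` record, the `scan` function name, and the value of T are illustrative, not from the lecture.

```python
T = 100_000  # working set parameter (illustrative value), in CPU time units

class Page:
    def __init__(self):
        self.use_bit = False   # would be set by hardware on each reference
        self.idle_time = 0     # process CPU time since last observed access

def scan(pages, cpu_time_since_last_scan):
    """One scan pass: update idle times, clear use bits, and return
    the pages still considered part of the working set."""
    working_set = []
    for page in pages:
        if page.use_bit:
            page.idle_time = 0                       # referenced recently
        else:
            page.idle_time += cpu_time_since_last_scan
        page.use_bit = False                         # reset for next interval
        if page.idle_time < T:                       # used within last T units
            working_set.append(page)
    return working_set
```

A page whose idle time accumulates to T without an intervening reference drops out of the returned set and its frame can be reclaimed.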
- Scans happen on the order of every few seconds (in Unix, ?? is on
  the order of a minute or more).
- Working Set Restoration:
  ::- The idea is that when we remove a process from the in-memory
      queue, we know what its working set is.
  ::- When we run the process again (i.e. promote it to the in-memory
      queue), we can restore the working set to memory all at once.
      - Advantages:
        + Minimizes CPU overhead.
        + Don't have to wait for each page fault -> all transfers
          happen at once.
        + Can optimize the layout when writing out, and can then
          fetch from consecutive locations.
        + Or can just sort the fetches, so that the average latency
          is much smaller.
  ::- Basically, copy the working set out to disk when the process is
      not running, and copy it back into memory when the process
      needs to run again.
- Page Fault Frequency:
  - Let X be the virtual time since the last page fault for this
    process.
  ::- At the time of a page fault: if X > T, remove all pages (of the
      process) with the use bit off. Then get a page frame for the
      new page, and turn off all reference bits for the process.
  ::- The idea was to make this a quick and easy way to implement
      working set. The idea is that as long as the process is
      faulting too often (
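The fault-time rule above can be sketched as follows. This is an illustrative sketch: the `Process` record, the `on_page_fault` name, and the value of T are assumptions, not from the lecture.

```python
T = 50_000  # PFF threshold (illustrative value), in virtual time units

class Process:
    def __init__(self):
        self.resident = {}            # resident page -> use (reference) bit
        self.last_fault_time = 0      # virtual time of the previous fault

def on_page_fault(proc, page, now):
    """Apply the page-fault-frequency rule at a fault at virtual time now."""
    x = now - proc.last_fault_time    # X: virtual time since the last fault
    if x > T:
        # Faulting infrequently: remove all pages whose use bit is off,
        # i.e. pages not referenced since the last fault.
        for p in [p for p, used in proc.resident.items() if not used]:
            del proc.resident[p]
    # Get a frame for the new page, then turn off all reference bits.
    proc.resident[page] = True
    for p in proc.resident:
        proc.resident[p] = False
    proc.last_fault_time = now
```

When faults come more often than every T time units, the resident set only grows, so a frequently faulting process accumulates frames instead of shrinking.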