Johnny Zhou
Lecture 2/28/07

TOPIC: DEMAND PAGING, THRASHING, WORKING SETS
(BEGINNING OF DISCUSSION OF SPECIFIC ALGORITHMS FOR MANAGING MEMORY)
==================================================================================================================

** History
-Memory used to be $1 million per MB
-Now about 10 cents per MB
-These algorithms were originally motivated by the cost of memory
    -Needed to maximize efficiency
-Wasteful for a process to be completely loaded into memory before running
-Virtual memory permits a process to run with only some of its virtual address space loaded into physical memory
-Storage hierarchy:
    -CPU --> Cache --> Main memory --> Disk --> Tape/Network
    -Each level down is larger, but slower
        -Cache: < 10 ns
        -CPU cycle: ~1/3 ns
        -Main memory: 667 MHz
        -Disk: 5-50 ms
        -Tape, network: extremely slow
    -Each level provides the illusion of having the address space of the level below it
    -Need to produce the illusion that the whole virtual address space is in main memory
-Review: Principle of Locality
    -Temporal locality: recently used information is likely to be used again soon
    -Spatial locality: information near recently used information is likely to be used soon
-If the process is not entirely loaded, hardware and software cooperate to make this work
    -Valid bit, reference bit, protection bits, and dirty bit are used to manage the interaction

** If a page is not found in main memory, the following happens:
-Trap to the OS
-Verify that the reference is to a valid page; otherwise abnormal-end (abend) the process
-Find a page frame to put the page in
    -If there is no empty frame, choose a page to replace (replacement methods discussed later)
        -If it is dirty, write it to disk
        -Remove the page
        -Update the page table
        -Update the map of secondary storage if necessary
        -Update the memory (core) map
        -Flush the TLB entry for the page that has been removed
-OS brings the page into memory
    -Find the page on secondary storage (disk, tape)
    -Transfer it to memory
    -Update the page table (set valid bit and real address)
    -Update the map of secondary storage to show that the page is now in memory
    -Update the core map
-Process resumes execution
-Note: this takes a long time, and the transfer can be preempted
-(A code sketch of this sequence appears after the page fetch algorithm section below)

** Multiprogramming is supposed to overlap the fetch of a page for one process with the execution of another
-If no other process is available to run, that time is called multiprogramming/page-fetch idle
-@Page out: to remove a page from memory
-@Page out/in a process: remove an entire process from memory, or load it back in
-Resuming a process is very tricky, since the page fault may have occurred in the middle of an instruction
    -Don't want the user process to be aware that the page fault even happened
    -Can the instruction simply be re-executed from the beginning? Is that legal and correct?
    -Example: mvc string1, string2 (move characters instruction)
        | abcdef | ---------------------> | ghijk |
        -mvc moves the string character by character
        -What if you are moving a string one byte to the left or right, so source and destination overlap? If you stop and restart in the middle, you will get different results
-Need both a page fetch policy and a page replacement policy

** Page fetch algorithm
-@Demand paging: start up the process with no pages loaded; load a page only when a page fault occurs (when it absolutely MUST be in memory)
-@Request paging: let the user say which pages are needed
    -This is bad because users don't know best, and aren't always impartial
    -They will usually overestimate
    -Overlays are even more draconian than request paging
    -Still need demand paging as a backstop, in case the user forgets to bring a page into memory
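(Illustration, not from the lecture: a minimal Python sketch of the demand-paging fault sequence described above. The class and method names are made up; it assumes a single process, no TLB, and FIFO replacement purely to keep the victim choice simple.)

    class PageTableEntry:
        def __init__(self):
            self.valid = False     # is the page currently in a physical frame?
            self.dirty = False     # has it been modified since it was loaded?
            self.frame = None      # physical frame number, if valid

    class DemandPager:
        def __init__(self, num_frames):
            self.num_frames = num_frames
            self.page_table = {}   # virtual page number -> PageTableEntry
            self.resident = []     # pages currently in frames, in arrival (FIFO) order

        def access(self, page, write=False):
            entry = self.page_table.setdefault(page, PageTableEntry())
            if not entry.valid:
                self.handle_fault(page, entry)   # "trap to OS"
            if write:
                entry.dirty = True
            return entry.frame

        def handle_fault(self, page, entry):
            if len(self.resident) >= self.num_frames:
                victim = self.resident.pop(0)        # FIFO victim, for simplicity
                v_entry = self.page_table[victim]
                if v_entry.dirty:
                    v_entry.dirty = False            # a real OS would write the victim to disk here
                v_entry.valid = False                # update page table; the victim's TLB entry would be flushed
                entry.frame = v_entry.frame          # reuse the freed frame
            else:
                entry.frame = len(self.resident)     # take an empty frame
            # ...the page would be read from secondary storage into entry.frame here...
            entry.valid = True
            entry.dirty = False
            self.resident.append(page)               # record the page as resident (core map)

With demand paging, DemandPager(num_frames) starts with nothing resident, so the first access to each page faults, matching the "start up process with no pages loaded" policy above.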
** Prefetching or prepaging: bring a page into memory before it is referenced
-Advantages:
    -Can bring in several pages at once
    -Eliminates the real-time delay of waiting for the page
-Idea is to guess which page will be needed next
-Hard to do effectively without good prediction; may spend a lot of time doing wasted work
-Seldom works, because you just can't predict the future
-Can do swap-style prefetching, where you swap in a set of pages when the process is switched in

** Overlays
-A technique by which the user divides his program into segments
-The user issues commands to load and unload the segments from memory
-These commands specify the location in memory where the segments are placed
-Used when there is no virtual memory, and the user is given a partition of real memory to work with

****** Page replacement algorithms
1) Random (RAND): pick any page frame at random
2) FIFO (First In, First Out): throw out the page that has been in memory the longest
    -Simple; the page fetched first is believed to be no longer needed
3) LRU (Least Recently Used): throw out the page that has gone unreferenced for the longest time
    -Use the past to predict the future (locality)
    -The page used least recently is the least likely to be used again in the near future
4) MIN (or OPT): throw out the page that won't be used for the longest time into the future
    -Not practical, because we can't predict the future
    -This would be the optimal algorithm, if it were implementable
    -Used as a baseline for comparison with realistic algorithms

-Real and virtual time
    -@Virtual time: time as measured by a running process
        -Doesn't include time the process is blocked (for page faults or resource waits)
        -Often measured in units of memory references
    -@Real time: time as measured by a real (wall) clock

*** Evaluating paging algorithms
-Costs of a page fault:
    -CPU overhead for the page fault
        -Fault handler, dispatcher, I/O routines (~3,000 instructions; could be more nowadays)
    -Possible CPU idle time while the page is being brought in
    -I/O busy during the page transfer
    -Main memory interference while the page is transferred
    -Real-time delay to handle the page fault
-Two metrics for evaluating paging algorithms:
1) Curve of page faults vs. amount of space used (preferable)
    -Called a "parachor curve"
    -Looks roughly like a decaying exponential (# of page faults vs. memory allocated)
2) Space-time product (STP) vs. amount of space (goal: minimize STP)
    -@STP: integral of the amount of space used by the program over the time it runs
        -Includes time spent handling page faults; this is the real space-time product (area under the curve)
    -Exact formula: STP = integral from 0 to E of m(t) dt
        -E is the ending (real) time of the program
        -m(t) is the memory used by the program at real time t
    -In discrete time: STP = sum from i = 1 to R of m(i) * (1 + f(i) * PFT)
        -R is the ending time of the program in discrete time (number of memory references)
        -i is the i'th memory reference
        -m(i) is the number of pages in memory at the i'th reference
        -f(i) is an indicator function: 0 if there is no page fault at reference i, 1 if there is
        -PFT is the page fault time
        -Note: the m(i)*1 term gives the virtual space-time product; the m(i)*f(i)*PFT term adds the page fault time
    -STP can be computed approximately from the page fault vs. space curve
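(Illustration, not from the lecture: a short Python sketch of the discrete STP formula above, STP = sum over references i of m(i) * (1 + f(i) * PFT). The trace and the PFT value are made up purely to show the arithmetic.)

    PFT = 3000   # assumed page fault time, in memory-reference time units

    # one (pages_resident, faulted?) pair per memory reference -- made-up trace
    trace = [(1, True), (2, True), (3, True), (3, False), (3, False), (3, True)]

    virtual_stp = sum(m for m, _ in trace)                         # sum of m(i): space over virtual time
    fault_term  = sum(m * PFT for m, faulted in trace if faulted)  # sum of m(i)*f(i)*PFT
    stp = virtual_stp + fault_term                                 # real space-time product

    print(f"virtual STP = {virtual_stp}, real STP = {stp}")

The first sum is the virtual space-time product; the second term charges m(i) pages of memory for PFT extra time units at each faulting reference, matching the note above.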
    -Approximation: STP = (virtual running time of the program + PFT * number of page faults) * (mean space occupied by the program, n-bar)
        -PFT is the time for a page fault to be handled
        -The space-time product depends on PFT, so it is technology dependent

*** Page replacement examples
-Page reference sequence: 4, 3, 2, 1, 4, 3, 5, 4, 3, 2, 1, 5
-* indicates a page fault

1) LRU with 4 page frames (frames ordered top to bottom from most to least recently used)

    4   3   2   1   4   3   5   4   3   2   1   5
    ===============================================
    4*  3*  2*  1*  4   3   5*  4   3   2*  1*  5*
        4   3   2   1   4   3   5   4   3   2   1
            4   3   2   1   4   3   5   4   3   2
                4   3   2   1   1   1   5   4   3

    -8 page faults

2) LRU with 3 page frames (frames ordered top to bottom from most to least recently used)

    4   3   2   1   4   3   5   4   3   2   1   5
    ===============================================
    4*  3*  2*  1*  4*  3*  5*  4   3   2*  1*  5*
        4   3   2   1   4   3   5   4   3   2   1
            4   3   2   1   4   3   5   4   3   2

    -10 page faults

3) FIFO with 4 page frames (frames in no special order)

    4   3   2   1   4   3   5   4   3   2   1   5
    ===============================================
    4*  4   4   4   4   4   5*  5   5   5   1*  1
        3*  3   3   3   3   3   4*  4   4   4   5*
            2*  2   2   2   2   2   3*  3   3   3
                1*  1   1   1   1   1   2*  2   2

    -10 page faults

4) FIFO with 3 page frames (frames in no special order)

    4   3   2   1   4   3   5   4   3   2   1   5
    ===============================================
    4*  4   4   1*  1   1   5*  5   5   5   5   5
        3*  3   3   4*  4   4   4   4   2*  2   2
            2*  2   2   3*  3   3   3   3   1*  1

    -9 page faults (weird: fewer page frames yielded fewer page faults -- Belady's anomaly)
    -We will see certain types of algorithms with which this may happen

5) MIN/OPT with 4 page frames (frames ordered by how soon a page will be used again, one ordering for each subset of frames { (1), (1,2), (1,2,3), (1,2,3,4) })
    -The higher up a page is, the sooner it will be used again
    -Like running 4 algorithms in parallel, one for each subset of frames

    4   3   2   1   4   3   5   4   3   2   1   5
    ===============================================
    4*  3*  2*  1*  4   3   5*  4   3   2   1*  5
        4   4   4   1   4   4   5   5   5   5   1
            3   3   3   1   3   3   4   3   2   2
                2   2   2   2   2   2   4   3   3

    -6 page faults

6) MIN/OPT with 3 page frames (frames ordered by how soon a page will be used again, one ordering for each subset of frames { (1), (1,2), (1,2,3) })
    -The higher up a page is, the sooner it will be used again
    -Like running 3 algorithms in parallel, one for each subset of frames

    4   3   2   1   4   3   5   4   3   2   1   5
    ===============================================
    4*  3*  2*  1*  4   3   5*  4   3   2*  1*  5
        4   4   4   1   4   4   5   5   5   5   1
            3   3   3   1   3   3   4   3   2   2

    -7 page faults
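(Illustration, not from the lecture: a small Python simulator for the FIFO, LRU, and MIN/OPT policies on the same reference string as the examples above. Running it is a quick way to check the fault counts worked out by hand: LRU 8/10, FIFO 10/9, MIN 6/7 faults with 4/3 frames.)

    def count_faults(refs, num_frames, policy):
        resident = []                 # pages currently in frames
        faults = 0
        for i, page in enumerate(refs):
            if page in resident:
                if policy == "LRU":   # on a hit, move the page to the most-recently-used end
                    resident.remove(page)
                    resident.append(page)
                continue
            faults += 1
            if len(resident) >= num_frames:
                if policy == "FIFO":
                    victim = resident[0]      # oldest arrival (hits never reorder the list)
                elif policy == "LRU":
                    victim = resident[0]      # least recently used end of the list
                else:                         # MIN/OPT: evict the page whose next use is furthest away (or never)
                    future = refs[i + 1:]
                    victim = max(resident,
                                 key=lambda p: future.index(p) if p in future else len(future) + 1)
                resident.remove(victim)
            resident.append(page)
        return faults

    refs = [4, 3, 2, 1, 4, 3, 5, 4, 3, 2, 1, 5]
    for policy in ("LRU", "FIFO", "MIN"):
        for frames in (4, 3):
            print(f"{policy} with {frames} frames: {count_faults(refs, frames, policy)} faults")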