Heng Woon Ong                                        Lecture 03/02/05

PAGING ALGORITHMS

How to evaluate paging algorithms:
- Two approaches (metrics) for evaluating paging algorithms:
  ::- Curve of page faults vs. amount of space used (called the
      "parachor curve"). Fewer faults at a given amount of space is
      preferable.

        faults |
               |\
               | \
               |  \
               |   \_
               |     \____
               |          \_______________
               +--------------------------- space

  ::- Space-time product (STP) vs. amount of space. Want to minimize
      the STP.

        STP    |
               |\                      _/
               | \                   _/
               |  \_              __/
               |    \__        __/
               |       \______/
               +--------------------------- space

  ::- Space-time product (STP): the integral of the amount of space
      used by the program over the time it runs, including the time
      spent on page faults. This is the real space-time product.
  ::- Exact formula: STP = integral(0,E) m(t) dt, where E is the
      ending time of the program and m(t) is the memory used by the
      program at (real) time t.
  ::- In discrete time: STP = sum(i=0 to R) m(i) * (1 + f(i)*PFT), where
      - R is the ending time of the program in discrete time (i.e. the
        number of memory references),
      - i is the i'th memory reference,
      - m(i) is the number of pages in memory at the i'th reference,
      - f(i) is an indicator function: f(i) = 0 if there is no page
        fault at reference i, f(i) = 1 if there is,
      - PFT is the page fault time (in units of memory references).
  ::- Disadvantage of this formulation: PFT is variable, since it is
      technologically dependent.
  ::- Figuring out page faults for the 3 different algorithms, shown
      here with 4 page frames (* denotes a page fault):

      Memory references:  4  3  2  1  4  3  5  4  3  2  1  5

      LRU (columns show the LRU stack, most recently used on top):
         *4 *3 *2 *1  4  3 *5  4  3 *2 *1 *5
             4  3  2  1  4  3  5  4  3  2  1
                4  3  2  1  4  3  5  4  3  2
                   4  3  2  1  1  1  5  4  3

      FIFO (rows are page frames):
         *4  4  4  4  4  4 *5  5  5  5 *1  1
            *3  3  3  3  3  3 *4  4  4  4 *5
               *2  2  2  2  2  2 *3  3  3  3
                  *1  1  1  1  1  1 *2  2  2

      OPT (with a fixed number of frames, OPT is the same as MIN):
      remove the page that will not be referenced for the longest
      time in the future.
         *4  4  4  4  4  4  4  4  4  4 *1  1
            *3  3  3  3  3  3  3  3  3  3  3
               *2  2  2  2  2  2  2  2  2  2
                  *1  1  1 *5  5  5  5  5  5

      Number of page faults per algorithm:

          Memory size |  3 |  4 |
          ------------+----+----+
          LRU         | 10 |  8 |
          FIFO        |  9 | 10 |
          OPT         |  7 |  6 |

      Note that FIFO has more faults with 4 frames than with 3; this
      increase of faults with memory size is Belady's anomaly.

Q: Will the LRU scheme ever have fewer page faults with less memory?
A: No.
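The fault counts in the table above can be checked with a short simulation. This is a sketch, not part of the lecture; `count_faults` is an illustrative name, and the OPT branch cheats by looking at the full future reference string (which is why OPT is an offline baseline, not a realizable policy).

```python
def count_faults(refs, frames, policy):
    """Simulate a replacement policy; return the number of page faults.

    policy is 'LRU', 'FIFO', or 'OPT'. For LRU and FIFO the resident
    list is kept in eviction order, so the victim is always index 0.
    """
    memory = []
    faults = 0
    for i, page in enumerate(refs):
        if page in memory:
            if policy == 'LRU':              # refresh recency on a hit
                memory.remove(page)
                memory.append(page)
            continue
        faults += 1
        if len(memory) == frames:            # memory full: pick a victim
            if policy in ('LRU', 'FIFO'):
                memory.pop(0)                # oldest (by recency or arrival)
            else:                            # OPT: evict the page whose next
                future = refs[i + 1:]        # use is farthest away (or never)
                victim = max(memory,
                             key=lambda p: future.index(p)
                             if p in future else len(future) + 1)
                memory.remove(victim)
        memory.append(page)
    return faults

refs = [4, 3, 2, 1, 4, 3, 5, 4, 3, 2, 1, 5]
for policy in ('LRU', 'FIFO', 'OPT'):
    print(policy, count_faults(refs, 3, policy), count_faults(refs, 4, policy))
```

Running this on the lecture's reference string reproduces the table, including FIFO's jump from 9 faults to 10 when the memory grows from 3 to 4 frames.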
- Stack algorithm: an algorithm which obeys the inclusion property:
  the set of pages in a memory of size N at time t is always a subset
  of the set of pages in a memory of size N+1 at time t. Such an
  algorithm obviously cannot have its miss ratio increase with memory
  size.
- Implementing LRU: need some form of hardware support in order to
  keep track of which pages have been used recently.
  ::- Perfect LRU? Keep a register for each page, and store the
      system clock into that register on each memory reference. To
      replace a page, scan through all of the registers to find the
      one with the oldest clock value.
  ::- Or could use a linked list to maintain an "LRU stack". Note
      that we can see (by inspection) that with LRU, the miss ratio
      will never increase with an increasing number of pages in
      memory.
  ::- Perfect LRU is painful to implement, so most systems use an
      approximation of LRU.
- Use bit (reference bit): a bit in the page table entry (usually
  cached in the TLB) that is set by hardware when the page is
  referenced. It is turned off under OS control.
- Clock algorithm: keep a "use" bit for each page frame; hardware
  sets the bit for the referenced page on every memory reference.
  Have a pointer pointing to the k'th page frame. When a fault
  occurs, look at the use bit of the page being pointed to. If it is
  on, turn it off, increment the pointer, and repeat. If it is off,
  replace the page in that page frame and set use(k)=1.
  (Clock diagram: page frames arranged in a circle with their use
  bits; the hand sweeps past frames whose use bit is on, clearing
  each one, and stops at the first frame whose use bit is off. A page
  survives its first sweep after a reference and is replaced on the
  second sweep if it has not been referenced again.)
  ::- Also called FINUFO: first in, not used, first out.
  ::- Suppose we use LFU (least frequently used) instead. Will that
      be good or bad?
      - BAD! Locality changes, so LFU will be disastrous: pages that
        were heavily used in an old locality keep high counts and
        stay resident, while pages of the new locality keep getting
        evicted.
      - Imagine a buffer pool of memory, but the memory is paged.
- Per-process replacement: a per-process replacement algorithm (also
  called a local page replacement algorithm, or per-job replacement
  algorithm) allocates page frames to individual processes: a page
  fault in one process can only replace one of that process' frames.
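The clock algorithm described above can be sketched as follows. This is an illustrative simulation, not from the lecture: the `Clock` class and its method names are mine, and the use bits that hardware would set on every reference are set here inside `reference()`.

```python
class Clock:
    """Sketch of the clock (FINUFO) replacement algorithm."""

    def __init__(self, nframes):
        self.frames = [None] * nframes   # page resident in each frame
        self.use = [False] * nframes     # per-frame use (reference) bit
        self.hand = 0                    # the clock pointer

    def reference(self, page):
        """Reference a page; return True if this reference faults."""
        if page in self.frames:
            # Hit: hardware would set the use bit for this frame.
            self.use[self.frames.index(page)] = True
            return False
        # Fault: sweep the hand, clearing use bits that are on,
        # until a frame with its use bit off is found.
        while self.use[self.hand]:
            self.use[self.hand] = False
            self.hand = (self.hand + 1) % len(self.frames)
        self.frames[self.hand] = page    # replace the victim
        self.use[self.hand] = True       # set use(k)=1 for the new page
        self.hand = (self.hand + 1) % len(self.frames)
        return True

clock = Clock(3)
faults = sum(clock.reference(p) for p in [4, 3, 2, 1, 4, 3, 5, 4, 3, 2, 1, 5])
print(faults)
```

On the lecture's reference string with 3 frames this approximation happens to fault slightly less often than LRU does, which illustrates that clock tracks recent use only coarsely, via one bit per frame rather than a full ordering.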
  Per-process (local) replacement relieves interference from other
  processes.
- If all pages from all processes are lumped together by the
  replacement algorithm, it is said to be a global replacement
  algorithm. Under this scheme, each process competes with all of the
  other processes for page frames.
  ::- Local algorithm:
      - Protects jobs from others which are badly behaved.
      - Hard to decide how much space to allocate to each process.
      - Allocation may be unreasonable.
  ::- Global algorithm:
      - Permits the memory allocation for a process to shift over time.
      - Permits the memory allocation to adapt to process needs.
      - Permits a badly behaved process to grab too much memory.
Q: What is used nowadays?
A: Usually, these days, a global algorithm is used.
- Thrashing: a situation in which the page fault rate is so high that
  the system spends most of its time either processing a page fault
  or waiting for a page to arrive.
  ::- It means there is too much page-fetch idle time: the processor
      sits idle waiting for pages to arrive.
  ::- Suppose there are many users, and that between them their
      processes are making frequent references to 50 pages, but
      memory has only 40 page frames.
  ::- Each time one page is brought in, another page, whose contents
      will soon be referenced, is thrown out.
  ::- The system will spend all of its time reading and writing
      pages. It will be working very hard but not getting anything
      done.
      - The programs' progress will make it look as if the access
        time of memory is as slow as disk, rather than the disk being
        as fast as memory.
  ::- Thrashing occurs because the system doesn't know when it has
      taken on more work than it can handle. LRU mechanisms order
      pages in terms of last access, but don't give absolute numbers
      indicating which pages must not be thrown out.
- What do humans do when thrashing? If flunking all courses at
  midterm time, drop one.
  ::- Solutions to thrashing:
      - If a single process is too large for memory, there is nothing
        the OS can do.
        (Buy more memory.)
      - If the problem arises because of the sum of several processes:
        - Figure out how much memory each process needs. Change
          scheduling priorities to run processes in groups whose
          memory needs can be satisfied.
        - Shed load.
        - Change the paging algorithm.
- Working Set Paging:
  ::- Working set = "the set of pages that a process is working with,
      and which must thus be resident if the process is to avoid
      thrashing."
  ::- Formally, "exactly that set of pages used in the preceding T
      virtual time units" (T is usually given in units of memory
      references).
  ::- Choose T, the working set parameter. At any given time, all
      pages referenced by a process in its last T units of execution
      are considered to comprise its working set.
  ::- The working set paging algorithm keeps in memory exactly those
      pages used in the preceding T time units.
      - Minimum values for T are about 10,000 to 100,000 memory
        references.
  ::- A process will never be executed unless its working set is
      resident in main memory. Pages outside the working set may be
      discarded at any time.
      - Note that this requires a reservoir of unassigned page frames.
  ::- Working set paging requires that the sum of the sizes of the
      working sets of the jobs eligible to run (which we will call
      the balance set) be less than or equal to the amount of space
      available. We previously referred to the balance set as the
      jobs in the in-memory queue.
      - What happens if the balance set changes too frequently?
        + Still get thrashing.
  ::- How is working set paging different from LRU?
      - The number of page frames allocated to a process' working set
        is variable.
  ::- How do we implement working set? Can it be done exactly?
      - Take advantage of the use bits.
      - The OS maintains an idle-time value for each page: the amount
        of CPU time received by the process since the last access to
        the page.
      - Every once in a while, scan all pages of a process. For each
        page with the use bit on, clear the page's idle time. For
        each page with the use bit off, add the process' CPU time
        (since the last scan) to its idle time. Turn all use bits off
        during the scan. A page whose idle time reaches T has not
        been used in the last T time units and so is no longer in the
        working set.
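The periodic scan just described can be sketched as follows. This is a sketch under stated assumptions: the `Page` record, the `scan` function name, and the value of T are illustrative, not from the lecture.

```python
T = 100_000  # working set parameter (illustrative value), in CPU time units

class Page:
    def __init__(self):
        self.use_bit = False   # would be set by hardware on each reference
        self.idle_time = 0     # process CPU time since last observed access

def scan(pages, cpu_time_since_last_scan):
    """One scan pass: update idle times, clear use bits, and return
    the pages still considered part of the working set."""
    working_set = []
    for page in pages:
        if page.use_bit:
            page.idle_time = 0                       # referenced recently
        else:
            page.idle_time += cpu_time_since_last_scan
        page.use_bit = False                         # reset for next interval
        if page.idle_time < T:                       # used within last T units
            working_set.append(page)
    return working_set
```

A page whose idle time accumulates to T without an intervening reference drops out of the returned set and its frame can be reclaimed.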
- Scans happen on the order of every few seconds (in Unix, ?? is on
  the order of a minute or more).
- Working Set Restoration:
  ::- The idea is that when we remove a process from the in-memory
      queue, we know what its working set is.
  ::- When we run the process again (i.e. promote it to the in-memory
      queue), we can restore the working set to memory all at once.
      - Advantages:
        + Minimizes CPU overhead.
        + Don't have to wait for each page fault -> all transfers
          happen at once.
        + Can optimize the layout when writing out, and can then
          fetch from consecutive locations.
        + Or can just sort the fetches, so that the average latency
          is much smaller.
  ::- Basically, copy the working set out to disk when the process is
      not running, and copy it back into memory when the process
      needs to run again.
- Page Fault Frequency:
  - Let X be the virtual time since the last page fault for this
    process.
  ::- At the time of a page fault: if X > T, remove all pages (of the
      process) with the use bit off. Then get a page frame for the
      new page, and turn off all reference bits for the process.
  ::- The idea was to make this a quick and easy way to implement
      working set. The idea is that as long as the process is
      faulting too often (
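The fault-time rule above can be sketched as follows. This is an illustrative sketch: the `Process` record, the `on_page_fault` name, and the value of T are assumptions, not from the lecture.

```python
T = 50_000  # PFF threshold (illustrative value), in virtual time units

class Process:
    def __init__(self):
        self.resident = {}            # resident page -> use (reference) bit
        self.last_fault_time = 0      # virtual time of the previous fault

def on_page_fault(proc, page, now):
    """Apply the page-fault-frequency rule at a fault at virtual time now."""
    x = now - proc.last_fault_time    # X: virtual time since the last fault
    if x > T:
        # Faulting infrequently: remove all pages whose use bit is off,
        # i.e. pages not referenced since the last fault.
        for p in [p for p, used in proc.resident.items() if not used]:
            del proc.resident[p]
    # Get a frame for the new page, then turn off all reference bits.
    proc.resident[page] = True
    for p in proc.resident:
        proc.resident[p] = False
    proc.last_fault_time = now
```

When faults come more often than every T time units, the resident set only grows, so a frequently faulting process accumulates frames instead of shrinking.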