CS 162 Lecture Notes
Prof. Alan Jay Smith
Topic: Sharing Main Memory -- Segmentation and Paging

+ How do we allocate memory to processes?
+ 1. Simple uniprogramming with a single segment per process:
  + One program in memory at a time. (Can actually multiprogram by swapping programs.)
  + Highest memory holds OS.
  + Process is allocated memory starting at 0 (or J), up to (or from) the OS area.
  + Process always loaded at 0.
  + Examples: early batch monitors where only one job ran at a time and all it could do was wreck the OS, which would be rebooted by an operator. Many of today's personal computers also operate in a similar fashion.
  + Advantages
    + Low overhead
    + Simple
    + No need to do relocation. Always loaded at zero.
  + Disadvantages
    + No protection - process can overwrite OS
      + which means it can get complete control of system
    + Multiprogramming requires swapping entire process in and out
      + Overhead for swapping
      + Idle time while swapping
      + Example: CTSS ("Compatible" Time-Sharing System), which swapped users completely.
    + Process limited to size of memory
    + No good way to share - only one process at a time (can't even overlap CPU and I/O, since only one process in memory.)
+ 2. Relocation - load program anywhere in memory.
  + Idea is to use loader or linker to load program at an arbitrary memory address.
  + Note that program can't be moved (relocated) once loaded. (WHY??)
  + This scheme (#2) is essentially the same as #1, but the ability to load at any address will be used in #3 below.
+ 3. Simple multiprogramming with static software relocation, no protection, one segment per process:
  + Highest or lowest memory holds OS.
  + Processes allocated memory starting at 0 (or N), up to the OS area.
  + When a process is initially loaded, link it so that it can run in its allocated memory area.
  + Can have several programs in memory at once, each loaded at a different (non-overlapping) address.
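As a rough illustration of the static relocation used in schemes 2 and 3, here is a minimal sketch of a loader that patches addresses at load time. The image format, the `reloc_offsets` list, and the function name `load_static` are assumptions made for the example, not something the notes specify.

```python
def load_static(image, reloc_offsets, memory, load_base):
    """Copy `image` into `memory` at `load_base`, patching address words.

    `image` is a list of words linked as if the program started at 0.
    `reloc_offsets` (hypothetical format) lists which words hold addresses.
    """
    if load_base < 0 or load_base + len(image) > len(memory):
        raise ValueError("program does not fit at the requested address")
    relocatable = set(reloc_offsets)
    for i, word in enumerate(image):
        # Address words get the load base added; plain data is copied as-is.
        memory[load_base + i] = word + load_base if i in relocatable else word


memory = [0] * 32
# Hypothetical 4-word program: word 1 holds the address of word 3.
image = [7, 3, 9, 42]
load_static(image, reloc_offsets=[1], memory=memory, load_base=16)

target = memory[17]      # the patched address stored in word 1
value = memory[target]   # follows the pointer inside the loaded program
```

After loading, a patched address word looks like any other data word, which is one reason a statically relocated program cannot simply be moved later.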
  + Advantages:
    + Allows multiprogramming without swapping processes in and out.
    + Makes better use of memory
    + Higher CPU use due to more efficient multiprogramming.
  + Disadvantages
    + No protection - jobs can read or write others.
    + External fragmentation
    + Overhead for variable size memory allocation.
    + Still limited to size of physical memory.
    + Hard to increase amount of memory allocation.
    + Programs are statically loaded - are tied to fixed locations in memory. Can't be moved or expanded. If swapped out, must be swapped back in to same location.
+ 4. Dynamic memory relocation: instead of changing the addresses of a program before it's loaded, change the address dynamically during every reference.
  + Figure of a processor and a memory box, with a memory relocation box in between.
  + There are many types of relocation - to be discussed.
  + Under dynamic relocation, each program-generated address (called a logical or virtual address) is translated in hardware to a physical, or real, address. This happens as part of each memory reference.
    + Virtual (logical) address is what the program generates.
      + Virtual address space is set of (legal) virtual addresses the program can generate.
    + Physical (real) addresses - set of addresses in physical memory.
      + Physical address space of program - set of physical addresses it can get to.
      + Physical address space of machine - set of addresses in physical memory.
  + Dynamic relocation leads to two views of memory, called address spaces. We have the virtual address space and the real address space. Each process has its own virtual address space. With static relocation we force the views to coincide. In some systems, there are several levels of mapping.
  + Several types of dynamic relocation.
  + Base & bounds relocation:
    + Two hardware registers: base register for process, bounds register that indicates the last valid address the process may generate.
    + On each memory reference, the virtual address is compared to the bounds register.
    + In parallel, the real address is generated by adding it to the base register.
      + Real address = base + virtual address, IF 0 <= virtual_address <= bounds.
      + This is a form of translation.
      + Discuss why comparison is done in parallel.
    + Advantages:
      + Each process appears to have a completely private memory of size equal to the bounds register plus 1.
      + Processes are protected from each other.
      + No address relocation is necessary when a process is loaded.
      + Task switching is very cheap, when done between processes in memory - just reload processor registers.
        + Higher overhead to load process from disk.
      + Compaction is possible.
    + Disadvantages:
      + Still limited to size of main memory.
      + External fragmentation (between processes)
      + Overhead for allocating variable size spaces in memory.
      + Sharing difficult - only possible if bases & bounds overlap.
      + Only one "segment" - i.e. one region of memory.
      + New, special hardware needed for relocation.
      + Time to do relocation (it isn't free).
    + OS must be able to change value of relocation registers (why?).
      + OS loads new process and sets base and bounds registers.
      + OS schedules process, and sets base and bounds registers.
      + When tasks are switched, must be able to swap base, bounds and PC registers simultaneously.
      + These imply that OS must run with base and bounds relocation turned off - otherwise, would affect itself when running. (Or would need its own set of base and bounds registers.)
      + Use of base and bounds controlled by status bit, usually in PSW or SSW, or similar control register.
    + Users must not be able to change values of base and bounds registers
      + Otherwise, no protection between users. Can trash others or OS.
    + Problem: how does OS regain control once it has given it up?
      + OS is entered on trap (including SVC) or interrupt.
      + When OS is entered, use of base and bounds must be disabled. (I.e. bit in PSW is reset.)
      + Typically, trap handler loads new control register values.
    + Base & bounds is cheap -- only 2 registers -- and fast -- the add and compare can be done in parallel.
    + Examples: CRAY-1. IBM 7040/7090.
  + Can consider three types of systems using base and bounds registers:
    + Uniprogramming - single user region. Bring a user in, and run him.
    + Multiprogramming with Fixed Partitions (OS/MFT) - partition memory into fixed regions (may be different sizes). User goes into region of given size.
      + Not very flexible.
      + IBM OS circa 1965-68
    + Multiprogramming with Variable Partitions (OS/MVT) - partitions are dynamically variable.
      + IBM OS circa 1967-72.
    + Note that we can do any of the three above schemes without base and bounds registers - just load programs into region at appropriate base address.
  + Task Switching
    + We can now switch between processes very cheaply - don't have to reload memory, just change contents of process control block (which now has values of base and bounds registers).
    + We can also run processes which are not in memory - how?
      + Find empty area of memory in which to place process - how?
      + Remove one or more processes from memory, if necessary, in order to find space.
        + (I.e. copy the removed processes to space on disk.)
      + Copy new process (from disk) into memory.
    + If only one process fits in memory, have to wait for swap to take place.
    + If several processes fit in memory, can swap one while executing the other.
+ 5. Multiple segments - Segmentation.
  + Divide virtual address space into several "segments".
    + This is not the same as the "segments" of linkers and loaders.
  + Use a separate base and bound for each segment, and also add protection bits (read, write, execute), and valid bit. (Also will want dirty bit.)
  + Each address is now a (segment, offset) pair.
  + Each memory reference indicates a segment and offset in one or more of three ways:
    + Top bits of address select segment, low bits the offset. This is the most common, and the best.
    + Or, segment is selected implicitly by the instruction (e.g. code vs. data, stack vs. data, which base register is used, or 8086 prefixes).
    + Instruction specifies directly or indirectly a base register for the segment.
  + Subprograms (procedures, functions, etc.) can be separate segments.
  + Segments typically are associated with logical partitions of your process address space - e.g. code, data, stack. Or, each module or procedure can be a separate segment.
  + Need either segment table or segment registers to hold the base and bounds for each segment.
    + Draw picture of segment table, with segment table entries.
  + Memory mapping procedure consists of table lookup, add, and compare.
  + Example: PDP-10 with high and low segments selected by high-order address bit.
  + Address translation for segmentation
    + Have segment table - maps segment number to [segment base address, segment length (limit), protection bits, valid bit, reference bit, dirty bit]
      + This info is in Segment Table Entry (STE)
      + Diagram of segment table.
      + Segment descriptor
    + Need some hardware to automatically map virtual (segment number, word number) to real address.
      + Real address = base address from segment_table(segment #) + word number.
      + Invalid if word_number > limit. (Note that the test compares the word number with the limit directly, without adding the base to both.)
      + Also valid bit must be on, and permission bits must permit access.
      + Need more hardware to make it go fast (discuss later)
    + Have Segment Table Base Register (STBR) point to base of segment table (for hardware to use)
  + Alternate approach - if there are a small number of segments, can have segment registers - one register per segment.
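The segment translation steps above (table lookup, valid and permission checks, limit compare, add) can be sketched roughly as follows. This is only an illustration: the `STE` field names, the exception classes, and the use of a Python list as the segment table are assumptions, and real hardware performs these checks in parallel rather than in sequence.

```python
from dataclasses import dataclass


class SegmentFault(Exception):
    """Segment is not present (valid bit off)."""


class ProtectionFault(Exception):
    """Reference is outside the segment or not permitted."""


@dataclass
class STE:
    base: int           # real address where the segment starts
    limit: int          # last valid word number within the segment
    perms: str          # subset of "rwx"
    valid: bool = True


def translate(segment_table, seg, word, access):
    """Map (segment number, word number) to a real address."""
    if seg < 0 or seg >= len(segment_table):
        raise ProtectionFault(f"no such segment {seg}")
    ste = segment_table[seg]
    if not ste.valid:
        raise SegmentFault(seg)
    if access not in ste.perms:
        raise ProtectionFault(f"{access!r} not permitted on segment {seg}")
    # Compare the word number with the limit directly; the base is not involved.
    if word < 0 or word > ste.limit:
        raise ProtectionFault(f"word {word} beyond limit {ste.limit}")
    return ste.base + word


table = [
    STE(base=1000, limit=99, perms="rx"),            # code
    STE(base=5000, limit=49, perms="rw"),            # data
    STE(base=0, limit=0, perms="", valid=False),     # not loaded
]
real = translate(table, 1, 10, "w")
```

With a single entry in the table and no permission bits, this reduces to the base and bounds scheme described earlier.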
  + Can also multiplex a small number of segment registers among a large number of segments (as with x86 architecture)
  + Advantages
    + Each process has own virtual address space
    + Protection between address spaces
    + Separate protection between segments (R/W/E)
    + Virtual space can be larger than physical memory
    + Unused segments don't need to be loaded. Can load segments as needed.
      + Attempt to reference missing segment called segment fault.
      + Discuss segment faults later.
    + Can share one or more segments
      + sharing is tricky - we'll talk about this later
    + Segments can be placed anywhere in memory that they fit.
    + Memory compaction easy.
    + Segment sizes can be changed independently.
  + Disadvantages
    + Each segment must be allocated contiguously.
      + Segment size < memory size
      + External fragmentation
    + Overhead of allocating memory
    + Need hardware for address translation
    + Overhead (time/hardware) of doing address translation
    + More complicated.
    + Space for segment table.
  + Note that segment tables are usually 1-1 with processes. A segment table defines a process's address space.
    + What would happen if all processes shared the segment table?
      + Protection is a problem
      + Have same problem as before - now we have to allocate shared virtual instead of shared physical memory.
    + When we switch processes, we reload the STBR (segment table base register), which changes address space.
  + Processes vs. Threads
    + A process is a single flow of control associated one to one with an address space.
    + A thread is a single flow of control. There may be several threads within an address space.
    + Threads are considered lightweight, because the overhead of creating a thread is usually much less than that to create a process. Cost to communicate between threads in same address space is very low. Cost to communicate between different address spaces is high (e.g. pipe, file).
    + Threads in one address space share code and data.
      Threads do not usually share stack - usually each has its own. Can synchronize using P and V without involving the operating system.
    + Processes do not normally share, so P&V must use OS as intermediary.
    + To use threads, usually have constructs like fork, join, signal, wait, broadcast.
  + Managing segments:
    + Keep copy of segment table in process control block (or if block is too small, associated with it).
    + When creating process, define segments in segment table/PCB.
    + When process is assigned memory, figure out where each segment goes, and put base and bounds into segment table.
    + Need memory map, which maps memory to segments. (Segment table maps segments to memory.) Also called core map.
    + When switching contexts, save segment table or pointer to it in old process's PCB, reload it from new process's PCB.
    + When process dies, return segments to free pool.
    + When there's no space to allocate a new segment:
      + Compact memory (move all segments, update bases) to get all free space together.
      + Or, swap one or more segments to disk to make space (must then check during context switching and bring segments back in before letting process run).
    + To enlarge segment:
      + See if space above segment is free. If so, just update the bound and use that space.
      + Or, move the segment above this one to disk, in order to make the memory free.
      + Or, move this segment to disk and bring it back into a larger hole (or, maybe just copy it to a larger hole).
      + Or, move it down, if there is space below.
  + Can load segments only when needed.
    + Segment Fault - an attempt to reference a segment which is not present.
      + Trap to OS
      + Find space for segment - replace another one, if necessary
      + Load segment (remove other segments to make space, if necessary)
      + Set valid bit = 1, and update other entries in STE.
      + Make process ready.
+ Paging: goal is to make allocation and swapping easier, and to reduce memory fragmentation.
  + Make all chunks of virtual memory the same size, call them pages. Typical sizes range from 512 to 16K bytes.
  + Divide real memory into page frames, which are the same size as pages.
    + I will frequently be sloppy and say "page" when I mean "page frame".
  + Virtual address typically now consists of N bits, partitioned as K (page number) and N-K (byte within page).
  + For each process, a page table defines the base address of each of that process's pages. Each page table entry contains bits for the real address of the page, plus protection, valid, reference, and dirty bits.
    + Diagram of page table - see figure
    + Page table base register points to base of page table.
  + Translation process: page number always comes directly from the (virtual) address. Since page size is a power of two, no comparison or addition is necessary. Just do table lookup and bit substitution.
    + Diagram of translation process
    + No limit field is needed or used. (just overflow to next page)
  + We will need a table (page map, core map, or memory map) telling us who owns which page frame in memory. Points back to any page table that points to this page.
  + Not all of a process's memory has to be loaded into real memory. If one attempts to reference a location not in memory, it is prevented by a page fault - this condition is detected by the valid bit.
    + Same as before with segment fault.
    + Page fault - trap condition. Detected by hardware when valid bit is off.
      + Trap to OS (trap, not interrupt)
      + OS finds page frame (somehow - discussed later),
      + gets page (reads from disk),
      + updates page table,
      + makes process ready.
  + Pages and paging are used to produce a physical partitioning of the process address space and memory. There usually isn't any relation between page boundaries and what is in a page.
  + Advantages
    + Easy to allocate: keep a free list of available page frames and grab the first one.
    + No external fragmentation.
    + When combined with segmentation (discussed later): noncontiguous allocation of segments.
    + Permits process to have virtual space much larger than physical space.
    + Permits pages to be loaded as/when needed.
  + Disadvantages
    + Internal fragmentation: page size doesn't match up with information size. The larger the page, the worse this is.
    + Hardware for address translation.
    + Time for address translation.
    + Page faults may cause considerable overhead.
      + What happens when we have a page fault (missing page)? - to be discussed later.
    + Need for page replacement algorithm.
      + We need algorithms to decide when to move pages into and out of memory. (discussed later).
    + Table space: if pages are small, the table space could be substantial. In fact, this is a problem even for normal page sizes: consider a 32-bit address space with 1K pages (that is 2^22, about 4 million, page table entries). What if the whole table has to be present at once?
      + 1. Partial solution: keep base and bounds for page table, so only large processes have to have large tables.
      + 2. Usual solution: make page table two level. (see figure 6) Map high order bits through first table, and lower order page number bits through 2nd table.
        + First level table can be called page directory, or segment table (confusing usage). Second level table usually called page table.
      + 3. Put user page tables in OS virtual memory - then unneeded pages are not allocated.
        + Note that this yields a 2 level page table - address is mapped through OS page table and then user page table.
      + 4. Make page table a hash table (done by IBM and HP)
        + Called inverted page table
    + Efficiency of access: even small page tables are generally too large to load into fast memory in the relocation box. Instead, page tables are kept in main memory and the relocation box only has the page table's base address. It thus takes one overhead reference for every real memory reference. If page table is two level, requires two extra references.
  + Where are the page tables?
    + Page tables are either referred to with real addresses, or OS virtual addresses.
    + Cannot be put where users can get to them.
      + Otherwise, users could change values, which would bypass protection.
    + Page table entries are usually real addresses (including addresses of first level page table, and PTBR.)
      + Could have OS virtual addresses in entries, which means that another level of translation is needed.
  + Is the OS paged?
    + Yes - advantages for users also apply to OS.
  + Can page tables be paged out?
    + Sure - why not?
    + But if page tables are in OS's virtual memory, and page tables have OS address space virtual addresses in them, then translation of user virtual address also requires OS virtual addr. translation.
      + This might require a recursive page fault.
      + Means that OS page tables must be in real memory and use real addresses.
    + Alternative is to put page tables in "real memory" and use real addresses. (I.e. have V=R).
  + What can't be paged out?
    + This is called "wired down".
    + The code that brings in pages.
    + Pages for critical parts of the operating system. (Handling a page fault takes time.)
    + Some interrupt and trap handlers, including code that starts up a process.
    + OS page tables
    + Sensitive real time routines
    + Pages currently undergoing I/O. (i.e. I/O buffers)
  + Note how effective paging is for protection - you can only reference parts of memory which appear in your page table(s). The only parts that appear are those that you have access to.
+ Paging and segmentation combined
  + Diagram of segment table / page table mapping. In segment table entry, put protection bits (read, write, execute), valid bit.
  + Each segment broken into one or more pages.
  + Segments correspond to logical units: code, data, stack. Segments vary in size and are often large. Protection can be associated with segments.
  + Pages are for the use of the OS; they are fixed-size to make it easy to manage memory.
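As a rough sketch of the combined mapping, a virtual address can be split into segment number, page number, and byte within page, then mapped through a per-segment page table. The field widths (1K pages, 64 pages per segment), the dictionary-based segment table, and the exception names are assumptions chosen for the example; protection bits are omitted to keep it short.

```python
PAGE_BITS = 10       # assumed 1K-byte pages
PAGE_NUM_BITS = 6    # assumed at most 64 pages per segment


class SegmentFault(Exception):
    """No page table exists for this segment."""


class PageFault(Exception):
    """Page is not valid (not in memory, or past the end of the segment)."""


def split(vaddr):
    """Break a virtual address into (segment, page, offset) bit fields."""
    offset = vaddr & ((1 << PAGE_BITS) - 1)
    page = (vaddr >> PAGE_BITS) & ((1 << PAGE_NUM_BITS) - 1)
    seg = vaddr >> (PAGE_BITS + PAGE_NUM_BITS)
    return seg, page, offset


def translate(segment_table, vaddr):
    """Segment table -> page table -> page frame, then substitute the bits."""
    seg, page, offset = split(vaddr)
    page_table = segment_table.get(seg)
    if page_table is None:
        raise SegmentFault(seg)
    # Each page table holds frame numbers; None stands for valid bit off.
    frame = page_table[page] if page < len(page_table) else None
    if frame is None:
        raise PageFault((seg, page))
    # No addition needed: the frame number replaces the high-order bits.
    return (frame << PAGE_BITS) | offset


# Hypothetical process: segment 0 has three pages (one not loaded),
# segment 1 has a single page.
segment_table = {0: [5, None, 9], 1: [2]}
real_a = translate(segment_table, (0 << 16) | (2 << 10) | 0x123)
real_b = translate(segment_table, (1 << 16) | (0 << 10) | 5)
```

Dropping the segment field and keeping a single page table gives the plain paging translation: a table lookup followed by bit substitution.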
  + Going from paging to P+S is like going from single segment to multiple segments, except at a higher level. Instead of having a single page table, have many page tables with a base and bound for each. Call the stuff associated with each page table a segment.
  + Advantages:
    + Provides 2 level mapping (as did page directory and page table). Makes page table size manageable.
    + Provides both physical unit of management (page) and logical unit of management (segment).
    + Effectively produces two dimensional addressing [segment, address within segment].
    + Can grow and shrink segments individually, and without interfering with other segments. Just add pages (which can be anywhere in memory.)
    + Segmentation with no compaction or fragmentation problem.
    + Bounds checks on segments handled by having page not be valid. (quantized to page size).
    + No page table for segment which doesn't exist.
    + Can share segment and/or page.
    + Protection at level of page and/or segment.
  + Disadvantages
    + More complicated than either segmentation or paging.
    + Overhead of 2 level mapping (time and hardware).
    + Overhead of both schemes.
    + Usual internal fragmentation problem, but if page size is small compared to most segments, then internal fragmentation is not too bad.
  + Paging vs. Segmentation
    + Page is fixed size,
      + physical unit of information,
      + used only for memory management;
      + not visible to programmer.
    + Segment is logical unit (usually)
      + visible to user,
      + of arbitrary size.
    + Note that user may see (be aware of) segmentation. User should not be aware of paging.
  + Can share at two levels: single page, or single segment (whole page table).
    + Diagram of shared pages or shared segments (shared page table).
    + Does shared region have to be at same address in each process?
      + No - as long as it can be found.
    + Can shared region contain any absolute addresses (i.e. virtual addr)?
      + Usually not - very dangerous - addresses may not be the same in each process.
      + But can contain relative addresses - e.g. offsets relative to certain registers or to the segment base. Such registers can be loaded by each process differently.
      + If entire segment is shared, and addresses are relative to start of segment, we are okay.
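To illustrate why relative addresses are safe in a shared region, here is a small sketch in which a linked list stored in a shared segment uses offsets from the segment start instead of absolute virtual addresses. The record layout, the base addresses `0x4000` and `0x9000`, and the helper names are all hypothetical.

```python
# Shared segment contents: offset -> (value, next_offset).
# Offsets are relative to the start of the segment; -1 ends the list.
shared_segment = {0: ("a", 8), 8: ("b", 4), 4: ("c", -1)}


def make_reader(segment_base):
    """Model one process's mapping: virtual address -> shared record."""
    return lambda vaddr: shared_segment[vaddr - segment_base]


def walk(segment_base, read):
    """Follow the list from offset 0 using this process's segment base."""
    values, offset = [], 0
    while offset != -1:
        # Each process forms its own virtual address from its own base.
        value, offset = read(segment_base + offset)
        values.append(value)
    return values


# Two processes map the same segment at different virtual addresses.
in_p1 = walk(0x4000, make_reader(0x4000))
in_p2 = walk(0x9000, make_reader(0x9000))
```

Both walks visit the same records because nothing stored in the segment depends on where a given process mapped it. Had the `next` fields held absolute virtual addresses, they would be correct in at most one of the two processes.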