Christopher George
Lecture February 16th, 2005

TOPIC: Dynamic Memory Relocation, Segmentation, and Paging

With static memory relocation, a program is relocated to a spot in memory and
it stays there; no further relocation is allowed.

How would we go about relocating programs dynamically, at any time?
A: Use hardware to re-map addresses on every memory reference.

Idea: Every address is input into a relocation box, and the relocation box
maps that virtual address to a real address in memory.

 Input:              --------------------    Output:
                     |                  |
  Virtual Address    |                  |    Real Address
                     |    Relocation    |
 ----------------->  |       Box        | ---------------->
                     |                  |
                     |                  |
                     --------------------

NOTE: Virtual addresses are also called program or logical addresses. Real
      addresses may also be referred to as physical addresses. The relocation
      box can also be referred to as a translation or mapping box.

TERMS:
Logical/Virtual address space is the set of addresses a program can generate
(usually limited by the addressing capabilities of the machine; for example,
32-bit byte addressing yields a 4 GB virtual address space).

The Real/Physical address space is the set of real locations in physical
memory. The machine's physical address space includes all available physical
memory, while a program's contains only the locations it can get to.

Every process can have its own virtual address space. The addresses one
process generates are distinct from the addresses other processes generate
(meaning that the memory they reference isn't shared), yet they may map to
the same physical addresses, because the machine has only one physical
address space.

******************************************************************************

BASE & BOUNDS RELOCATION (simplest method)

Need: A base register, which stores the first location in memory this
      program's references will map to.
      A bounds register, which stores the last valid address the process can
      generate.

On every memory reference, the following mapping occurs.
Mapping: Real Address = Base Address + Virtual Address
         But only if 0 <= virtual address <= Bound.

NOTE: The condition test and the addition are done in parallel.

Question: Can processes have overlapping address spaces here?
Answer: Yes, it can be set up that way.

NOTE: With this method, the virtual address space corresponds to one
      contiguous region of the physical address space.

Advantages
   + Each process seems to have its own address space of size bounds + 1; to
     its knowledge, it doesn't have to share with anyone else.
   + Processes are protected from one another, since the condition test
     makes sure that no process can make a successful reference to memory
     outside of its given region.
   + There is no absolute-address relocation during a load; it all happens
     while the program runs. Each reference simply costs one addition.
   + Task switching is cheap, since you only have to change the registers.
      + However, there's a higher overhead if the program first has to be
        loaded from disk into memory.
   + Compaction is possible.

Disadvantages
   + Limited by size of main memory, since we cannot allocate a contiguous
     block of virtual memory without a corresponding contiguous block of
     physical memory.
   + External fragmentation is caused by the allocation of variable size
     blocks of memory.
   + The variable size allocation has overhead itself.
   + Sharing becomes quite difficult, because you have to make the base &
     bounds regions of two processes overlap at the shared area.
   + The process gets only one block of memory.
   + Need special hardware to do the relocation. (Software can't do what
     hardware doesn't support.)
   + Each addition takes time and reduces performance.

This method depends on the OS being able to change the base and bounds
registers, or else no process could ever be loaded (a new address space
couldn't be assigned).
   + During a load, you have to set the base and bounds registers.
   + During a task switch, the base, bounds, and PC registers need to be
     swapped at the same time, or else the process could corrupt itself.
   + The OS must have base and bounds turned off while it's running, because
     it needs to address all of memory, and it could have trouble setting
     its own base and bounds registers.
   + Turning base & bounds on and off must then be controlled by a status
     bit, placed in the process or system status word (a register).

Users must not be able to change the B&B registers, or else protection goes
out the window and a process could crash itself and/or other programs,
including the OS.

Problem: How does an OS regain control of the CPU from a user process?
   + It is entered during a trap or system interrupt.
   + Once it is entered, B&B must be turned off (bit in PSW is set).
   + Trap handler sets new control register values.

This method is cheap: it requires only 2 additional registers plus an adder
and a comparator, and the operations are quick.

3 Types of Systems using B&B registers:
   + Uniprogramming - one user region; a single user program is loaded and
     run, and the OS is sure to be protected.
   + Multiprogramming with Fixed Partitions (MFT) - memory gets partitioned
     into fixed regions of fixed sizes, and programs are allocated memory
     based on size needs.
      + Inflexible; done because it was cheap and easy.
   + Multiprogramming with Variable Partitions (MVT) - partitions are
     variable sized.

All of these methods can be done without base & bounds, but with it we get
protection and speed.

We now have cheap, easy task switches. There are no large memory reloads,
only small PCB changes. It is also now possible to run a process that isn't
in memory: you just find an empty space in memory (kicking a process out of
memory and copying it to disk if necessary) and copy the new process in. If
only one process fits in memory, you must wait for the swap, but if more than
one can fit in memory, then a swap can happen while another program is
running.
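
EXAMPLE (sketch, not from the lecture): The base & bounds mapping can be
written out in a few lines of Python. The names translate, BoundsViolation,
proc_a, and proc_b are made up for illustration. The sketch assumes the
bounds register holds the last valid virtual address, so the check is
inclusive. Real hardware does the compare and the add in parallel; here they
simply run one after the other.

```python
class BoundsViolation(Exception):
    """Raised when a virtual address falls outside the process's region."""


def translate(virtual_address, base, bound):
    """Map a virtual address to a real address using base & bounds.

    Assumption: `bound` is the last valid virtual address (inclusive).
    """
    if virtual_address < 0 or virtual_address > bound:
        raise BoundsViolation(f"address {virtual_address} outside 0..{bound}")
    return base + virtual_address


# Two hypothetical processes. A task switch only has to swap this pair of
# register values; nothing in memory is touched.
proc_a = {"base": 4096, "bound": 1023}
proc_b = {"base": 8192, "bound": 2047}

# The same virtual address lands in a different real location per process.
print(translate(100, **proc_a))  # 4196
print(translate(100, **proc_b))  # 8292
```

Virtual address 1024 would be rejected for proc_a, since its region only
covers 0 through 1023.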
******************************************************************************

SEGMENTATION

Big Idea:
Divide the virtual address space into segments (not the same segments as in
linkers and loaders), and have a separate base and bound for each segment
along with some protection bits (read, write, and execute), a valid bit (to
determine whether or not a segment can be accessed yet), and a dirty bit
(used for checking to see if the segment has been modified). This allows us
to have separately defined segments that we can place anywhere in physical
memory, instead of one contiguous block of memory, and we can treat each
segment differently depending on its protection bits.

Now, an address = <segment #, offset>.
   + The actual memory reference can indicate the segment and offset in a
     number of different ways.
      + It's possible to have the top X bits equal the segment #, and the
        low bits the offset. This is common, and the best way.
      + The segment can be implied by the instruction, depending on what
        data is being manipulated (code for an inst. fetch, data for a lw or
        sw, or stack for a push or pop) or which registers are being used.
      + The instruction can specify (directly or not) the base of the
        segment through a register.
   + A subprogram (like a procedure, a function, possibly a loop) can have
     its own special segment. Segments are usually associated with logical
     partitions of your address space (code, data, and stack, or whatever
     the software structure is like).
   + You will need a segment table or segment registers to hold base and
     bounds for each segment.
   + The memory map procedure is simply a table lookup, an add, and a
     compare.

STBR ----,           Segment Table              Virtual Address
         '------> -------------------     ----------------------------
                 |                   |    | Segment #  | Byte Offset |
                 |-------------------|    ----------------------------
                 |                   |          |
                 |-------------------|          |
                 |                   |<---------'
                 |-------------------|
                 .                   .
                 .                   .
                 |-------------------|
                 |                   |
                 ---------------------

Segment Table Entry:
---------------------------------------------------------
| Base | Bound | Protection | Valid | Dirty | Reference |
---------------------------------------------------------

NOTE: STBR stands for Segment Table Base Register.

Address translation for segments:
   + Your segment table maps a segment number to a [segment base address,
     segment length (or the limit), and the protection, valid, dirty, and
     reference bits].
      + These are each stored in a Segment Table Entry (STE).
   + Hardware is needed to do the translation from segment and offset to
     real address. The real address is equal to
     segment_location(segment #) + offset.
      + This is only valid if the offset is less than the segment size
        (this test happens in parallel with the addition).
      + The valid and protection bits must also allow the access.
   + Even more hardware is needed to speed up the process.
   + Must have a segment table base register to find the base of the segment
     table. However, if the number of segments is small, one could simply
     hold the address of each segment in a set-aside register. You could
     also multiplex registers among a large number of segments (x86 does
     this).

Advantages of segmentation:
   + Each process gets its own virtual address space, since it sees all of
     memory as its own.
   + There's protection between address spaces, based on limits that keep
     each process out of the memory of others, since the segment table
     controls what addresses a process can reach.
   + Protection among segments, derived from the individual read, write, and
     execute permission bits on each segment.
   + Virtual space can be larger than physical memory, because the OS can
     segment the address space accordingly.
   + Unused segments don't need to be loaded, only necessary ones. They can
     be loaded as deemed necessary by the OS.
      + An attempt to reference
        a missing segment is a "segment fault".
   + Sharing is possible by sharing segments (identical STEs).
   + Segments can be placed anywhere in memory where they will fit.
   + Memory compaction is easy: just move segments around to pack the blocks
     together.
   + Segment sizes can change independently of the sizes of the other
     segments.

Disadvantages of segmentation:
   + Each segment needs to be allocated a contiguous region of memory.
   + Segment size must be < memory_size.
   + External fragmentation, because we're still allocating variable size
     segments in a set free storage area.
   + Still have time and hardware overhead to implement segmentation.
   + This method is more complicated to implement than base & bounds.
   + Need to set aside space to hold the segment table in memory. Segment
     tables are usually 1-1 with processes, because you need a segment table
     to define a process's address space.
      + If all processes shared the segment table:
         + Protection is gone.
         + Back to square one: have to figure out how to allocate the
           shared virtual memory.
      + When we switch tasks, make sure we switch segment table base
        registers, so the process will operate in its own virtual address
        space.

Processes Vs. Threads
   + A single process is a single executing piece of code with its own
     address space.
   + A thread is also a single executing piece of code, but several threads
     may coexist within the same address space.
   + Threads earn the term "lightweight," since the overhead of creating a
     thread is much smaller than for a process, and the cost of inter-thread
     communication is low, whereas communicating between processes carries a
     high expense associated with pipes, files, and other methods of
     communication.
   + Threads in the same address space may share code and data, but they
     usually have their own stacks. Semaphores can be used without involving
     the operating system.
      + Since processes don't share memory, the OS must get involved to
        allow proper semaphore use.
   + Threads usually have constructs like join, fork, signal, wait, and
     broadcast.

Managing segments:
   + You keep a copy of the segment table in the PCB, or associate the table
     with the PCB if the PCB is too small.
   + The segment table gets segments entered into it during process
     creation.
   + During process memory assignment, each segment is placed into memory
     and the base and bounds fields are filled out in the segment table.
   + Will need a memory map (or core map) that tells what segment(s) use(s)
     a piece of memory.
   + During a task switch, the segment tables are saved and reloaded along
     with the switching of the currently running PCBs.
   + When the owning process dies, the segments are freed and put back in
     the free pool.
   + If no space is available to allocate a segment:
      + Compact memory to consolidate smaller free spaces.
      + Or swap a segment to disk so a new one can be allocated.
   + To enlarge a segment:
      + If space at the end of the segment is free, increase the bounds and
        use that space.
      + Or, you can swap out the segment after this one to disk to free its
        space and use it.
      + Or, move the segment to a place in memory where there is as much
        space as is desired.
      + Or, if there is space before the segment, move the segment down by
        copying it down, then reset the base and bounds registers
        accordingly.

Loading segments on demand (allows more segments than physical memory can
hold at one time):
   + Segment fault - attempt to reach a segment that isn't present.
      + Trap to OS.
      + OS finds space for the segment, swapping out another one if needed.
      + Put the segment back into memory.
      + Set the valid bit to 1, and update STEs as needed.
      + Ready the process.

The main problems with segmentation are the variable size allocation, which
causes external fragmentation, and the limit that a segment can only be as
large as physical memory will allow. These problems are addressed by the
next memory relocation method: paging.
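
EXAMPLE (sketch, not from the lecture): The segment translation steps
(table lookup, compare, add, plus the valid and protection checks) can be
sketched in Python. Everything named here is an assumption made for
illustration: a 16-bit virtual address whose top 4 bits are the segment
number, the STE class, the three fault classes, and the sample table. The
bound field is taken to be the segment length, so the offset must be
strictly less than it. The hardware does the checks and the add in parallel;
the sketch does them in sequence.

```python
from dataclasses import dataclass

OFFSET_BITS = 12                      # hypothetical: low 12 bits = offset
OFFSET_MASK = (1 << OFFSET_BITS) - 1  # top 4 bits = segment number


class SegmentFault(Exception):
    """Segment is not in memory (valid bit is 0); the OS must load it."""


class ProtectionFault(Exception):
    """Access type is not allowed by the segment's protection bits."""


class BoundsFault(Exception):
    """Offset is not less than the segment's length."""


@dataclass
class STE:
    base: int                 # real address where the segment starts
    bound: int                # segment length in bytes
    prot: str                 # allowed accesses, a subset of "rwx"
    valid: bool = True
    dirty: bool = False
    referenced: bool = False


def translate(segment_table, virtual_address, access):
    """Map a virtual address to a real one; `access` is "r", "w", or "x"."""
    seg = virtual_address >> OFFSET_BITS
    offset = virtual_address & OFFSET_MASK
    ste = segment_table[seg]          # assumes the table has an entry for seg
    if not ste.valid:
        raise SegmentFault(seg)
    if access not in ste.prot:
        raise ProtectionFault(seg)
    if offset >= ste.bound:
        raise BoundsFault(seg)
    ste.referenced = True
    if access == "w":
        ste.dirty = True
    return ste.base + offset


table = {
    0: STE(base=0x4000, bound=0x800, prot="rx"),            # code
    1: STE(base=0x9000, bound=0x200, prot="rw"),            # data
    2: STE(base=0, bound=0x1000, prot="rw", valid=False),   # not loaded yet
}

print(hex(translate(table, 0x0010, "x")))  # 0x4010
print(hex(translate(table, 0x1004, "w")))  # 0x9004
```

A reference to segment 2 raises SegmentFault, which is the point where an OS
would find space, bring the segment in, set the valid bit, and ready the
process again.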
******************************************************************************

PAGING

Goal: Reduce memory fragmentation and make allocation and swapping easy.

All chunks of memory are made the same size (usually between 512 bytes and
16K bytes). Real memory is divided into page frames, which are the same size
as the pages.

A virtual address is now N bits, where the first K bits give you the page
and the last N-K bits signify the byte offset within the page.

For each process, a page table is made that gives the real base address of
each page. Each page has a page table entry (PTE) that contains the real
address of the page, plus the protection, valid, dirty, and reference bits.
Also need a page table base register, which points to the start of the page
table.

PTBR ----,            Page Table                Virtual Address
         '------> -------------------     ----------------------------
                 |                   |    |   Page #   | Byte Offset |
                 |-------------------|    ----------------------------
                 |                   |          |
                 |-------------------|          |
                 |                   |<---------'
                 |-------------------|
                 .                   .
                 .                   .         Translated Address
                 |-------------------|         ------------------------------
                 |    ---------------|-------> | Real Address | Byte Offset |
                 |-------------------|         ------------------------------
                 |                   |
                 ---------------------

Page Table Entry:
-------------------------------------------------------------------------
| Real Address of base of page | Protection | Valid | Dirty | Reference |
-------------------------------------------------------------------------

NOTE: PTBR stands for Page Table Base Register.

Translation:
   + The page number is taken directly from the virtual address. No
     comparing or addition is needed, just one table lookup and a bit
     substitution.
   + No limit field is necessary (overflow on one page goes to the next page
     in the table).
   + Need a page map (or core map) that says who points to each page frame
     in memory. It will point back to all page tables that point to its page
     frame.
   + We can handle page loading on demand through page faults (requesting
     data from a missing page). The process is similar to that of a segment
     fault:
      + Trap to OS.
      + OS finds an empty page frame to put the page into.
      + Put the page from disk into memory.
      + Update the page table.
      + Ready the process.

Pages are a physical partition of a process's address space and of real
memory: with 4K pages, each 4K of virtual address space corresponds to 4K of
real memory. Unlike segments, there is no relation between page boundaries
and what is stored inside the pages.

Advantages:
   + Easy to allocate: all you need is a free list of page frames and you
     can grab the first one.
   + External fragmentation is eliminated, because all page frames are of
     equal size in memory.
   + Can combine with segmentation to create non-contiguous segment
     allocation.
   + A process can now have more address space than physical memory has.
   + Pages can be loaded on demand.

Disadvantages:
   + Internal fragmentation is introduced, since pages don't always fill up.
     This gets worse as the page size gets bigger.
   + Need hardware and time to translate addresses (again, software can only
     do what hardware allows, and hardware can't do things instantly).
   + Page faults cause significant overhead (disk accesses are expensive).
   + Need a page replacement algorithm: when a page must be brought in from
     disk and no frame is free, something has to decide which page to swap
     out.
   + The page table needs to be stored somewhere. Small pages lead to bigger
     tables, because the number of pages in the table grows quickly.
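
EXAMPLE (sketch, not from the lecture): Page translation, the page fault
steps, and the page table size trade-off can all be sketched in Python. The
names used here (translate, handle_page_fault, access,
page_table_entries, frame_base) are made up for illustration. The sketch
assumes 4K pages, a page table stored as a dictionary, and a free frame that
is always available, so no replacement algorithm is shown and the disk read
is only a comment.

```python
PAGE_SIZE = 4096   # 4K pages
OFFSET_BITS = 12   # log2(PAGE_SIZE)


class PageFault(Exception):
    """Raised when the referenced page is not in memory (valid bit is 0)."""


def translate(page_table, virtual_address):
    """One table lookup plus a bit substitution; no add, no compare."""
    page = virtual_address >> OFFSET_BITS
    offset = virtual_address & (PAGE_SIZE - 1)
    pte = page_table[page]
    if not pte["valid"]:
        raise PageFault(page)
    # frame_base is page aligned, so OR-ing in the offset just replaces
    # the low bits.
    return pte["frame_base"] | offset


def handle_page_fault(page_table, page, free_frames):
    """Find an empty frame, load the page, update the table."""
    frame_base = free_frames.pop(0)   # grab the first free page frame
    # (a real OS would copy the page from disk into the frame here)
    page_table[page] = {"frame_base": frame_base, "valid": True}


def access(page_table, virtual_address, free_frames):
    """Translate, servicing a page fault and retrying if needed."""
    try:
        return translate(page_table, virtual_address)
    except PageFault as fault:
        handle_page_fault(page_table, fault.args[0], free_frames)
        return translate(page_table, virtual_address)


def page_table_entries(address_bits, page_size):
    """Number of PTEs needed to cover the whole virtual address space."""
    return 2 ** address_bits // page_size


page_table = {
    0: {"frame_base": 0x7000, "valid": True},
    1: {"frame_base": None, "valid": False},   # still on disk
}
free_frames = [0x3000, 0x5000]

print(hex(access(page_table, 0x0123, free_frames)))  # 0x7123
print(hex(access(page_table, 0x1ABC, free_frames)))  # 0x3abc
print(page_table_entries(32, 4096))                  # 1048576
print(page_table_entries(32, 512))                   # 8388608
```

The last two lines put numbers on the final disadvantage: for a 32-bit
address space, shrinking the page size from 4K to 512 bytes grows the table
from about one million entries to about eight million.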