CS 162 Lecture Notes

Prof. Alan Jay Smith


     Topic: Sharing Main Memory -- Segmentation and Paging


     +   How do we allocate memory to processes?


     +   1. Simple uniprogramming with a single segment per process:

         +   One program in memory at a time.  (Can actually multipro-

             gram by swapping programs.)

             +   Highest memory holds OS.

             +   Process is allocated memory starting at 0 (or J),  up

                 to (or from) the OS area.

             +   Process always loaded at 0.

             +   Examples:  early batch monitors where  only  one  job

                 ran  at  a time and all it could do was wreck the OS,

                 which would be rebooted  by  an  operator.   Many  of

                 today's  personal computers also operate in a similar

                 fashion.

         +   Advantages

             +   Low overhead

             +   Simple

             +   No need to do relocation.  Always loaded at zero.

         +   Disadvantages

             +   No protection - process can overwrite OS

                 +   which means it can get complete control of system

             +   Multiprogramming requires swapping entire process  in

                 and out

                 +   Overhead for swapping


                 +   Idle time while swapping

             +   Process limited to size of memory

                 +   Example: CTSS ("Compatible" Time-Sharing System),

                     which swapped users in and out completely.

             +   No good way to share - only one process at a time (can't

                 even  overlap  CPU and I/O, since only one process is in

                 memory.)


     +   2. Relocation - load program anywhere in memory.

         +   Idea is to use loader or linker to load program at an ar-

             bitrary memory address.

         +   Note that program can't be moved (relocated) once loaded.

             (WHY??)

         +   This scheme (#2) is essentially the same as #1,  but  the

             ability to load at any address will be used in #3 below.


     +   3. Simple multiprogramming with static  software  relocation,

         no protection, one segment per process:

         +   Highest or lowest memory holds OS.

         +   Processes allocated memory starting at 0 (or  N),  up  to

             the OS area.

         +   When a process is initially loaded, link it  so  that  it

             can run in its allocated memory area

         +   Can have several programs in memory at once, each  loaded

             at a different (non overlapping) address.

         +   Advantages:

             +   Allows multiprogramming without swapping processes in

                 and out.

             +   Makes better use of memory

             +   Higher CPU use due to  more  efficient  multiprogram-

                 ming.

         +   Disadvantages

             +   No protection - jobs can read or write others.

             +   External fragmentation

             +   Overhead for variable size memory allocation.

             +   Still limited to size of physical memory.

             +   Hard to increase amount of memory allocation.

             +   Programs are statically loaded - they are tied to fixed

                 locations in memory.  Can't be moved or expanded.  If

                 swapped out, must be swapped back to same location.
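
     A minimal sketch of static software relocation, assuming a hypothetical
     object format (a list of words plus a relocation table naming the words
     that hold addresses).  The names load_static, image and reloc are
     illustrative, not from the lecture.

```python
# Hypothetical object format (assumed for illustration): a list of words,
# plus a relocation table naming the words that hold addresses.

def load_static(image, reloc_indices, load_base):
    """Return a copy of the image with each address word shifted by load_base."""
    loaded = list(image)
    for i in reloc_indices:
        loaded[i] += load_base      # patched once, at load time
    return loaded


# Toy program linked as if it started at address 0.
# Word 1 holds the address of word 3; the other words are not addresses.
image = [10, 3, 20, 99]
reloc = [1]

loaded = load_static(image, reloc, 1000)
print(loaded)                        # [10, 1003, 20, 99]
```

     After loading, the patched word is just a number.  Any addresses the
     program later computes into registers or data are not listed in a
     relocation table, which is one reason a statically relocated program
     is hard to move.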


     +   4. Dynamic memory relocation:  instead of  changing  the  ad-

         dresses  of  a program before it's loaded, change the address

         dynamically during every reference.

         +   Figure of a processor and a memory box, with a memory re-

             location box in between.

         +   There are many types of relocation - to be discussed.

         +   Under dynamic relocation, each program-generated  address

             (called  a  logical  or virtual address) is translated in

             hardware to a physical, or real address.  This happens as

             part of each memory reference.

             +   Virtual (logical) Address is what  the  program  gen-

                 erates.

                 +   Virtual address space is the set of (legal) virtual

                     addresses the program can generate.

             +   Physical (real) addresses - set of addresses in  phy-

                 sical memory.

                 +   Physical address space of program - set of physi-

                     cal addresses it can get to.

                 +   Physical address space of machine -  set  of  ad-

                     dresses in physical memory.

         +   Dynamic relocation leads to two views of  memory,  called

             address  spaces.    We have the virtual address space and

             the real address space.  Each process has its own virtual

             address space.  With static relocation we force the views

             to coincide. In some systems, there are several levels of

             mapping.


     +   Several types of dynamic relocation.


     +   Base & bounds relocation:

         +   Two  hardware  registers:   base  register  for  process,

             bounds register that indicates the last valid address the

             process may generate.

         +   On each memory reference, the virtual address is compared

             to the bounds register.

             +   Real address = base + virtual address

                 +   Valid only IF 0 <= virtual_address <= bounds

             +   In parallel, the real address is generated by  adding

                 the virtual address to the base register.

                 +   This is a form of translation.

                 +   Discuss why comparison is done in parallel.

         +   Advantages:

             +   Each process appears to  have  a  completely  private

                 memory of size equal to the bounds register plus 1.

             +   Processes are protected from each other.

             +   No address relocation is necessary when a process  is

                 loaded.

             +   Task switching  is  very  cheap,  when  done  between

                 processes in memory- just reload processor registers.

                 +   Higher overhead to load process from disk.

             +   Compaction is possible.

         +   Disadvantages:

             +   Still limited to size of main memory.

             +   External fragmentation (between processes)

             +   Overhead  for  allocating  variable  size  spaces  in

                 memory.

             +   Sharing difficult - only possible if bases  &  bounds

                 overlap.

             +   Only one "segment" - i.e. one region of memory.

             +   New, special hardware needed for relocation.

             +   Time to do relocation (it isn't free).


         +   OS must be able to change value of  relocation  registers

             (why?).

             +   OS loads new process and sets base and bounds  regis-

                 ters.

             +   OS schedules process, and sets base and bounds

                 registers.

             +   When tasks are switched, must be able to  swap  base,

                 bounds and PC registers simultaneously.

             +   These imply that OS must run with base and bounds re-

                 location  turned off - otherwise, would affect itself

                 when running.  (Or would need its own set of base and

                 bounds registers.)

             +   Use of base and bounds controlled by status bit, usu-

                 ally in PSW or SSW, or similar control register.

         +   Users must not be able  to  change  values  of  base  and

             bounds registers

             +   Otherwise, no protection between  users.   Can  trash

                 others or OS.

         +   Problem: how does OS regain control once it has given  it

             up?

             +   OS is entered on trap (including SVC) or interrupt.

             +   When OS is entered, use of base and  bounds  must  be

                 disabled.  (I.e.  bit in PSW is reset.)

             +   Typically, trap handler loads  new  control  register

                 values.

         +   Base & bounds is cheap -- only 2 registers -- and fast --

             the add and compare can be done in parallel.

         +   Examples:  CRAY-1. IBM 7040/7090.
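
     A small sketch of base & bounds translation and of a task switch that
     reloads the two registers.  The PCB layout and the names BoundsFault,
     translate_base_bounds and switch_to are assumptions for illustration.

```python
class BoundsFault(Exception):
    """Raised when a virtual address is outside the process's region."""


def translate_base_bounds(virtual_address, base, bounds):
    # bounds is taken to be the last valid virtual address, as the notes
    # define it.  Hardware does the compare and the add in parallel; this
    # sketch does them one after the other.
    if not 0 <= virtual_address <= bounds:
        raise BoundsFault(virtual_address)
    return base + virtual_address


# Hypothetical PCBs: each process sees addresses 0..bounds.
pcb_a = {"base": 4000, "bounds": 999}
pcb_b = {"base": 9000, "bounds": 499}


def switch_to(pcb):
    """Task switch: reload the relocation registers from the PCB."""
    return pcb["base"], pcb["bounds"]


base, bounds = switch_to(pcb_a)
print(translate_base_bounds(100, base, bounds))   # 4100
base, bounds = switch_to(pcb_b)
print(translate_base_bounds(100, base, bounds))   # 9100
```

     The same virtual address 100 lands in different physical locations
     depending on which process's registers are loaded.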


     +   Can consider three types of systems using base and bounds re-

         gisters:

         +   Uniprogramming - single user region.  Bring  a  user  in,

             and run him.


         +   Multiprogramming with Fixed Partitions - (OS/MFT) -  par-

             tition  memory  into  fixed  regions  (may  be  different

             sizes).  User goes into region of given size.

             +   Not very flexible.

             +   IBM OS circa 1965-68

         +   Multiprogramming with Variable Partitions (OS/MVT) - par-

             titions are dynamically variable.

             +   IBM OS circa 1967-72.


     +   Note that we can do any of the three  above  schemes  without

         base and bounds registers - just load programs into region at

         appropriate base address.
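
     A sketch of allocation with variable partitions.  The notes do not name
     a placement policy; first fit is assumed here, and the hole list format
     is hypothetical.

```python
def first_fit(holes, size):
    """holes: list of (start, length) free regions, sorted by start.

    Returns (start, new_holes), or (None, holes) if no single hole is
    large enough.
    """
    for i, (start, length) in enumerate(holes):
        if length >= size:
            remaining = holes[:i] + holes[i + 1:]
            if length > size:
                # keep the unused tail of the hole on the free list
                remaining.insert(i, (start + size, length - size))
            return start, remaining
    return None, holes


holes = [(0, 300), (500, 200), (900, 100)]    # 600 words free in total
start, holes = first_fit(holes, 250)           # placed in the first hole
print(start, holes)      # 0 [(250, 50), (500, 200), (900, 100)]

blocked, _ = first_fit(holes, 300)             # 350 words free, none contiguous
print(blocked)           # None
```

     The second request fails even though 350 words are free in total,
     which is the external fragmentation listed among the disadvantages.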


     +   Task Switching

         +   We can now switch between processes very cheaply -  don't

             have to reload memory, just reload the base and bounds

             registers from the process control block (which now holds

             their values).

         +   We can also run processes which are not in memory - how?

             +   Find empty area of memory in which to place process -

                 how?

                 +   Remove one or  more  processes  from  memory,  if

                     necessary, in order to find space.

                 +   (I.e. copy the  removed  processes  to  space  on

                     disk.)

             +   Copy new process (from disk) into memory.

         +   If only one process fits in memory, have to wait for swap

             to take place.

             +   If several processes fit  in  memory,  can  swap  one

                 while executing the other.


     +   5. Multiple segments - Segmentation.

         +   Divide virtual address space into several "segments".

             +   This is not the same as the "segments" of linkers and

                 loaders.

         +   Use a separate base and bound for each segment, and  also

             add  protection  bits  (read,  write, execute), and valid

             bit.  (Also will want dirty bit.)

     +   Each address is now <segment #, byte in segment>

         +   Each memory reference indicates a segment and  offset  in

             one or more of three ways:

             +   Top bits of address  select  segment,  low  bits  the

                 offset.  This is the most common, and the best.

             +   Or, segment is selected implicitly by the instruction

                 (e.g.  code  vs. data, stack vs. data, which base re-

                 gister is used, or 8086 prefixes).

             +   Instruction specifies directly or indirectly  a  base

                 register for the segment.

             +   Subprograms  (procedures,  functions,  etc.)  can  be

                 separate segments.


     +   Segments typically are associated with logical partitions  of

         your process address space - e.g. code, data, stack.  Or, each

         module or procedure can be a separate segment.


         +   Need either segment table or segment  registers  to  hold

             the base and bounds for each segment.

             +   Draw picture of segment table, with segment table en-

                 tries.

         +   Memory mapping procedure consists of table lookup + add +

             compare.

         +   Example:  PDP-10 with high and low segments  selected  by

             high-order address bit.


     +   Address translation for segmentation

         +   Have segment table - maps segment number to [Segment base

             address,  segment  length (limit), protection bits, valid

             bit, reference bit, dirty bit]

             +   This info is in Segment Table Entry (STE)

             +   Diagram of segment table.

             +   Segment descriptor

         +   Need some hardware to automatically map virtual  (segment

             number, word number) to real address.

     +   Real address = segment_table[segment #].base + word number.

         +   Invalid if word_number >= limit (the segment length).  (Note

             that we test the word number against the limit directly,

             without adding the base to both.)

             +   Also valid bit must be on, and permission  bits  must

                 permit access.

         +   Need more hardware to make it go fast (discuss later)

         +   Have Segment Table Base Register (STBR) point to base  of

             segment table (for hardware to use)

     +   Alternate approach - if there are a small number of segments,

         can have segment registers - one register per segment.

     +   Can also multiplex a small number of segment registers  among

         a large number of segments (as with X86 architecture)
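
     A sketch of segment translation as table lookup + compare + add, with
     the segment number taken from the top address bits.  The 12-bit offset
     split, the STE field names and the exception names are assumptions.

```python
from dataclasses import dataclass


class SegmentFault(Exception):
    """Segment is not present (valid bit off)."""


class ProtectionFault(Exception):
    """Access not permitted, or offset outside the segment."""


@dataclass
class STE:
    base: int           # real address where the segment starts
    limit: int          # taken to be the segment length
    perms: str          # subset of "rwx"
    valid: bool = True


OFFSET_BITS = 12        # assumed split: top bits select segment


def translate_segment(segment_table, vaddr, access):
    seg = vaddr >> OFFSET_BITS
    offset = vaddr & ((1 << OFFSET_BITS) - 1)
    if seg >= len(segment_table):
        raise ProtectionFault(seg)
    ste = segment_table[seg]             # table lookup
    if not ste.valid:
        raise SegmentFault(seg)
    if access not in ste.perms:
        raise ProtectionFault(access)
    if offset >= ste.limit:              # compare offset alone to limit
        raise ProtectionFault(offset)
    return ste.base + offset             # add


table = [STE(0x8000, 0x400, "rx"), STE(0x2000, 0x100, "rw"),
         STE(0, 0x100, "rw", valid=False)]
print(hex(translate_segment(table, (1 << 12) | 0x10, "w")))   # 0x2010
```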


     +   Advantages

         +   Each process has own virtual address space

         +   Protection between address spaces

         +   Separate protection between segments (R/W/E)

         +   Virtual space can be larger than physical memory

         +   Unused segments don't need to be loaded.  Can load segments

             as needed.

             +   Attempt to reference missing segment  called  segment

                 fault.

                 +   Discuss segment faults later.

         +   Can share one or more segments

             +   sharing is tricky - we'll talk about this later

         +   Segments can be placed anywhere in memory that they fit.

         +   Memory compaction easy.

         +   Segment sizes can be changed independently.


     +   Disadvantages

         +   Each segment must be allocated contiguously.

         +   Segment size < memory size

         +   External fragmentation

         +   Overhead of allocating memory

         +   Need hardware for address translation

         +   Overhead (time/hardware) of doing address translation


         +   More complicated.

         +   Space for segment table.


     +   Note that segment tables are usually 1-1 with  processes.   A

         segment table defines a process's address space.

         +   What would happen if all  processes  shared  the  segment

             table?

             +   Protection is a problem

             +   Have same problem as before - now we have to allocate

                 shared virtual instead of shared physical memory.

         +   When we switch processes, we  reload  the  STBR  (segment

             table base register), which changes address space.


     +   Processes vs. Threads

         +   A process is a single flow of control associated  one  to

             one with an address space.

         +   A thread is a single  flow  of  control.   There  may  be

             several threads within an address space.

             +   Threads are considered lightweight, because the over-

                 head  of  creating a thread is usually much less than

                 that  to  create  a  process.   Cost  to  communicate

                 between  threads  in  same address space is very low.

                 Cost to communicate between different address  spaces

                 is high (e.g. pipe, file).

             +   Threads in one address space  share  code  and  data.

                 Threads do not usually share stack - usually each has

                 its own.  Can synchronize using P and V without

                 involving the operating system.

             +   Processes do not normally share, so P&V must  use  OS

                 as intermediary.

             +   To use threads, usually have  constructs  like  fork,

                 join, signal, wait, broadcast.
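
     A sketch of threads sharing data in one address space and synchronizing
     with P and V.  Python's threading.Semaphore stands in for P (acquire)
     and V (release); whether the operating system gets involved is an
     implementation detail this sketch does not model.

```python
import threading

counter = 0                        # shared by all threads in the address space
mutex = threading.Semaphore(1)     # binary semaphore


def worker(n):
    global counter
    for _ in range(n):             # loop state lives on this thread's own stack
        mutex.acquire()            # P
        counter += 1               # critical section on shared data
        mutex.release()            # V


def run(num_threads=4, n=1000):
    global counter
    counter = 0
    threads = [threading.Thread(target=worker, args=(n,))
               for _ in range(num_threads)]
    for t in threads:
        t.start()                  # fork
    for t in threads:
        t.join()                   # join
    return counter


print(run(4, 1000))                # 4000
```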


     +   Managing segments:

         +   Keep copy of segment table in process control  block  (or

             if block is too small, associated with it).

         +   When  creating  process,  define  segments   in   segment

             table/PCB.

         +   When process is assigned memory, figure  out  where  each

             segment goes, and put base and bounds into segment table.

         +   Need memory map, which maps memory to segments.  (Segment

             table maps segments to memory.)  Also called core map

         +   When switching contexts, save segment table or pointer to

             it  in  old  process's  PCB, reload it from new process's

             PCB.

         +   When process dies, return segments to free pool.

         +   When there's no space to allocate a new segment:

             +   Compact memory (move all segments, update  bases)  to

                 get all free space together.

             +   Or, swap one or more segments to disk to make  space

                 (must  then  check during context switching and bring

                 segments back in before letting process run).

         +   To enlarge segment:


             +   See if space above segment is free.  If so, just  up-

                 date the bound and use that space.

             +   Or, move the segment above this one to disk, in order

                 to make the memory free.

             +   Or, move this segment to disk and bring it back  into

                 a  larger  hole  (or,  maybe just copy it to a larger

                 hole).

             +   Or, move it down, if there is space below.
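
     A sketch of compaction: slide every segment toward low memory and
     update its base, leaving all free space together at the top.  The
     table format (name -> (base, size)) is hypothetical, and the actual
     copying of memory contents is only indicated by a comment.

```python
def compact(segments):
    """segments: dict mapping name -> (base, size).

    Returns (new_table, first_free), where first_free is the start of the
    single free region left after compaction.
    """
    new_table = {}
    next_free = 0
    # visit segments in order of their current base address
    for name, (base, size) in sorted(segments.items(),
                                     key=lambda item: item[1][0]):
        # the segment's contents would be copied from base to next_free here
        new_table[name] = (next_free, size)    # update the base
        next_free += size
    return new_table, next_free


segs = {"code": (100, 200), "data": (500, 100), "stack": (900, 50)}
table, first_free = compact(segs)
print(table)        # {'code': (0, 200), 'data': (200, 100), 'stack': (300, 50)}
print(first_free)   # 350
```

     Only the bases in the segment table change, so the process's virtual
     addresses are unaffected by the move.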


     +   Can load segments only when needed.

         +   Segment Fault - an attempt to reference a  segment  which

             is not present.

             +   Trap to OS

             +   Find space for segment  -  replace  another  one,  if

                 necessary

             +   Load Segment (remove other segments to make space, if

                 necessary)

             +   Set valid bit = 1, and update other entries in STE.

             +   Make process ready.


     +   Paging:  goal is to make allocation and swapping easier,  and

         to reduce memory fragmentation.

         +   Make all chunks of virtual memory  the  same  size,  call

             them pages.  Typical sizes range from 512 bytes to 16K bytes.

         +   Divide real memory into page frames, which are  the  same

             size as pages.


             +   I will frequently be sloppy and  say  "page"  when  I

                 mean "page frame".

         +   Virtual Address typically now consists of N bits,  parti-

             tioned as K (page number) and N-K (byte within page).

         +   For each process, a page table defines the  base  address

             of  each  of  that process' pages.  Each page table entry

             contains bits for the real address of the  page,  protec-

             tion, valid, reference, and dirty bits.

             +   Diagram of page table - see figure

             +   Page table base  register  points  to  base  of  page

                 table.

         +   Translation process:  page number always  comes  directly

             from  the  (virtual) address.  Since page size is a power

             of two, no comparison or addition is necessary.  Just  do

             table lookup and bit substitution.

             +   Diagram of translation process

             +   No limit field is needed or used.  (just overflow  to

                 next page)

             +   We will need a table (called a page map, core map, or

                 memory map) telling us who owns each page frame in

                 memory.  Each entry points back to any page table

                 that points to this page frame.

         +   Not all of a process' memory has to be loaded  into  real

             memory.   If  one attempts to reference a location not in

             memory, it is prevented by a page fault - this  condition

             is detected by the valid bit.

             +   Same as before with segment fault.


             +   Page fault - trap condition.   Detected  by  hardware

                 when valid bit is off.

                 +   Trap to OS (trap, not interrupt)

                 +   OS finds page frame, (somehow - discussed later)

                 +   gets page,  (reads from disk)

                 +   updates page table,

                 +   make process ready.
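
     A sketch of page translation by table lookup and bit substitution,
     with a page fault taken when the valid bit is off.  The 4K page size,
     the dictionary-based page table entries and the free-frame list are
     assumptions; page replacement and the disk read are not modeled.

```python
PAGE_BITS = 12                      # assumed 4K-byte pages
PAGE_SIZE = 1 << PAGE_BITS


class PageFault(Exception):
    def __init__(self, page):
        super().__init__(page)
        self.page = page


def translate_page(page_table, vaddr):
    page = vaddr >> PAGE_BITS              # page number comes from the address
    offset = vaddr & (PAGE_SIZE - 1)
    entry = page_table[page]               # table lookup
    if not entry["valid"]:
        raise PageFault(page)              # trap to the OS
    return (entry["frame"] << PAGE_BITS) | offset   # bit substitution, no add


def handle_fault(page_table, free_frames, page):
    frame = free_frames.pop(0)             # find a page frame (free list only)
    # the page would be read from disk into the frame here
    page_table[page] = {"valid": True, "frame": frame}   # update page table


def access(page_table, free_frames, vaddr):
    try:
        return translate_page(page_table, vaddr)
    except PageFault as fault:
        handle_fault(page_table, free_frames, fault.page)
        return translate_page(page_table, vaddr)     # retry the reference


pt = [{"valid": True, "frame": 5}, {"valid": False, "frame": None}]
free = [9, 2]
print(hex(access(pt, free, 0x0123)))   # 0x5123
print(hex(access(pt, free, 0x1ABC)))   # 0x9abc
```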


     +   Pages and Paging are used to produce a physical  partitioning

         of the process address space and memory.  There usually isn't

         any relation between page boundaries and what is in a page.


     +   Advantages

         +   Easy to allocate:  keep a free  list  of  available  page

             frames and grab the first one.

         +   No external fragmentation.

         +   When combined with segmentation (discussed  later):  Non-

             contiguous allocation of segments.

         +   Permits process to have virtual space  much  larger  than

             physical space.

         +   Permits pages to be loaded as/when needed.


     +   Disadvantages

         +   Internal fragmentation:  page size doesn't match up  with

             information  size.   The  larger the page, the worse this

             is.


         +   Hardware for address translation.

         +   Time for address translation.

         +   Page faults may cause considerable overhead.

             +   What happens when  we  have  a  page  fault  (missing

                 page)? - to be discussed later.

         +   Need for page replacement algorithm.

             +   We need algorithms to decide when to move pages  into

                 and out of memory.  (discussed later).

         +   Table space:  if pages are small, the table  space  could

             be substantial.  In fact, this is a problem even for

             normal page sizes:  consider a 32-bit address space with

             1K-byte pages (2^22, about 4 million, page table entries).

             What if the whole table has to be present at once?

             +   1. Partial solution:  keep base and bounds  for  page

                 table,  so  only  large  processes have to have large

                 tables.

             +   2. Usual solution: make page table two  level.   (see

                 figure  6)  Map  high order bits through first table,

                 and lower order page number bits through second table.

                 +   First level table can be called  page  directory,

                     or segment table (confusing usage).  Second level

                     table usually called page table.

             +   3. Put user page tables in OS virtual memory  -  then

                 unneeded pages are not allocated.

                 +   Note that this yields a 2 level page table -  ad-

                     dress  is  mapped  through OS page table and then

                     user page table.


             +   4. Make page table a Hash Table (done by IBM and HP)

                 +   Called inverted page table

         +   Efficiency of access:  even small page  tables  are  gen-

             erally  too large to load into fast memory in the reloca-

             tion box.  Instead, page tables are kept in  main  memory

             and the relocation box only has the page table's base ad-

             dress.  It thus takes one overhead  reference  for  every

             real  memory  reference.  If page table is two level, re-

             quires two extra references.
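
     A sketch of the table-space arithmetic and of a two-level lookup.  The
     4-byte entry size and the 11/11/10 split of a 32-bit address are
     assumptions chosen to match 1K-byte pages; the nested dictionaries
     stand in for the page directory and second-level page tables.

```python
def page_table_entries(address_bits, page_bytes):
    """Number of entries in a one-level page table."""
    return 2 ** address_bits // page_bytes


entries = page_table_entries(32, 1024)
print(entries)             # 4194304 entries (2**22)
print(entries * 4)         # 16777216 bytes, assuming 4-byte entries

TOP_BITS, SECOND_BITS, OFFSET_BITS = 11, 11, 10    # assumed split


def translate_two_level(directory, vaddr):
    """Return (real address, number of extra memory references)."""
    top = vaddr >> (SECOND_BITS + OFFSET_BITS)
    second = (vaddr >> OFFSET_BITS) & ((1 << SECOND_BITS) - 1)
    offset = vaddr & ((1 << OFFSET_BITS) - 1)
    page_table = directory[top]        # extra reference 1: first-level table
    frame = page_table[second]         # extra reference 2: second-level table
    return (frame << OFFSET_BITS) | offset, 2


# Only the second-level tables that are actually used need to exist.
directory = {1: {2: 7}}
vaddr = (1 << 21) | (2 << 10) | 5
print(translate_two_level(directory, vaddr))   # (7173, 2)
```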


     +   Where are the page tables?

         +   Page tables are either referred to with  real  addresses,

             or OS virtual addresses.

             +   Cannot be put where users can get to them.

                 +   Otherwise, users could change values, which would

                     bypass protection.

         +   Page table entries are usually real addresses  (including

             addresses of first level page table, and PTBR.)

             +   Could have OS virtual  addresses  in  entries,  which

                 means that another level of translation is needed.

         +   Is the OS paged?

             +   Yes - advantages for users also apply to OS.

         +   Can page tables be paged out?

             +   Sure - why not?

             +   But if page tables are in OS's  virtual  memory,  and

                 page  tables  have OS address space virtual addresses

                 in them, then translation  of  user  virtual  address

                 also requires OS virtual address translation.

                 +   This might require a recursive page fault.

                 +   Means that OS page tables must be in real  memory

                     and use real addresses.

             +   Alternative is to put page tables  in  "real  memory"

                 and use real addresses.  (I.e. have V=R).


     +   What can't be paged out?

         +   This is called "wired down".

         +   The code that brings in pages.

         +   Pages for critical parts of the operating system.   (Han-

             dling a page fault takes time.)

             +   Some interrupt and trap handlers, including code that

                 starts up a process.

             +   OS page tables

         +   Sensitive real time routines

         +   Pages currently undergoing I/O. (i.e. I/O buffers)


     +   Note how effective paging is for protection -  you  can  only

         reference parts of memory which appear in your page table(s).

         The only parts that appear are those that you have access to.


     +   Paging and segmentation combined

         +   Diagram of segment table/ page table mapping.  In segment

             table  entry, put protection bits (read, write, execute),

             valid bit.

         +   Each segment broken into one or more pages.

         +   Segments correspond to logical units:  code, data, stack.

             Segments  vary  in  size and are often large.  Protection

             can be associated with segments.

         +   Pages are for the use of the OS;  they are fixed-size  to

             make it easy to manage memory.

         +   Going from paging to P+S is like going from  single  seg-

             ment to multiple segments, except at a higher level.  In-

             stead of having a  single  page  table,  have  many  page

             tables  with  a  base and bound for each.  Call the stuff

             associated with each page table a segment.


     +   Advantages:

         +   Provides 2 level mapping (as did page directory and  page

             table).  Makes page table size manageable.

         +   Provides both physical unit of management (page) and log-

             ical unit of management (segment).

         +   Effectively produces two dimensional addressing [segment,

             address within segment].

         +   Can grow and shrink segments  individually,  and  without

             interfering  with  other segments.  Just add pages (which

             can be anywhere in memory.)

         +   Segmentation with no compaction or fragmentation problem.

         +   Bounds checks on segments handled by having page  not  be

             valid.  (quantized to page size).

         +   No page table for segment which doesn't exist.


         +   Can share segment and/or page.

         +   Protection at level of page and/or segment.

     +   Disadvantages

         +   More complicated than either segmentation or paging.

         +   Overhead of 2 level mapping (time and hardware).

         +   Overhead of both schemes.

         +   Usual internal fragmentation problem, but if page size is

             small compared to most segments, then internal fragmenta-

             tion is not too bad.
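
     A sketch of combined translation: the segment table entry carries the
     protection bits and points to a page table, and the bound check falls
     out of a missing or invalid page.  The 1K page size, the dictionary
     layout and the single SegPageFault exception are assumptions.

```python
PAGE_BITS = 10                     # assumed 1K-byte pages


class SegPageFault(Exception):
    """Protection violation, or reference to a page that is not valid."""


def translate_seg_page(segment_table, seg, offset, access):
    ste = segment_table[seg]
    if access not in ste["perms"]:             # protection kept per segment
        raise SegPageFault("protection")
    page = offset >> PAGE_BITS
    byte = offset & ((1 << PAGE_BITS) - 1)
    page_table = ste["page_table"]             # one page table per segment
    if page >= len(page_table) or page_table[page] is None:
        # bound check quantized to page size, or page not in memory
        raise SegPageFault("invalid page")
    return (page_table[page] << PAGE_BITS) | byte


# Pages of a segment can sit in any frames; they need not be contiguous.
st = {0: {"perms": "rx", "page_table": [3, 8]},
      1: {"perms": "rw", "page_table": [5]}}
print(translate_seg_page(st, 0, 1024 + 7, "r"))   # 8199
print(translate_seg_page(st, 1, 20, "w"))         # 5140
```

     Growing segment 1 would only require appending a frame number to its
     page table, which is the "just add pages" advantage listed above.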


     +   Paging vs. Segmentation -

         +   page is fixed size,

             +   physical unit of information,

             +   used only for memory management;

             +   not visible to programmer.

         +   Segment is logical unit (usually)

             +   visible to user,

             +   of arbitrary size.


     +   Note that user may see  (be  aware  of)  segmentation.   User

         should not be aware of paging.


     +   Can share at two levels:   single  page,  or  single  segment

         (whole  page table). - Diagram of shared pages or shared seg-

         ments (shared page table).

         +   Does shared region have to be at  same  address  in  each

             process?


             +   No - as long as it can be found.

         +   Can shared region contain any  absolute  addresses  (i.e.

             virtual addr)?

             +   Usually not - very dangerous - addresses may  not  be

                 the same in each process.

             +   But can contain relative addresses - e.g. offsets  to

                 certain registers or segment base. Such registers can

                 be loaded by each process differently.

             +   If entire segment is shared, and addresses are  rela-

                 tive to start of segment, we are okay.
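
     A sketch of why segment-relative addresses are safe in a shared
     segment.  Two hypothetical processes map the same physical segment
     under different segment numbers; the per-process tables here hold only
     a base address, for brevity.

```python
def resolve(segment_bases, seg, offset):
    """Translate (segment #, offset) using one process's segment table."""
    return segment_bases[seg] + offset


SHARED_PHYS_BASE = 0x4000
proc_a = {2: SHARED_PHYS_BASE}     # process A sees the shared segment as #2
proc_b = {5: SHARED_PHYS_BASE}     # process B sees the same segment as #5

# A link stored inside the shared segment as an offset from its start.
link_offset = 0x30

print(hex(resolve(proc_a, 2, link_offset)))   # 0x4030
print(hex(resolve(proc_b, 5, link_offset)))   # 0x4030

# An absolute virtual address stored by A, such as (2, 0x30), would not
# work in B, because segment 2 is not mapped there.
print(2 in proc_b)                            # False
```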