

## Review: Lottery Scheduling Example

- Lottery Scheduling Example
  - Assume short jobs get 10 tickets, long jobs get 1 ticket

| # short jobs /<br># long jobs | % of CPU each<br>short job gets | % of CPU each<br>long job gets |
|------------------------------|-------------------------------|---------------------------------|
| 1/1                          | 91%                           | 9%                              |
| 0/2                          | N/A                           | 50%                             |
| 2/0                          | 50%                           | N/A                             |
| 10/1                         | 9.9%                          | 0.99%                           |
| 1/10                         | 50%                           | 5%                              |

- What if too many short jobs to give reasonable response time?
  - » In UNIX, if load average is 100, hard to make progress
  - » One approach: log some user out
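The percentages in the table above follow from simple ticket arithmetic: each job's expected share is its ticket count divided by the total number of tickets outstanding. A minimal sketch, assuming the 10-ticket/1-ticket assignment from the slide (the function and constant names are made up for illustration):

```python
from fractions import Fraction

SHORT_TICKETS = 10  # tickets per short job (from the slide)
LONG_TICKETS = 1    # tickets per long job (from the slide)


def cpu_shares(n_short, n_long):
    """Return the expected CPU share of each short job and each long job.

    A share is None when there are no jobs of that kind (the "N/A" rows).
    """
    total = n_short * SHORT_TICKETS + n_long * LONG_TICKETS
    short = Fraction(SHORT_TICKETS, total) if n_short else None
    long_ = Fraction(LONG_TICKETS, total) if n_long else None
    return short, long_


def fmt(share):
    """Format a share as a percentage, or N/A when no such job exists."""
    return "N/A" if share is None else f"{float(share):.2%}"


if __name__ == "__main__":
    for n_short, n_long in [(1, 1), (0, 2), (2, 0), (10, 1), (1, 10)]:
        short, long_ = cpu_shares(n_short, n_long)
        print(f"{n_short}/{n_long}: short={fmt(short)} long={fmt(long_)}")
```

These are expected shares over many lotteries; any single short run of drawings can deviate from them.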

#### 10/10/05


#### **Review: Important Aspects of Memory Multiplexing**

- Controlled overlap:
  - Separate state of threads should not collide in physical memory. Obviously, unexpected overlap causes chaos!
  - Conversely, would like the ability to overlap when desired (for communication)
- Translation:
  - Ability to translate accesses from one address space (virtual) to a different one (physical)
  - When translation exists, processor uses virtual addresses, physical memory uses physical addresses
  - Side effects:
    - » Can be used to avoid overlap
    - » Can be used to give uniform view of memory to programs
- Protection:
  - Prevent access to private memory of other processes
    - » Different pages of memory can be given special behavior (Read Only, Invisible to user programs, etc.)
    - » Kernel data protected from user programs
    - » Programs protected from themselves

Kubiatowicz CS162 ©UCB Fall 2005

## Goals for Today

- Finish discussion of protection
- Address Translation Schemes

Note: Some slides and/or pictures in the following are adapted from slides ©2005 Silberschatz, Galvin, and Gagne

## Dual-Mode Operation

- To assist with protection, hardware provides at least two modes (Dual-Mode Operation):
  - "Kernel" mode (or "supervisor" or "protected")
  - "User" mode (normal program mode)
  - Mode set with bits in special control register only accessible in kernel mode
- Intel processor actually has four "rings" of protection:
  - PL (Privilege Level) from 0 to 3
    - » PL0 has full access, PL3 has least
  - Privilege Level set in code segment descriptor (CS)
  - Mirrored "IOPL" bits in condition register give programs permission to use the I/O instructions
  - Typical OS kernels on Intel processors only use PL0 ("kernel") and PL3 ("user")

## For Protection, Lock User-Programs in Asylum

- Idea: Lock user programs in padded cell with no exit or sharp objects
  - Cannot change mode to kernel mode
  - User cannot modify page table mapping
  - Limited access to memory: cannot adversely affect other processes
    - » Side-effect: Limited access to memory-mapped I/O operations (I/O that occurs by reading/writing memory locations)
  - Limited access to interrupt controller
  - What else needs to be protected?
- A couple of issues
  - How to share CPU between kernel and user programs?
    - » Kinda like both the inmates and the warden in asylum are the same person. How do you manage this???
  - How do programs interact?
  - How does one switch between kernel and user modes?
    - » OS → user (kernel → user mode): getting into cell
    - » User → OS (user → kernel mode): getting out of cell

## How to get from Kernel→User

- What does the kernel do to create a new user process?
  - Allocate and initialize address-space control block
  - Read program off disk and store in memory
  - Allocate and initialize translation table
    - » Point at code in memory so program can execute
    - » Possibly point at statically initialized data
  - Run Program:
    - » Set machine registers
    - » Set hardware pointer to translation table
    - » Set processor status word for user mode
    - » Jump to start of program
- When switching between processes:
  - Same saving/restoring of registers as before
  - Save/restore hardware pointer to translation table

## User→Kernel (System Call)

- Can't let inmate (user) get out of padded cell on own
  - Would defeat purpose of protection!
  - So, how does the user program get back into kernel?



- System call: Voluntary procedure call into kernel
  - Hardware for controlled User-Kernel transition
  - Can any kernel routine be called?
    - » No! Only specific ones.
  - System call ID encoded into system call instruction
    - » Index forces well-defined interface with kernel
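The idea that a system call ID indexes a fixed table, rather than naming an arbitrary kernel address, can be sketched as a small simulation. Everything here is hypothetical (the call numbers, the handler names, and the dictionary standing in for a process); it only illustrates the controlled entry point, not any real kernel's interface:

```python
# Hypothetical system call numbers, for illustration only.
SYS_GETPID = 0
SYS_WRITE = 1


def sys_getpid(process):
    """Return the (simulated) process ID."""
    return process["pid"]


def sys_write(process, text):
    """Append text to the process's output after checking the argument."""
    if not isinstance(text, str):
        return -1  # reject a bad argument instead of trusting the caller
    process["output"].append(text)
    return len(text)


# The table is the only way in: an ID selects an entry, never an address.
SYSCALL_TABLE = {SYS_GETPID: sys_getpid, SYS_WRITE: sys_write}


def trap(process, syscall_id, *args):
    """Simulate the user-to-kernel transition for one system call."""
    handler = SYSCALL_TABLE.get(syscall_id)
    if handler is None:
        return -1  # unknown ID: refuse rather than jump somewhere unexpected
    return handler(process, *args)
```

A user program in this model can only reach `sys_getpid` and `sys_write`; any other ID is rejected at the table lookup.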


## User→Kernel (Exceptions: Traps and Interrupts)

- A system call instruction causes a synchronous exception (or "trap")
  - In fact, often called a software "trap" instruction
- Other sources of synchronous exceptions:
  - Divide by zero, Illegal instruction, Bus error (bad address, e.g. unaligned access)
  - Segmentation Fault (address out of range)
  - Page Fault (for illusion of infinite-sized memory)
- Interrupts are Asynchronous Exceptions
  - Examples: timer, disk ready, network, etc.
  - Interrupts can be disabled, traps cannot!
- On system call, exception, or interrupt:
  - Hardware enters kernel mode with interrupts disabled
  - Saves PC, then jumps to appropriate handler in kernel
  - For some processors (x86), processor also saves registers, changes stack, etc.
- Actual handler typically saves registers, other CPU state, and switches to kernel stack

## System Call Continued

- What are some system calls?
  - I/O: open, close, read, write, lseek
  - Files: delete, mkdir, rmdir, truncate, chown, chgrp, ...
  - Process: fork, exit, wait (like join)
  - Network: socket create, set options
- Are system calls constant across operating systems?
  - Not entirely, but there are lots of commonalities
  - Also some standardization attempts (POSIX)
- What happens at beginning of system call?
  - » Hardware entry to kernel sets system to kernel mode
  - » Handler address fetched from table/Handler started
- System Call argument passing:
  - In registers (not very much can be passed)
  - Write into user memory, kernel copies into kernel mem
    - » User addresses must be translated!
    - » Kernel has different view of memory than user
  - Every argument must be explicitly checked!

## Additions to MIPS ISA to support Exceptions?

- Exception state is kept in "Coprocessor 0"
  - Use mfc0 to read contents of these registers:
    - » BadVAddr (register 8): contains memory address at which memory reference error occurred
    - » Status (register 12): interrupt mask and enable bits
    - » Cause (register 13): the cause of the exception
    - » EPC (register 14): address of the affected instruction

| Bit    | 15:8 | 5   | 4   | 3    | 2    | 1   | 0   |
|--------|------|-----|-----|------|------|-----|-----|
| Status | Mask | k   | e   | k    | e    | k   | e   |
|        |      | old | old | prev | prev | cur | cur |

- Status Register fields:
  - Mask: Interrupt enable
    - » 1 bit for each of 5 hardware and 3 software interrupts
  - k = kernel/user: 0⇒kernel mode
  - e = interrupt enable: 0⇒interrupts disabled
  - Exception ⇒ 6 LSBs shifted left 2 bits, setting 2 LSBs to 0:
    - » run in kernel mode with interrupts disabled
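The shift described in the last bullet can be checked with a few lines of bit arithmetic. This is a sketch under the field layout given above (bits 5:0 hold three k/e pairs, with the current pair in bits 1:0); the function names are made up for illustration:

```python
KE_STACK_MASK = 0x3F  # bits 5:0 hold the old/prev/cur k and e pairs


def enter_exception(status):
    """Return the Status value after an exception is taken.

    The six low bits shift left by two: cur becomes prev, prev becomes old,
    the old pair is discarded, and the new cur pair is 00.
    """
    shifted = ((status & KE_STACK_MASK) << 2) & KE_STACK_MASK
    return (status & ~KE_STACK_MASK) | shifted


def current_mode(status):
    """Decode the current pair as (in_kernel_mode, interrupts_enabled)."""
    in_kernel = (status & 0b10) == 0           # k bit: 0 means kernel mode
    interrupts_enabled = (status & 0b01) != 0  # e bit: 0 means disabled
    return in_kernel, interrupts_enabled
```

Starting from user mode with interrupts enabled (current pair `11`), `enter_exception` leaves the current pair at `00`, which decodes as kernel mode with interrupts disabled, while the interrupted state is preserved in the prev pair.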


#### Simple Segmentation: Base and Limit



- Can use base/limit for dynamic address translation (Simple form of "segmentation"):
  - Alter every address by adding "base"
  - Generate error if address bigger than limit
- This gives program the illusion that it is running on its own dedicated machine, with memory starting at 0
  - Program gets contiguous region of memory
  - Addresses within program do not have to be relocated when program placed in different region of DRAM
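The base-and-limit check above amounts to one comparison and one addition per access. A minimal sketch, assuming the limit is the size of the region so that valid virtual addresses run from 0 to limit - 1 (the names are illustrative, not from the slides):

```python
class AddressError(Exception):
    """Raised when a virtual address falls outside the program's region."""


def translate_base_limit(vaddr, base, limit):
    """Translate a virtual address to a physical one with base and limit.

    Assumes the limit is the region size, so vaddr must be in [0, limit).
    """
    if vaddr < 0 or vaddr >= limit:
        raise AddressError(f"address {vaddr:#x} exceeds limit {limit:#x}")
    return base + vaddr
```

Moving the program to a different region of DRAM only changes `base`; the virtual addresses the program uses stay the same.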


# Base and Limit segmentation discussion

- Provides level of indirection
  - OS Can move bits around behind program's back
  - Can be used to fix things up if a program needs to grow beyond its bounds, or to coalesce fragments of memory
- Only OS gets to change the base and limit!
  - Letting user programs change them would defeat protection
- What gets saved/restored on a context switch?
  - Everything from before + base/limit values
  - Or: How about complete contents of memory (out to disk)?
    - » Called "Swapping"
- Hardware cost
  - 2 registers/Adder/Comparator
  - Slows down hardware because need to take time to do add/compare on every access
- Base and Limit Pros: Simple, relatively fast

# Cons for Simple Segmentation Method

- Fragmentation problem (complex memory allocation)
  - Not every process is the same size
  - Over time, memory space becomes fragmented
  - Really bad if want space to grow dynamically (e.g. heap)



- Other problems for process maintenance
  - Doesn't allow heap and stack to grow independently
  - Want to put these as far apart in virtual memory space as possible so that they can grow as needed
- Hard to do inter-process sharing
  - Want to share code segments when possible
  - Want to share memory between processes

# More Flexible Segmentation



- Logical View: multiple separate segments
  - Typical: Code, Data, Stack
  - Others: memory sharing, etc
- Each segment is given region of contiguous memory
  - Has a base and limit
  - Can reside anywhere in physical memory



### Example of segment translation

| Address | Label   | Instruction | Operands           |
|---------|---------|-------------|--------------------|
| 0x240   | main:   | la          | \$a0, varx         |
| 0x244   |         | jal         | strlen             |
| …       |         | …           |                    |
| 0x360   | strlen: | li          | \$v0, 0 ;count     |
| 0x364   | loop:   | lb          | \$t0, (\$a0)       |
| 0x368   |         | beq         | \$r0, \$t1, done   |
| …       |         | …           |                    |
| 0x4050  | varx    | dw          | 0x314159           |

| Seg ID #   | Base   | Limit  |
|------------|--------|--------|
| 0 (code)   | 0x4000 | 0x0800 |
| 1 (data)   | 0x4800 | 0x1400 |
| 2 (shared) | 0xF000 | 0x1000 |
| 3 (stack)  | 0x0000 | 0x3000 |

Let's simulate a bit of this code to see what happens (PC=0x240):

- Fetch 0x240.
  - Virtual segment #? 0; Offset? 0x240
  - Physical address? Base=0x4000, so physical addr=0x4240
  - Fetch instruction at 0x4240. Get "la \$a0, varx"
  - Move 0x4050 → \$a0, Move PC+4→PC
- Fetch 0x244. Translated to Physical=0x4244. Get "jal strlen"
  - Move 0x0248 → \$ra (return address!), Move 0x0360 → PC
- Fetch 0x360. Translated to Physical=0x4360. Get "li \$v0,0"
  - Move 0x0000 → \$v0, Move PC+4→PC
- Fetch 0x364. Translated to Physical=0x4364. Get "lb \$t0,(\$a0)"
  - Since \$a0 is 0x4050, try to load byte from 0x4050
  - Translate 0x4050. Virtual segment #? 1; Offset? 0x50
  - Physical address? Base=0x4800, Physical addr = 0x4850
  - Load byte from 0x4850 → \$t0, Move PC+4→PC
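The translations in the walkthrough above can be reproduced mechanically. The sketch below assumes 16-bit virtual addresses whose top 2 bits select the segment and whose low 14 bits are the offset, which is consistent with 0x4050 landing in segment 1 at offset 0x50; the table values are the ones from the segment table, while the function name is made up for illustration:

```python
OFFSET_BITS = 14  # assumed split: 2-bit segment ID, 14-bit offset

# Segment ID -> (base, limit), copied from the segment table in the example.
SEGMENT_TABLE = {
    0: (0x4000, 0x0800),  # code
    1: (0x4800, 0x1400),  # data
    2: (0xF000, 0x1000),  # shared
    3: (0x0000, 0x3000),  # stack
}


def translate_segmented(vaddr):
    """Translate a 16-bit virtual address using the segment table."""
    segment = vaddr >> OFFSET_BITS
    offset = vaddr & ((1 << OFFSET_BITS) - 1)
    base, limit = SEGMENT_TABLE[segment]
    if offset >= limit:
        raise ValueError(f"offset {offset:#x} outside segment {segment}")
    return base + offset


if __name__ == "__main__":
    for vaddr in (0x240, 0x244, 0x360, 0x364, 0x4050):
        print(f"{vaddr:#06x} -> {translate_segmented(vaddr):#06x}")
```

Instruction fetches and the data load go through the same function; only the segment selected by the top bits differs.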

# Example: Four Segments (16 bit addresses)



# Observations about Segmentation

- Virtual address space has holes
  - Segmentation efficient for sparse address spaces
  - A correct program should never address gaps (except as mentioned in a moment)
    - » If it does, trap to kernel and dump core
- When is it OK to address outside valid range?
  - This is how the stack and heap are allowed to grow
  - For instance, stack takes fault, system automatically increases size of stack
- Need protection mode in segment table
  - For example, code segment would be read-only
  - Data and stack would be read-write (stores allowed)
  - Shared segment could be read-only or read-write
- What must be saved/restored on context switch?
  - Segment table stored in CPU, not in memory (small)
  - Might store all of the process's memory onto disk when switched (called "swapping")













# Multi-level Translation Analysis

- Pros:
  - Only need to allocate as many page table entries as we need for application
    - » In other words, sparse address spaces are easy
  - Easy memory allocation
  - Easy Sharing
    - » Share at segment or page level (need additional reference counting)
- Cons:
  - One pointer per page (typically 4K to 16K pages today)
  - Page tables need to be contiguous
    - » However, previous example keeps tables to exactly one page in size
  - Two (or more, if >2 levels) lookups per reference
    - » Seems very expensive!
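To make the trade-off above concrete, here is a small model of a two-level page table. It assumes a 32-bit virtual address split 10/10/12 (two 10-bit indices and a 4K page offset), which is one common layout and not necessarily the one used in the lecture's example; dictionaries stand in for the tables so that unallocated entries cost nothing:

```python
PAGE_BITS = 12   # assumed 4K pages
INDEX_BITS = 10  # assumed 10-bit index at each of the two levels
INDEX_MASK = (1 << INDEX_BITS) - 1


class PageFault(Exception):
    """Raised when no mapping exists for a virtual address."""


def map_page(top_table, vpage, frame):
    """Map a virtual page to a frame, creating the second level on demand."""
    first, second = vpage >> INDEX_BITS, vpage & INDEX_MASK
    top_table.setdefault(first, {})[second] = frame


def translate_two_level(top_table, vaddr):
    """Translate a virtual address using two table lookups."""
    offset = vaddr & ((1 << PAGE_BITS) - 1)
    vpage = vaddr >> PAGE_BITS
    second_table = top_table.get(vpage >> INDEX_BITS)  # lookup 1
    if second_table is None:
        raise PageFault(hex(vaddr))
    frame = second_table.get(vpage & INDEX_MASK)       # lookup 2
    if frame is None:
        raise PageFault(hex(vaddr))
    return (frame << PAGE_BITS) | offset
```

Mapping one page near the bottom of the address space and one near the top allocates only two second-level tables, which is the sparse-address-space benefit; the cost is the two lookups on every translation.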
# Closing thought: Protection without Hardware

- Does protection require hardware support for translation and dual-mode behavior?
  - No: Normally use hardware, but anything you can do in hardware can also be done in software (possibly expensive)
- Protection via Strong Typing
  - Restrict programming language so that you can't express program that would trash another program
  - Loader needs to make sure that program produced by valid compiler or all bets are off
  - Example languages: LISP, Ada, Modula-3 and Java
- Protection via software fault isolation:
  - Language independent approach: have compiler generate object code that provably can't step out of bounds
    - » Compiler puts in checks for every "dangerous" operation (loads, stores, etc). Again, need special loader.
    - » Alternatively, compiler generates "proof" that code cannot do certain things (Proof Carrying Code)
  - Or: use virtual machine to guarantee safe behavior (loads and stores recompiled on fly to check bounds)
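The "checks for every dangerous operation" idea can be modeled in a few lines. This is only a sketch of the principle: a hypothetical `Sandbox` class stands in for the checks a compiler or virtual machine would insert around each load and store, and a shared `bytearray` stands in for physical memory:

```python
class SandboxViolation(Exception):
    """Raised when sandboxed code touches memory outside its region."""


class Sandbox:
    """One untrusted program's view of memory, with a check on every access."""

    def __init__(self, memory, start, size):
        self.memory = memory
        self.start = start
        self.end = start + size

    def _check(self, addr):
        # The inserted check: every load and store passes through here.
        if not (self.start <= addr < self.end):
            raise SandboxViolation(f"address {addr:#x} is out of bounds")

    def load(self, addr):
        self._check(addr)
        return self.memory[addr]

    def store(self, addr, value):
        self._check(addr)
        self.memory[addr] = value
```

Two sandboxes over the same `bytearray` cannot write into each other's regions, even though no hardware translation or mode bit is involved. The price is an extra comparison on every access.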

# Inverted Page Table



# Summary (1/2)

- Memory is a resource that must be shared
  - Controlled Overlap: only shared when appropriate
  - Translation: Change Virtual Addresses into Physical Addresses
  - Protection: Prevent unauthorized Sharing of resources
- Dual-Mode
  - Kernel/User distinction: User restricted
  - User  $\rightarrow$  Kernel: System calls, Traps, or Interrupts
  - Inter-process communication: shared memory, or through kernel (system calls)
- Exceptions
  - Synchronous Exceptions: Traps (including system calls)
  - Asynchronous Exceptions: Interrupts

# Summary (2/2)

- Segment Mapping
  - Segment registers within processor
  - Segment ID associated with each access
    - » Often comes from portion of virtual address
    - » Can come from bits in instruction instead (x86)
  - Each segment contains base and limit information
    - » Offset (rest of address) adjusted by adding base
- Page Tables
  - Memory divided into fixed-sized chunks of memory
  - Virtual page number from virtual address mapped through page table to physical page number
  - Offset of virtual address same as physical address
  - Large page tables can be placed into virtual memory
- Multi-Level Tables
  - Virtual address mapped to series of tables
  - Permit sparse population of address space
- Inverted page table
  - Size of page table related to physical memory size
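The last point can be illustrated with a toy inverted page table: one entry per physical frame, plus a hash from (process, virtual page) to frame number. The class and method names are hypothetical, and real designs differ in how they hash and resolve collisions; the sketch only shows why the table's size tracks physical memory rather than virtual address space:

```python
class InvertedPageTable:
    """Toy inverted page table with one entry per physical frame."""

    def __init__(self, num_frames):
        self.frames = [None] * num_frames  # frame -> (pid, vpage) or None
        self.index = {}                    # hash: (pid, vpage) -> frame

    def map(self, pid, vpage):
        """Place a virtual page in the first free frame and return the frame."""
        for frame, owner in enumerate(self.frames):
            if owner is None:
                self.frames[frame] = (pid, vpage)
                self.index[(pid, vpage)] = frame
                return frame
        raise MemoryError("no free physical frame")

    def lookup(self, pid, vpage):
        """Return the frame holding this page, or None if it is not resident."""
        return self.index.get((pid, vpage))
```

However large the virtual page numbers get, `frames` never grows beyond the number of physical frames the table was created with.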