CS162 Lecture Notes 4/27/2005

The midterm can still be picked up if you haven't picked it up yet.

We were talking about virtual machines last time, and I gave you an overview.

+ A virtual machine is a software-supported copy of the basic (hardware) machine.
+ Usually accomplished by allowing most instructions to run on the real hardware.
+ Virtual Machine Monitor (VMM) - the piece of software that provides the pseudo-bare machine interfaces.

The way you get decent performance out of a virtual machine is to allow most instructions to run on the actual hardware.

+ Virtual machines run on base machines that are the same.
+ Emulators are used to provide a dissimilar bare machine interface (i.e. different from the machine underneath).

So you've heard of the Java Virtual Machine; it's not actually a VM, it's an emulator. You are emulating the Java architecture, unless you have a machine that's hard-wired for Java, and there aren't too many of those. Emulators have been widely used by computer manufacturers, because they produce new machines that aren't compatible with the old ones. The paradigm in the computer business is that you have to be able to run the old software, and they do that by emulation.

+ Contrasts with operating systems, which provide extended machine interfaces - i.e. they are presumably better than the bare machine. They are supposed to protect you from the bare machine.
+ Uses of virtual machines:
    + Run several different OS's at once (for different users).
    + Debug versions of an OS while running other users (incl. diagnostics, etc.).
    + Develop network software on a single machine.
    + Run multiple OS releases (especially if you haven't converted your software).
    + Have students do systems programming.
    + High reliability, due to high isolation between virtual machines.
    + High security (for the same reason).

Operating System Diagram: webdisk.berkeley.edu/~cpli/lec4-27_01.gif
This interface is the operating system interface.

Virtual Machine Diagram: webdisk.berkeley.edu/~cpli/lec4-27_02.gif
The VM monitor replicates copies of the underlying machine, so you can run as many OS's as you want, and the OS's provide their own interfaces.

Virtual Machine Diagram 2: webdisk.berkeley.edu/~cpli/lec4-27_03.gif
And we can do it at as many levels as we want. The VM monitor creates VMs here and there, and those can run VM monitors that create more VMs. At the last level we may or may not have user programs. VM systems are very reliable because the VMs are isolated; again, you get high security because each one is a standalone machine.

Lots of examples of virtual machines:
+ VM/370, M44/44X (on the 7044), CP-40 (on the 360/40), CP-67 (on the 360/67), Hitac 8400, UMMPS, VMware (Mendel Rosenblum, Berkeley grad, who recently sold the company).

Last time, Pad asked how we can get decent performance out of this.

+ Implementation
    + For performance reasons, allow non-sensitive instructions to actually execute on the bare hardware.
    + Trap and simulate all sensitive instructions - i.e. any which could affect the VMM or any of the other VMs (see the sketch below).
    + If it isn't possible to trap all sensitive instructions, then it may not be possible to build a VM on that machine.

The only things that have to be simulated are supervisor instructions; in that case, the VM monitor figures out what to do.
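To make trap-and-simulate concrete, here is a minimal sketch in C, assuming a hypothetical VMM on a machine where every sensitive instruction traps. All of the names here (struct vm, reflect_trap_to_guest, simulate_io, the opcode values, the PSW bit) are made-up illustrations, not any real VMM's interface.

    /* Minimal sketch of a VMM handler for trapped sensitive instructions.
     * Every name and constant here is a hypothetical illustration. */
    #include <stdint.h>

    enum vcpu_mode { VCPU_USER, VCPU_SUPERVISOR };

    struct vm {
        enum vcpu_mode mode;      /* virtual processor state, kept by the VMM */
        uint64_t       psw;       /* virtual program status word              */
        uint64_t       regs[16];  /* virtual general registers                */
    };

    /* Illustrative opcode values and PSW bit. */
    enum { OP_LOAD_PSW = 0x82, OP_START_IO = 0x9C };
    #define PSW_PROBLEM_STATE (1ULL << 16)

    /* Helpers assumed to exist elsewhere in this hypothetical VMM. */
    void     reflect_trap_to_guest(struct vm *vm, uint32_t opcode);
    uint64_t fetch_guest_operand(struct vm *vm);
    void     simulate_io(struct vm *vm);
    void     simulate_other_sensitive(struct vm *vm, uint32_t opcode);
    void     resume_guest(struct vm *vm);

    /* Entered (in real supervisor state) whenever a VM executes a
     * sensitive instruction, which always traps to the VMM. */
    void vmm_sensitive_trap(struct vm *vm, uint32_t opcode)
    {
        if (vm->mode == VCPU_USER) {
            /* The guest OS expects its own user programs to fault on
             * privileged instructions, so reflect the trap into the VM. */
            reflect_trap_to_guest(vm, opcode);
            return;
        }

        /* The VM believes it is in supervisor state, so simulate the
         * instruction's effect on the virtual machine state only. */
        switch (opcode) {
        case OP_LOAD_PSW:   /* guest changes its (virtual) processor state */
            vm->psw  = fetch_guest_operand(vm);
            vm->mode = (vm->psw & PSW_PROBLEM_STATE) ? VCPU_USER
                                                     : VCPU_SUPERVISOR;
            break;
        case OP_START_IO:   /* guest I/O: remap it and issue it safely */
            simulate_io(vm);
            break;
        default:
            simulate_other_sensitive(vm, opcode);
            break;
        }
        resume_guest(vm);   /* drop back onto the bare hardware */
    }

Everything else - the non-sensitive instructions - never enters this path at all; it runs directly on the real machine, which is where the performance comes from.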
Next we are going to talk about memory mapping; let's start with a diagram.

Memory mapping diagram 1: webdisk.berkeley.edu/~cpli/lec4-27_04.gif
Here is the underlying system. Let's look at something interesting: the OS has a page table, right?

Memory mapping diagram 2: webdisk.berkeley.edu/~cpli/lec4-27_05.gif
We have a page table that maps the user address space to the VM. Where do we get physical addresses for the VM? From another page table. So what the VM monitor is doing is mapping "hardware" addresses - which the OS thinks are real hardware memory, but which are in reality just another level of virtual addresses. The user maps through the first page table, which maps through a second page table.

So what's the catch? Why doesn't it work the way I diagrammed it? Right - there is only one TLB. The hardware has no clue about this two-level mapping business; it knows about only one page table, so the translator doesn't know what to do with two of them. And the OS doesn't manage the TLB; the hardware does.

How do we combine the page tables (without modifying the hardware)? Use a third page table that connects the user directly to the hardware.

Memory mapping diagram 3: webdisk.berkeley.edu/~cpli/lec4-27_06.gif
PT3 is the composition of PT1 and PT2, and maps user addresses to the underlying hardware addresses. The TLB holds only PT3 entries.

However, there is a little bit more needed to make this work. What happens when we get a page fault? First, the VM monitor catches the trap. The VM monitor has to figure out whether the mapping is missing from PT1, PT2, or both. It first checks PT2, and if the entry is not there, it fixes PT2. Then it checks PT1, and if the entry is not there either, it gives the fault to the OS to fix PT1. After the OS fixes PT1, we still have to fix PT3, so the OS has to cause a trap to the VM monitor; otherwise PT3 won't be fixed.

Ordinary programs run on the real hardware; you only have to do all this work when you have a page fault. Sometimes the OS is a pile of garbage and the VMM is a lot faster; therefore, sometimes we can let the VMM do all the paging.

+ Must map the memory of the VM to the real machine. Can be done with page tables or with base and bounds registers.
+ The VMM itself may provide a paged machine. In this case, there are two levels of page tables - one from OS to VM, and one from VM to real machine.
+ In actual operation, the VMM will create a composed page table that does the two mappings in one operation, so the built-in mapping hardware can be used (a sketch follows below). This can work for arbitrary numbers of levels.
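As a rough illustration of the composed table, here is a sketch assuming flat, one-level page tables and made-up names (struct pagetable, vmm_page_in, reflect_fault_to_guest); real tables and fault paths are considerably more involved.

    /* Sketch of composing PT1 (user -> VM) and PT2 (VM -> real) into
     * PT3 (user -> real), the only table the hardware ever walks.
     * The flat tables and all names here are simplifying assumptions. */
    #include <stdint.h>
    #include <stdbool.h>

    #define NPAGES  1024
    #define INVALID ((uint32_t)-1)

    struct pagetable { uint32_t frame[NPAGES]; };  /* INVALID = unmapped */

    /* Helpers assumed to exist elsewhere in this hypothetical VMM. */
    void vmm_page_in(struct pagetable *pt2, uint32_t vm_page);
    void reflect_fault_to_guest(uint32_t user_page);

    /* Rebuild one PT3 entry from PT1 and PT2; false if either is missing. */
    bool compose_entry(struct pagetable *pt1, struct pagetable *pt2,
                       struct pagetable *pt3, uint32_t user_page)
    {
        uint32_t vm_page = pt1->frame[user_page];
        if (vm_page == INVALID)
            return false;                   /* guest OS must fix PT1     */
        uint32_t real_page = pt2->frame[vm_page];
        if (real_page == INVALID)
            return false;                   /* VMM must fix PT2          */
        pt3->frame[user_page] = real_page;  /* user -> real, in one step */
        return true;
    }

    /* Fault path as described above: the VMM catches every page fault,
     * repairs PT2 itself, and reflects PT1 faults to the guest OS. */
    void vmm_page_fault(struct pagetable *pt1, struct pagetable *pt2,
                        struct pagetable *pt3, uint32_t user_page)
    {
        uint32_t vm_page = pt1->frame[user_page];

        if (vm_page != INVALID && pt2->frame[vm_page] == INVALID)
            vmm_page_in(pt2, vm_page);          /* fix PT2 ourselves     */

        if (pt1->frame[user_page] == INVALID) {
            reflect_fault_to_guest(user_page);  /* OS fixes PT1, then it */
            return;                             /* traps back to the VMM */
        }

        compose_entry(pt1, pt2, pt3, user_page); /* finally repair PT3   */
    }

The point of the composition is that the TLB and the translation hardware see only PT3, so ordinary loads and stores run at full speed; the two-table bookkeeping happens only on faults.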
I/O

What we do for memory, we have to do for I/O: we can't give the VM access to the real hardware, so we have to trap I/O's.

+ It is necessary to trap and simulate I/O; let the VM monitor remap it.
+ Want to permit I/O only to areas valid for that VM, without interfering with other VMs.
+ The I/O code (e.g. channel programs) must be properly interpreted or translated, since it uses real addresses. (For this reason, self-modifying channel programs are usually prohibited, as too difficult to translate.)
+ I/O devices are usually simulated. Each user is given virtual I/O devices ("mini-disks"), which look like real hardware devices.
+ The VMM keeps a bit for each VM which specifies whether the VM is in user or supervisor state. It can thus provide the appropriate simulation of sensitive instructions.
+ Attempts to execute sensitive instructions in user state cause abends. In supervisor state, they are executed appropriately.

Hardware Virtualizer (in the Goldberg article in the reader)

+ The idea is to have hardware which dynamically maps N levels of virtual machines onto the actual hardware. An associative memory is used to keep recently evaluated mappings. The hardware virtualizer would therefore avoid most software simulation.
+ Define f maps, which map from a VM interface to the machine below it. Define the U map, which maps from the extended machine (OS) to the VM below it; U is provided by the OS. The hardware virtualizer composes as many levels of f maps as necessary (these map a virtual machine to the machine below).

VM Performance

+ A VM will run slower than the real machine due to simulation of sensitive instructions.
+ Specific performance degradations:
    + support of privileged instructions
    + maintaining the status of the virtual processor (user/supervisor)
    + support of paging within virtual machines
    + console functions (each VM thinks it has its own console with an operator sitting at it)
    + acceptance and reflection of interrupts to individual VMs
    + translation of channel programs
    + maintenance of clocks (if you make the clock run slower, it makes you look like you run 10 times faster)
+ Ways to enhance performance:
    + dedicate some resources, so that they don't have to be mapped or simulated
    + give certain critical VMs priority to run
    + run virtual=real (same as the first item)
    + let the VMM, instead of the OS, do the paging (if the OS does it, it gets done twice)
    + modify the OS to avoid costly (slow) features
    + extend the VMM to provide performance enhancements (though it is not truly a VM any more; e.g. VM assist on the 370)
    + extend the hardware to support VMs

There is a database called DB2, and DB2 has a buffer pool. DB2 searches the buffer pool linearly, and if the buffer pool doesn't have a page, it brings in groups of pages. Most database programs either run in supervisor state or bypass big parts of the OS. If you know you're going to be running a big VM system, you can go into the OS and modify anything that would conflict with the VM, and implement microcode to support the VM.

One of the things I have said is that when this happens, it traps to the VM monitor. Well, it had better trap to the VM monitor, because if you have an architecture with no supervisor state and it doesn't trap when it needs to, you can't write a VM.

+ Special performance problems:
    + Optimization within the OS may conflict with optimization within the VMM. E.g. the double paging anomaly, the buffer paging problem of IMS, disk optimization where the disk is mapped, spooling by the VMM and also by the OS.

END OF VIRTUAL MACHINES

Next topic: Performance Evaluation

We are going to talk about the general field of performance and how it applies to general systems. The first thing you need from a computer system is that it gives you the right results. If you get synchronization wrong, it doesn't matter how fast it runs. Once you get the results right, you want to make it faster.

+ Real Time Systems are systems in which there is a real time DEADLINE.
    + Typically a mechanical system is being controlled.
    + E.g. an assembly line, anti-ballistic missile defense.
+ A Real Time System must be able to:
    + Meet all deadlines (with 100% or 99+% probability).
    + Handle the aggregate load. If there are N events per second, it must be able to handle them, whether or not each has a deadline.
+ This implies:
    + A deadline scheduler (a minimal sketch follows this list).
    + Avoidance of page faults - generally must lock the deadline-oriented code into memory.
    + Avoidance of I/O operations when a near deadline is pending. Usually must keep the necessary info in electronic memory, and/or fetch it in advance.
    + This does not imply no cache memory - no matter what people say. Better to have a system that sometimes runs at 5X or 4.9X than one that is guaranteed to run at 1.0X all the time.
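The notes just say "deadline scheduler" without naming a policy. One common choice, added here purely as an illustration and not something from the lecture, is earliest-deadline-first (EDF): at each scheduling point, run the ready task whose deadline is closest. A minimal sketch:

    /* Earliest-deadline-first pick routine - an illustrative example of a
     * deadline scheduler; EDF itself is not named in the lecture. */
    #include <stddef.h>
    #include <stdint.h>
    #include <stdbool.h>

    struct task {
        const char *name;
        uint64_t    deadline;   /* absolute deadline, e.g. in microseconds */
        bool        ready;
    };

    /* Return the ready task with the earliest deadline, or NULL if none. */
    struct task *edf_pick(struct task *tasks, size_t n)
    {
        struct task *best = NULL;
        for (size_t i = 0; i < n; i++) {
            if (!tasks[i].ready)
                continue;
            if (best == NULL || tasks[i].deadline < best->deadline)
                best = &tasks[i];
        }
        return best;
    }

The scheduler alone isn't enough, as the list above says: this code and the tasks' working sets have to stay locked in memory, and I/O near a pending deadline has to be avoided, or the deadline is lost to a page fault or a disk wait.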
Performance modeling and measurement is needed through the entire life cycle of a system: design, debugging, installation, tuning, and data collection for the next system.

+ Performance evaluation covers these areas:
    + Measurement (data)
    + Analytic modeling (e.g. stochastic process models, queuing models, etc.)
    + Simulation modeling (do it when you can't build an analytic model)
    + Tuning (adjusting the system to get better performance)
    + Design improvement (fixing it in a more basic way)
+ Good work in performance evaluation requires a good substantive understanding of the system under study. You can't just arrive with a bag of tricks (e.g. queuing theory) and do something useful.

Measurement is probably the hardest part of performance evaluation, because getting data is always hard and time-consuming, but it is important.

+ Advantage of dealing with something "real": it has all the interactions, which would tend to escape a model.
+ Disadvantage of the time and effort needed to get data (incl. the facilities needed).
+ There are not that many published measurement studies.

Ways to get data:

+ Hardware Monitoring
    + Use some sort of hardware monitor (e.g. a logic analyzer) to collect and partially reduce data.
    + Data is pulled off the system with logic probes.
    + Note that the signals have to be available (not from the middle of a chip). You may have to design in probe points.
    + Signals can be used to count events, generate traces, sample, etc.
    + Hardware monitors typically have some hard logic (e.g. counters and gates) backed up by some programmable control (e.g. a minicomputer).
    + HW monitors are difficult to get, expensive, and hard to use. They are seldom used except by vendors and large computer centers.
    + Samples should be taken at random times, not at regular intervals. The latter will fail to measure correctly events which occur at regular intervals that are multiples or submultiples of the sampling interval.
+ Software Measurements
    + Code running on the system is instrumented (e.g. modify the page fault handler to count page faults; the catch is that you have to have access to the page fault handler).
    + Can put counters or signalers in the OS code, compiled into the source code to be studied. Can sample at timer interrupts (e.g. sample the PC).
    + Note that sampling won't sample something which isn't interruptible.
    + Can use the built-in profiling facilities provided by some compilers.
    + Can instrument the microcode to collect data, if the machine is microcoded.
    + Automatic facilities like compilers and profilers can do much of this.
    + Can generate significant overhead - e.g. 20% or more (GTF).
    + Another catch: if you instrument too much, it may bring down the system performance enough that the measurements are meaningless.

END OF LECTURE

A week from Monday is the last lecture and the last midterm. See you Monday.