4. The Nachos Simulated MIPS Machine

Nachos simulates a machine with a processor that roughly approximates the MIPS architecture. In addition, an event-driven simulated clock provides a mechanism to schedule events and execute them at a later time. This is a building block for classes that simulate various hardware devices: a timer, an elevator bank, a console, a disk, and a network link.

The simulated MIPS processor can execute arbitrary programs. One simply loads instructions into the processor's memory, initializes registers (including the program counter, regPC) and then tells the processor to start executing instructions. The processor then fetches the instruction that regPC points at, decodes it, and executes it. The process is repeated indefinitely, until either an instruction causes an exception or a hardware interrupt is generated. When an exception or interrupt takes place, execution of MIPS instructions is suspended, and a Nachos interrupt service routine is invoked to deal with the condition.

Conceptually, Nachos has two modes of execution, one of which is the MIPS simulator. Nachos executes user-level processes by loading them into the simulator's memory, initializing the simulator's registers, and then running the simulator. User programs can only access the memory associated with the simulated processor. The second mode corresponds to the Nachos "kernel". The kernel executes when Nachos first starts up, or when a user program executes an instruction that causes an exception (e.g., an illegal instruction, a page fault, or a system call). In kernel mode, Nachos executes the way normal Java programs execute. That is, the statements corresponding to the Nachos source code are executed, and the memory accessed corresponds to the memory assigned to Nachos variables.

4.1. Processor Components

The Nachos/MIPS processor is implemented by the Processor class, an instance of which is created when Nachos first starts up. The Processor class exports a number of public methods and fields that the Nachos kernel accesses directly. In the following, we describe some of the important variables of the Processor class; describing their role helps explain what the simulated hardware does.

The processor provides registers and physical memory, and supports virtual memory. It provides operations to run the machine and to examine and modify its current state. The instance created at startup is available through Machine.processor(). The following aspects of the processor are accessible to the Nachos kernel:

Registers

The processor's registers are accessible through readRegister() and writeRegister(). The registers include MIPS registers 0 through 31, the low and high registers used for multiplication and division, the program counter and next program counter registers (two are necessary because of branch delay slots), a register specifying the cause of the most recent exception, and a register specifying the virtual memory address associated with the most recent exception. Recall that the stack pointer register and return address registers are general MIPS registers (specifically, they are registers 29 and 31, respectively). Recall also that r0 is always 0 and cannot be modified.
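The register rules above can be illustrated with a small stand-alone sketch. It is not the Nachos Processor class: the class name RegisterFileSketch and the indices chosen for the special registers (everything past general register 31) are assumptions made for illustration. Only the method names readRegister() and writeRegister(), the positions of the stack pointer and return address registers, and the r0 rule come from the text.

```java
// Hypothetical stand-in for the register file; NOT the Nachos Processor class.
final class RegisterFileSketch {
    // Registers 29 and 31 are given in the text; the indices from 32 up are
    // assumptions made for this sketch.
    static final int REG_SP = 29;         // stack pointer
    static final int REG_RA = 31;         // return address
    static final int REG_LO = 32;         // low result of multiply/divide
    static final int REG_HI = 33;         // high result of multiply/divide
    static final int REG_PC = 34;         // program counter
    static final int REG_NEXT_PC = 35;    // next PC, needed for branch delay slots
    static final int REG_CAUSE = 36;      // cause of the most recent exception
    static final int REG_BAD_VADDR = 37;  // faulting virtual address
    static final int NUM_REGISTERS = 38;

    private final int[] registers = new int[NUM_REGISTERS];

    int readRegister(int number) {
        checkIndex(number);
        return registers[number];
    }

    void writeRegister(int number, int value) {
        checkIndex(number);
        // r0 is hard-wired to 0, so writes to it are silently dropped.
        if (number != 0) {
            registers[number] = value;
        }
    }

    private static void checkIndex(int number) {
        if (number < 0 || number >= NUM_REGISTERS) {
            throw new IllegalArgumentException("no such register: " + number);
        }
    }

    public static void main(String[] args) {
        RegisterFileSketch regs = new RegisterFileSketch();
        regs.writeRegister(0, 99);              // ignored
        regs.writeRegister(REG_SP, 0x2000);
        regs.writeRegister(REG_PC, 0x400);
        regs.writeRegister(REG_NEXT_PC, 0x404); // instruction after the PC
        System.out.println("r0 = " + regs.readRegister(0));
        System.out.println("sp = " + regs.readRegister(REG_SP));
    }
}
```

Keeping the special registers in the same array as the general ones lets a single pair of accessors cover every register, which matches the way the text describes them.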

Physical memory

Memory is byte-addressable and organized into 1-kilobyte pages, the same size as disk sectors. A reference to the main memory array is returned by getMemory(). Memory corresponding to physical address m can be accessed in Nachos at Machine.processor().getMemory()[m]. The number of pages of physical memory is returned by getNumPhysPages().
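As a rough illustration of a byte-addressable, page-organized memory, the following stand-alone sketch stores 32-bit words in a plain byte array. The class PhysicalMemorySketch and its readWord() and writeWord() helpers are hypothetical, and the little-endian byte order is an assumption of this sketch rather than something stated above. Only the 1-kilobyte page size and the names getMemory() and getNumPhysPages() come from the text.

```java
// Hypothetical model of main memory; NOT the Nachos Processor class.
final class PhysicalMemorySketch {
    static final int PAGE_SIZE = 1024; // 1-kilobyte pages, as in the text

    private final byte[] memory;

    PhysicalMemorySketch(int numPhysPages) {
        memory = new byte[numPhysPages * PAGE_SIZE];
    }

    byte[] getMemory() {
        return memory;
    }

    int getNumPhysPages() {
        return memory.length / PAGE_SIZE;
    }

    // Stores a 32-bit word at a physical address (little-endian: assumption).
    void writeWord(int paddr, int value) {
        memory[paddr]     = (byte) value;
        memory[paddr + 1] = (byte) (value >>> 8);
        memory[paddr + 2] = (byte) (value >>> 16);
        memory[paddr + 3] = (byte) (value >>> 24);
    }

    // Loads a 32-bit word from a physical address (little-endian: assumption).
    int readWord(int paddr) {
        return (memory[paddr] & 0xFF)
             | (memory[paddr + 1] & 0xFF) << 8
             | (memory[paddr + 2] & 0xFF) << 16
             | (memory[paddr + 3] & 0xFF) << 24;
    }

    public static void main(String[] args) {
        PhysicalMemorySketch mem = new PhysicalMemorySketch(4);
        int paddr = 2 * PAGE_SIZE + 8;  // byte 8 of physical page 2
        mem.writeWord(paddr, 0x12345678);
        System.out.println("pages: " + mem.getNumPhysPages());
        System.out.println("word:  0x" + Integer.toHexString(mem.readWord(paddr)));
        System.out.println("byte:  0x" + Integer.toHexString(mem.getMemory()[paddr] & 0xFF));
    }
}
```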

Virtual memory

The processor supports virtual memory through either a single linear page table or a software-managed TLB (but not both). The mode of address translation that is actually used is determined by nachos.conf, and is reported by hasTLB(). If the processor does not have a TLB, the kernel can tell it what page table to use by calling setPageTable(). If the processor does have a TLB, the kernel can query the size of the TLB by calling getTLBSize(), and the kernel can read and write TLB entries by calling readTLBEntry() and writeTLBEntry().

Exceptions

When the processor attempts to execute an instruction and it results in an exception, the kernel exception handler is invoked. The kernel must tell the processor where this exception handler is by invoking setExceptionHandler(). If the exception resulted from a syscall instruction, it is the kernel's responsibility to advance the PC register, which it should do by calling advancePC().

At this point, we know enough about the Processor class to explain how it executes arbitrary user programs. First, we load the program's instructions into the processor's physical memory (i.e. the array returned by getMemory()). Next, we initialize the processor's page table and registers. Finally, we invoke run(), which begins the fetch-execute cycle for the processor.

run() causes the processor to enter an infinite fetch-execute loop. This method should only be called after the registers and memory have been properly initialized. Each iteration of the loop does three things:

  1. It attempts to run an instruction. This should be very familiar to students who have studied the generic 5-stage MIPS pipeline. Note that when an exception occurs, the pipeline is aborted.

    1. The 32-bit instruction is fetched from memory, by reading the word of virtual memory pointed to by the PC register. Reading virtual memory can cause an exception.

    2. The instruction is decoded by looking at its 6-bit op field and looking up the meaning of the instruction in one of three tables.

    3. The instruction is executed, and data memory reads and writes occur. An exception can occur if an arithmetic error occurs, if the instruction is invalid, if the instruction was a syscall, or if a memory operand could not be accessed.

    4. The registers are modified to reflect the completion of the instruction.

  2. It handles any exception that occurred. The cause of the exception is written to the cause register, and if the exception involved a bad virtual address, this address is written to the bad virtual address register. If a delayed load is in progress, it is completed. Finally, the kernel's exception handler is invoked.

  3. It advances the simulated clock (the clock, used to simulate interrupts, is discussed in the following section).
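The three steps of each iteration can be sketched with a toy machine. This is not the Nachos Processor class and it does not execute MIPS code: the class FetchExecuteSketch, its two made-up opcodes, the cause codes, the accumulator, and the halted flag are all assumptions made so that the example stays short and terminates. What it keeps from the description above is the shape of the loop: attempt one instruction, record the cause and call the kernel's handler on an exception, then advance the clock. As in the text, the handler is responsible for advancing the PC after a syscall.

```java
// Toy model of the fetch-execute loop; NOT the Nachos Processor class.
final class FetchExecuteSketch {
    // Hypothetical cause codes and opcodes (not MIPS encodings).
    static final int CAUSE_SYSCALL = 0;
    static final int CAUSE_ILLEGAL_INSTRUCTION = 1;
    static final int CAUSE_BAD_ADDRESS = 2;
    static final int OP_ADDI = 1;     // acc += operand
    static final int OP_SYSCALL = 2;  // always raises an exception

    static final class MachineException extends Exception {
        final int causeCode;
        MachineException(int causeCode) { this.causeCode = causeCode; }
    }

    final int[] instructions;  // toy instruction memory, one word per slot
    int pc;                    // index of the next instruction
    int acc;                   // single toy data register
    int causeRegister = -1;    // cause of the most recent exception
    long clockTicks;           // stands in for the simulated clock
    boolean halted;            // demo only: the loop described above never ends
    private Runnable exceptionHandler;

    FetchExecuteSketch(int[] instructions) {
        this.instructions = instructions;
        this.exceptionHandler = () -> halted = true;  // default: stop on any exception
    }

    void setExceptionHandler(Runnable handler) { exceptionHandler = handler; }

    void advancePC() { pc++; }

    static int encode(int op, int operand) { return (op << 16) | (operand & 0xFFFF); }

    void run() {
        while (!halted) {
            try {
                int instruction = fetch();   // step 1: attempt one instruction
                execute(instruction);
            } catch (MachineException e) {   // step 2: handle the exception
                causeRegister = e.causeCode;
                exceptionHandler.run();      // an ordinary method call into the kernel
            }
            clockTicks++;                    // step 3: advance the clock
        }
    }

    private int fetch() throws MachineException {
        if (pc < 0 || pc >= instructions.length) {
            throw new MachineException(CAUSE_BAD_ADDRESS);
        }
        return instructions[pc];
    }

    private void execute(int instruction) throws MachineException {
        int op = instruction >>> 16;
        int operand = instruction & 0xFFFF;
        switch (op) {
            case OP_ADDI:
                acc += operand;
                pc++;  // registers updated only once the instruction completes
                break;
            case OP_SYSCALL:
                throw new MachineException(CAUSE_SYSCALL);  // PC is NOT advanced
            default:
                throw new MachineException(CAUSE_ILLEGAL_INSTRUCTION);
        }
    }

    // Runs a four-instruction program: addi 5, syscall, addi 7, <illegal>.
    static FetchExecuteSketch runDemo() {
        FetchExecuteSketch m = new FetchExecuteSketch(new int[] {
            encode(OP_ADDI, 5), encode(OP_SYSCALL, 0), encode(OP_ADDI, 7), 0
        });
        m.setExceptionHandler(() -> {
            if (m.causeRegister == CAUSE_SYSCALL) {
                m.advancePC();    // the kernel must step past the syscall itself
            } else {
                m.halted = true;  // demo policy: stop on any other exception
            }
        });
        m.run();
        return m;
    }

    public static void main(String[] args) {
        FetchExecuteSketch m = runDemo();
        System.out.println("acc=" + m.acc + " cause=" + m.causeRegister
                + " ticks=" + m.clockTicks);
    }
}
```

If the handler did not call advancePC() after the syscall, the same instruction would be fetched again on the next iteration and the loop would raise the same exception forever.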

Note that from a user-level process's perspective, exceptions take place in the same way as if the program were executing on a bare machine; an exception handler is invoked to deal with the problem. However, from our perspective, the kernel's exception handler is actually called via a normal procedure call by the simulated processor.

The processor provides three methods we have not discussed yet: makeAddress(), offsetFromAddress(), and pageFromAddress(). These are utility procedures that help the kernel go between virtual addresses and virtual-page/offset pairs.
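To make the role of these helpers concrete, here is a minimal stand-alone sketch of what such conversions might compute for 1-kilobyte pages. The method names are taken from the text, but the bodies and the class name AddressMathSketch are assumptions; the actual Processor methods may be written differently.

```java
// Hypothetical re-implementation of the address helpers for 1 KB pages.
final class AddressMathSketch {
    static final int PAGE_SIZE = 1024;  // 2^10 bytes per page
    static final int OFFSET_BITS = 10;  // log2(PAGE_SIZE)

    // Combines a page number and an offset within the page into one address.
    static int makeAddress(int page, int offset) {
        return (page << OFFSET_BITS) | offset;
    }

    // Extracts the page number; >>> treats the address as unsigned.
    static int pageFromAddress(int address) {
        return address >>> OFFSET_BITS;
    }

    // Extracts the offset within the page (the low 10 bits).
    static int offsetFromAddress(int address) {
        return address & (PAGE_SIZE - 1);
    }

    public static void main(String[] args) {
        int address = 0x1234;  // 4660 = 4 * 1024 + 564
        int page = pageFromAddress(address);
        int offset = offsetFromAddress(address);
        System.out.println("page=" + page + " offset=0x" + Integer.toHexString(offset));
        System.out.println("round trip ok: " + (makeAddress(page, offset) == address));
    }
}
```

The unsigned shift matters because Java ints are signed: an address with its top bit set would otherwise yield a negative page number.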

4.2. Address Translation

The simulated processor supports one of two address translation modes: linear page tables, or a software-managed TLB. While the former is simpler to program, the latter more closely corresponds to what current machines support.

In both cases, when translating an address, the processor breaks the 32-bit virtual address into a virtual page number (VPN) and a page offset. Since the processor's page size is 1KB, the offset is 10 bits wide and the VPN is 22 bits wide. The processor then translates the virtual page number into a translation entry.

Each translation entry (see the TranslationEntry class) contains six fields: a valid bit, a read-only bit, a used bit, a dirty bit, a 22-bit VPN, and a 22-bit physical page number (PPN). The valid bit and read-only bit are set by the kernel and read by the processor. The used and dirty bits are set by the processor, and read and cleared by the kernel.

4.2.1. Linear Page Tables

When in linear page table mode, the processor uses the VPN to index into an array of translation entries. This array is specified by calling setPageTable(). If, in translating a VPN, the VPN is greater than or equal to the length of the page table, or the VPN is within range but the corresponding translation entry's valid bit is clear, then a page fault occurs.

In general, each user process will have its own private page table. Thus, each process switch requires calling setPageTable(). On a real machine, the page table pointer would be stored in a special processor register.
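The lookup described in this subsection can be sketched as follows. The code is a stand-alone illustration, not the Nachos implementation: the nested Entry class is a simplified stand-in for TranslationEntry, TranslationFault is a hypothetical unchecked exception standing in for the processor's page fault, and rejecting writes to read-only pages is an assumption about how the read-only bit is enforced.

```java
// Hypothetical linear page table lookup; NOT the Nachos Processor class.
final class LinearPageTableSketch {
    static final int PAGE_SIZE = 1024;

    // Simplified stand-in for a translation entry (six fields, as in the text).
    static final class Entry {
        int vpn, ppn;
        boolean valid, readOnly, used, dirty;
        Entry(int vpn, int ppn, boolean valid, boolean readOnly) {
            this.vpn = vpn;
            this.ppn = ppn;
            this.valid = valid;
            this.readOnly = readOnly;
        }
    }

    // Stand-in for the exception the processor would raise to the kernel.
    static final class TranslationFault extends RuntimeException {
        TranslationFault(String message) { super(message); }
    }

    private Entry[] pageTable = new Entry[0];

    // Called on every process switch, since each process has its own table.
    void setPageTable(Entry[] table) { pageTable = table; }

    int translate(int vaddr, boolean writing) {
        int vpn = vaddr >>> 10;               // upper 22 bits
        int offset = vaddr & (PAGE_SIZE - 1); // lower 10 bits
        // Page fault: VPN past the end of the table, or valid bit clear.
        if (vpn >= pageTable.length || !pageTable[vpn].valid) {
            throw new TranslationFault("page fault at vpn " + vpn);
        }
        Entry entry = pageTable[vpn];
        if (writing && entry.readOnly) {      // assumption: writes are rejected
            throw new TranslationFault("write to read-only vpn " + vpn);
        }
        entry.used = true;                    // set by the processor,
        if (writing) entry.dirty = true;      // read and cleared by the kernel
        return entry.ppn * PAGE_SIZE + offset;
    }

    public static void main(String[] args) {
        LinearPageTableSketch mmu = new LinearPageTableSketch();
        mmu.setPageTable(new Entry[] {
            new Entry(0, 3, true, false),   // vpn 0 -> ppn 3
            new Entry(1, 0, false, false),  // vpn 1 not resident
        });
        System.out.println("paddr = " + mmu.translate(0x10, false));
        try {
            mmu.translate(PAGE_SIZE, false);
        } catch (TranslationFault e) {
            System.out.println(e.getMessage());
        }
    }
}
```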

4.2.2. Software-Managed TLB

When in TLB mode, the processor maintains a small array of translation entries that the kernel can read and write using readTLBEntry() and writeTLBEntry(). On each address translation, the processor searches the entire TLB for the first entry whose VPN matches.
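A minimal sketch of such a search is shown below. It is a stand-alone illustration rather than the Nachos code: the Entry class is a simplified stand-in for TranslationEntry, the requirement that a matching entry also be valid is an assumption, and TlbMiss is a hypothetical unchecked exception representing whatever the processor raises when no entry matches. The demo also shows one plausible kernel response to a miss, which is to write a new entry and retry.

```java
// Hypothetical software-managed TLB; NOT the Nachos Processor class.
final class SoftwareTlbSketch {
    static final int PAGE_SIZE = 1024;

    // Simplified stand-in for a translation entry.
    static final class Entry {
        final int vpn, ppn;
        final boolean valid;
        Entry(int vpn, int ppn, boolean valid) {
            this.vpn = vpn;
            this.ppn = ppn;
            this.valid = valid;
        }
    }

    // Stand-in for the exception raised when no TLB entry matches.
    static final class TlbMiss extends RuntimeException {
        final int vpn;
        TlbMiss(int vpn) {
            super("TLB miss for vpn " + vpn);
            this.vpn = vpn;
        }
    }

    private final Entry[] tlb;

    SoftwareTlbSketch(int size) {
        tlb = new Entry[size];
        for (int i = 0; i < size; i++) {
            tlb[i] = new Entry(0, 0, false);  // start with every slot invalid
        }
    }

    int getTLBSize() { return tlb.length; }

    Entry readTLBEntry(int index) { return tlb[index]; }

    void writeTLBEntry(int index, Entry entry) { tlb[index] = entry; }

    int translate(int vaddr) {
        int vpn = vaddr >>> 10;
        int offset = vaddr & (PAGE_SIZE - 1);
        // Linear search: the first matching entry wins.
        for (Entry entry : tlb) {
            if (entry.valid && entry.vpn == vpn) {  // valid check: assumption
                return entry.ppn * PAGE_SIZE + offset;
            }
        }
        throw new TlbMiss(vpn);
    }

    public static void main(String[] args) {
        SoftwareTlbSketch mmu = new SoftwareTlbSketch(4);
        int vaddr = 7 * PAGE_SIZE + 5;
        try {
            mmu.translate(vaddr);
        } catch (TlbMiss miss) {
            // Kernel side: install a mapping for the missing page, then retry.
            mmu.writeTLBEntry(0, new Entry(miss.vpn, 1, true));
        }
        System.out.println("paddr = " + mmu.translate(vaddr));
    }
}
```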