This project is due May 11 at 11:59pm. Submit your solution, files named proj4a.c and proj4b.c, as proj4.
You may work in a partnership of 2. The number of grace hours available to a partnership is the average of the grace hours unused by the individual partners. Submit one solution per partnership.
Copy the directory ~cs61cl/code/proj4 to your home directory and name it "proj4". It includes several files: sim.c, computer.h, computer.c, memory.c, memory.h, bool.h, proj4a.c, proj4a.h, proj4b.c, proj4b.h, sample.s, sample.dump, sample.output, and makefile. These are described below.
In this project, you will create an instruction interpreter for a subset of MIPS code. It will fetch, disassemble, decode, and execute MIPS machine instructions, and keep track of the state of the cache. The result will be a miniature version of MARS (without the MARS assembler).
The files sim.c (the top-level file), computer.h, computer.c, proj4a.c and proj4a.h comprise a framework for a MIPS simulator. Complete this part of the program by adding code to proj4a.c, specifically the disassembled and simulateInstr functions. Your simulator must be able to simulate the machine code versions of the following MIPS machine instructions:
addu Rdest, Rsrc1, Rsrc2
addiu Rdest, Rsrc1, imm
subu Rdest, Rsrc1, Rsrc2
sll Rdest, Rsrc, shamt
srl Rdest, Rsrc, shamt
and Rdest, Rsrc1, Rsrc2
andi Rdest, Rsrc, imm
or Rdest, Rsrc1, Rsrc2
ori Rdest, Rsrc, imm
lui Rdest, imm
slt Rdest, Rsrc1, Rsrc2
beq Rsrc1, Rsrc2, raddr
bne Rsrc1, Rsrc2, raddr
j address
jal address
jr Rsrc
lw Rdest, offset (Radd)
sw Rsrc, offset (Radd)
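Every instruction in this list is a single 32-bit word in the standard MIPS R, I, or J format, so a natural first step is to pull the bit fields out of the word. The following is a minimal sketch of that step, not part of the framework; the Fields struct and the decode_fields name are hypothetical and chosen only for illustration.

```c
#include <stdint.h>

/* Hypothetical container for the bit fields of one MIPS instruction word.
 * Not every field is meaningful for every format. */
typedef struct {
    uint32_t opcode;  /* bits 31..26, all formats */
    uint32_t rs;      /* bits 25..21, R and I formats */
    uint32_t rt;      /* bits 20..16, R and I formats */
    uint32_t rd;      /* bits 15..11, R format */
    uint32_t shamt;   /* bits 10..6,  R format (sll, srl) */
    uint32_t funct;   /* bits 5..0,   R format */
    uint32_t imm;     /* bits 15..0,  I format, not yet sign-extended */
    uint32_t target;  /* bits 25..0,  J format */
} Fields;

/* Split a 32-bit instruction word into its fields with shifts and masks. */
static Fields decode_fields(uint32_t instr)
{
    Fields f;
    f.opcode = (instr >> 26) & 0x3f;
    f.rs     = (instr >> 21) & 0x1f;
    f.rt     = (instr >> 16) & 0x1f;
    f.rd     = (instr >> 11) & 0x1f;
    f.shamt  = (instr >> 6)  & 0x1f;
    f.funct  = instr & 0x3f;
    f.imm    = instr & 0xffff;
    f.target = instr & 0x03ffffff;
    return f;
}
```

For example, the word 0x2401fffe decodes to opcode 9 (addiu), rs 0, rt 1, and immediate 0xfffe, which is the instruction addiu $1, $0, -2.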
Once complete, your solution program will be able to simulate real programs that do just about anything that can be done on a real MIPS, with the notable exceptions of floating-point math and interrupts.
The files memory.c, memory.h, proj4b.c, and proj4b.h provide a framework for implementing a memory system. In the second part of this assignment, you complete the implementation of the cache, specifically by completing the newCache, cacheContains, cacheContents, updateCacheContents, and addCacheEntry functions.
Do not change the framework code or add any more source files; just fill in the frameworks where indicated. You may provide additional helper functions.
The file sim.c contains the main function. It parses the command-line options, setting flags that govern how the program interacts with the user. It then calls newComputer in computer.c to create a new simulated MIPS. The newComputer function does the following:
It reads the machine code into "memory", starting at "address" 0x00400000. (In keeping with the MARS convention, addresses from 0x00000000 to 0x00400000 are unused.) We assume that the program will be no more than 1024 words long. The name of the file that contains the code is given as a command-line argument.
It initializes the stack pointer to 0x00404000, it initializes all other registers to 0x00000000, and it initializes the program counter to 0x00400000.
It provides simulated data memory starting at address 0x00401000 and ending at address 0x00404000. It stores instructions together with data in the same memory array.
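Because instructions and data share one array, a simulated address has to be translated into an array index. The sketch below shows one way that translation could look; it is an illustration rather than the framework's code. The constant and function names are hypothetical, and treating 0x00404000 as one past the last valid word is an assumption that you should check against computer.c.

```c
#include <stdint.h>

/* Address layout taken from the description above. */
#define TEXT_BASE 0x00400000u  /* first instruction word */
#define DATA_BASE 0x00401000u  /* first data word */
#define MEM_END   0x00404000u  /* assumed to be one past the last valid word */
#define MEM_WORDS ((MEM_END - TEXT_BASE) / 4)  /* 4096 words in total */

/* Hypothetical helper: map a simulated byte address to an index into a
 * word array of MEM_WORDS entries. Returns -1 for an address that is
 * out of range or not word-aligned. */
static int word_index(uint32_t addr)
{
    if (addr < TEXT_BASE || addr >= MEM_END || (addr & 3u) != 0)
        return -1;
    return (int)((addr - TEXT_BASE) / 4);
}
```

With this layout the first instruction lands at index 0 and the first data word at index 1024, which matches the assumption that a program is at most 1024 words long.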
On return from newComputer, the main function calls simulate in computer.c, entering a loop that repeatedly fetches and executes instructions, printing information as it goes:
the machine instruction being executed, along with its address and disassembled form;
the new value of the program counter;
information about the current state of the registers;
information about the contents of memory.
The framework code supports several command line options:
-i | runs the program in "interactive mode". In this mode, the program prints a ">" prompt and waits for you to type a return before simulating each instruction. If you type a "q" (for "quit") followed by a return, the program exits. If this option isn't specified, the only way to terminate the program is to have it simulate an instruction that's not one of those listed above. |
-r | prints all registers after the execution of an instruction. If this option isn't specified, only the register that was affected by the instruction should be printed; for a branch, a jump, or a store, which don't affect any registers, the framework code prints a message saying that no registers were affected. (Your code needs to signal when a simulated instruction doesn't affect any registers; see "Details of simulation" below.) |
-m | prints all data memory locations that contain nonzero values after the execution of an instruction. If this option isn't specified, only the memory location that was affected by the instruction should be printed; for any instruction that's not sw, the framework code prints a message saying that no memory locations were affected. (Your code needs to signal when a simulated instruction doesn't affect memory; see "Details of simulation" below.) |
-c | prints the contents of the cache after each instruction. |
-aarg | sets the cache associativity to arg. The default is 1 (i.e., direct-mapped); arg must be a power of 2. |
-sarg | sets the number of cache sets to arg. The default is 4 (i.e. four cache sets); arg must be a power of 2. Each set will have as many entries as the associativity of the cache. |
-barg | sets the cache block size to arg words. If this option is unspecified, the block size is set to 2; arg must be a power of 2. |
-d | is a debugging flag that you might find useful. Any output your program produces other than what is specified in this document should be governed by the -d option. |
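The -s and -b options together determine how a byte address is divided into a tag, a set number, and an offset within the block. The following sketch shows the usual arithmetic under the assumption that words are 4 bytes and that both values are powers of 2; the CacheAddr struct and split_address function are hypothetical names, and the framework in proj4b.h may represent these values differently.

```c
#include <stdint.h>

/* Hypothetical result type: the three parts of a cached address. */
typedef struct {
    uint32_t tag;     /* identifies which block occupies an entry */
    uint32_t set;     /* which cache set the block maps to */
    uint32_t offset;  /* byte offset within the block */
} CacheAddr;

/* Split a byte address given the number of sets (-s) and the block
 * size in words (-b). Both are assumed to be powers of 2. */
static CacheAddr split_address(uint32_t addr, uint32_t num_sets,
                               uint32_t block_words)
{
    uint32_t block_bytes = block_words * 4;  /* 4 bytes per word */
    CacheAddr c;
    c.offset = addr % block_bytes;
    c.set    = (addr / block_bytes) % num_sets;
    c.tag    = addr / (block_bytes * num_sets);
    return c;
}
```

With the defaults (4 sets, 2-word blocks), address 0x00401004 falls at offset 4 of a block in set 0, while 0x00401018 falls at offset 0 of a block in set 3.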
Part of your task is to complete the simulateInstr function (perhaps adding auxiliary functions), which simulates the execution of a MIPS instruction. Your simulation of the instructions specified previously should basically mimic their behavior in MIPS and MARS. You may assume that the simulated instructions will not address instruction memory illegally. If the program encounters an instruction that's not one of the ones listed above, it should quit.
The simulateInstr function takes arguments changedReg and changedMem that govern the output of the simulation. You need to return appropriate values in these arguments.
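Mimicking MIPS behavior mostly comes down to getting a few details of the arithmetic right: addiu sign-extends its immediate, andi and ori zero-extend theirs, lui shifts the immediate into the upper half, and slt compares its operands as signed values. The helpers below are a sketch of those rules only; their names are hypothetical, and how results are written back and reported through changedReg and changedMem is left to the framework.

```c
#include <stdint.h>

/* Sign-extend the low 16 bits of imm to a 32-bit signed value. */
static int32_t sign_extend16(uint32_t imm)
{
    return (int32_t)((imm & 0xffff) ^ 0x8000) - 0x8000;
}

/* addiu: add the sign-extended immediate; overflow wraps silently. */
static uint32_t exec_addiu(uint32_t rs_val, uint32_t imm)
{
    return rs_val + (uint32_t)sign_extend16(imm);
}

/* ori: OR with the zero-extended immediate. */
static uint32_t exec_ori(uint32_t rs_val, uint32_t imm)
{
    return rs_val | (imm & 0xffff);
}

/* lui: place the immediate in the upper 16 bits, clearing the lower 16. */
static uint32_t exec_lui(uint32_t imm)
{
    return (imm & 0xffff) << 16;
}

/* slt: signed comparison, producing 1 or 0. */
static uint32_t exec_slt(uint32_t rs_val, uint32_t rt_val)
{
    return ((int32_t)rs_val < (int32_t)rt_val) ? 1u : 0u;
}
```

For instance, exec_lui(0x0040) followed by exec_ori with 0x1000 yields 0x00401000, the first data address.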
Another part of the project is to provide the code for the disassembled function, plus any auxiliary functions that are appropriate. You may assume that the simulated instructions will not address instruction memory illegally. If the program encounters an instruction that's not one of the ones just listed, it should quit.
Almost all the printing in the framework code is done in the printInfo function. Also printed is the output of the disassembled function, one of the functions you are to complete, which requires some special attention. Although it just prints the instruction and its operands as text, we will be grading your project with automated scripts. Therefore, the output must follow this part of the specification exactly. Here are the details on the output format:
The disassembled instruction must have the instruction name followed by a "tab" character (in C, this character is '\t'), followed by a comma-and-space separated list of the operands.
For addiu, srl, sll, lw and sw, the immediate value must be printed as a decimal number (with the negative sign, if required) with no leading zeroes unless the value is exactly zero (printed as 0).
For andi, ori, and lui, the immediate must be printed in hex, with a leading 0x and no leading zeroes unless the value is exactly zero (which is printed as 0x0).
For the branch and jump instructions, the target must be printed as a full 8-digit hex number, even if it has leading zeroes. (Note the difference between this format and the branch and jump assembly language instructions that you write.) Finally, the target of the branch or jump should be printed as an absolute address, rather than being PC relative.
All hex values must use lower-case letters and have the leading 0x.
Argument fields must be separated by a comma followed by a single space.
Registers must be identified by number, with no leading zeroes (e.g. $10 and $3) and not by name (e.g. $t2).
As an example, for a store-byte instruction you might return "sb\t$10, -4($21)".
If a lw or sw instruction accesses an invalid memory address, your code must print exactly the same error message as in the contents() function in computer.c and then call exit(0).
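Printing branch and jump targets as absolute addresses means converting the encoded field back into an address first. The sketch below uses the standard MIPS rules: a branch target is the address of the following instruction plus the sign-extended immediate times 4, and a jump target combines the top 4 bits of that address with the 26-bit field times 4. The function names are hypothetical.

```c
#include <stdint.h>
#include <stdio.h>

/* Absolute target of beq/bne located at address pc, given the raw
 * 16-bit immediate field. */
static uint32_t branch_target(uint32_t pc, uint32_t imm)
{
    int32_t offset = (int32_t)((imm & 0xffff) ^ 0x8000) - 0x8000;
    return pc + 4 + ((uint32_t)offset << 2);
}

/* Absolute target of j/jal located at address pc, given the raw
 * 26-bit target field. */
static uint32_t jump_target(uint32_t pc, uint32_t target26)
{
    return ((pc + 4) & 0xf0000000u) | ((target26 & 0x03ffffffu) << 2);
}

/* Write addr as a full 8-digit lower-case hex number with a leading 0x.
 * Returns the value returned by snprintf. */
static int format_target(char *buf, size_t size, uint32_t addr)
{
    return snprintf(buf, size, "0x%08x", (unsigned)addr);
}
```

A j instruction at 0x00400000 whose target field is 0x0010000b therefore prints as 0x0040002c.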
Here are examples of good instructions:
addiu $1, $0, -2
lw $1, 8($3)
srl $6, $7, 3
ori $1, $1, 0x1234
lui $10, 0x5678
j 0x0040002c
bne $3, $4, 0x00400044
Here are examples of bad instructions:
addiu $1, $0, 0xffffffff # shouldn't print hex for addiu
sw $1, 0x8($3) # shouldn't print hex for sw
sll $a1, $a0, 3 # should use reg numbers instead of names
srl $6,$7,3 # no spaces between arguments
ori $1 $1 0x1234 # forgot commas
lui $t0, 0x0000ABCD # hex should be lowercase and not zero extended
j 54345 # address should be in hex
jal 00400548 # forgot the leading 0x
bne $3, $4, 4 # needs full target address in hex
disassembled must call exit(0) if an unsupported operation is detected.
Note that the result returned by disassembled is freed in the simulate function. Therefore disassembled must call malloc to reserve memory for this space.
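Since the caller frees the returned string, each disassembled line has to live in heap memory. The following is a small sketch of that pattern for two of the immediate formats described earlier (decimal for addiu, hex for ori); the helper names and the 32-byte buffer size are assumptions made for illustration, not part of the framework.

```c
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical buffer size, comfortably larger than any line built here. */
#define DISASM_LEN 32

/* Build "addiu\t$rt, $rs, imm" with imm in signed decimal.
 * The caller owns the returned buffer and must free it. */
static char *format_addiu(unsigned rt, unsigned rs, int32_t imm)
{
    char *out = malloc(DISASM_LEN);
    if (out == NULL)
        return NULL;
    snprintf(out, DISASM_LEN, "addiu\t$%u, $%u, %d", rt, rs, (int)imm);
    return out;
}

/* Build "ori\t$rt, $rs, 0ximm" with imm in lower-case hex and no
 * leading zeroes; "%x" prints a zero value as 0, giving 0x0. */
static char *format_ori(unsigned rt, unsigned rs, uint32_t imm)
{
    char *out = malloc(DISASM_LEN);
    if (out == NULL)
        return NULL;
    snprintf(out, DISASM_LEN, "ori\t$%u, $%u, 0x%x", rt, rs,
             (unsigned)(imm & 0xffff));
    return out;
}
```

Using snprintf rather than sprintf keeps an unexpectedly long line from overrunning the buffer.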
Finally, you are to complete functions that represent operation of the cache. Some requirements:
cacheContains(), cacheContents(), updateCacheContents() and addCacheEntry() should increment accessCount exactly once.
cacheContains() should not increment the access count on a cache miss.
The lastUsedTime of a cache block should be set to the accessCount after it has been incremented.
... newCache ... .valid bit should change, not the data).
... addEntry() must select one to replace with a valid entry, it should always select the first one (lowest index within the set).
The files sample.s and sample.output in the proj4 directory provide an example output that you may use for a "sanity check". We do not include any other test input files for this project. You must write the test cases in MIPS, use MARS to assemble them, and then dump the binary code.
makefile is a script that builds an executable from a bunch of source and object files. This makefile is provided for your convenience. You do not need to modify it. Simply type "gmake" at the shell prompt to build your simulator.
A first step in testing is to write test code in MIPS. Here are some important guidelines to consider as you write your MIPS test code:
Our simulator starts program memory at 0x00400000, data memory at 0x00401000, and the stack at 0x00404000. However, MARS initializes the stack pointer at 0x7fffeffc. When writing test programs, make certain that this difference will not create any problems.
MARS places anything that follows the .data assembler directive sequentially in memory. However, this will not be reflected in the binary file that MARS dumps. That dump file only contains instructions. Therefore, instead of depending on MARS to load data memory for you, you should use instructions.
For example, suppose you want to write a MIPS program that uses an array of 5 words called foo which is initialized with the integers 1, ..., 5. Normally, you would write something like:
.data
foo: .word 1,2,3,4,5
For this project, you should not use the .data section. Instead you should have your program initialize the array:
__start:
lui $t0, 0x0040 # $t0 = 0x00400000
ori $t0, $t0, 0x1000 # $t0 = 0x00401000, the start of data memory
addiu $t1, $0, 1
sw $t1, 0($t0)
addiu $t1, $0, 2
sw $t1, 4($t0)
addiu $t1, $0, 3
sw $t1, 8($t0)
addiu $t1, $0, 4
sw $t1, 12($t0)
addiu $t1, $0, 5
sw $t1, 16($t0)
You could also do this in a loop. This may seem a bit tedious and time consuming but it greatly simplifies the simulator.
Know what to expect of your tests. For example, if your test program copies a bunch of data from one region of memory to another, you should know what memory is supposed to look like after your program finishes. The command line arguments -r, -m, and -c will be useful for generating output that helps provide evidence of your simulation's correctness.
The procedure for generating input files for the simulator is straightforward. Here are the steps:
Write your test program in MIPS. Make sure you only use supported instructions, and avoid assembler directives that set up .data memory. Suppose your test file is named test0.s. Terminate your program with a single unsupported instruction (e.g., slti). As mentioned previously, attempting to execute an unsupported instruction will cause the simulator to quit.
Debug your MIPS code with MARS.
Dump the code from MARS using the Dump Memory selection from the File menu. Be sure to select the .text segment in the dropdown before doing this, or you might accidentally dump the data memory. You should also select the Binary dump format for this project. MARS will dump the binary instructions into a file of your choosing.
Now test0.dump is a valid input to the simulator. We recommend that you save all of your .s and .dump test files in an orderly fashion. This way, when you think you are done with the simulator, you will have a comprehensive battery of tests to put it through before submitting.
To view a binary dump file in emacs, open it like you normally would and type M-x hexl-mode.