CS61C Summer 2013 Lab 5 - Linking, Floating Point, and Caches

Goals

Reading

Setup

Note 1: In this lab we use the commands mips-gcc2, mips-objdump2, and mips-readelf2, which will only run on x86 Linux machines. As a result, you must be on a machine in the hive lab. If you are in the 200SDH (Orchard) lab or are working remotely, you MUST ssh into one of the hive machines (replace [NUMBER] with a value between 1 and 28):

$ ssh cs61c-XX@hive[NUMBER].cs.berkeley.edu

Note 2: The "2"s at the end of the command names are necessary and are not typos. You will NOT be able to complete the lab using mips-gcc, mips-objdump, and mips-readelf. You MUST use mips-gcc2, mips-objdump2, and mips-readelf2.

Copy the lab files with

$ mkdir ~/lab05 
$ cp ~cs61c/labs/05/* ~/lab05/. 

Background Information

From class, we know that the toolchain executes the following steps to make your executable:

  1. Compile your source files into assembly (compiler)
  2. Assemble the assembly files into object files (assembler)
  3. Combine your object files into an executable (linker)

This section is just for reference. You do not need to execute any commands until Exercise 1 (there is no input.c or input.s file).

Compiler

To create the assembly files which result from the first stage (compilation), you can execute the following:

$ mips-gcc2 -S input.c

Please note the capital S. This will create a file called input.s which contains the assembly for input.c.

Assembler

Assemblers output object files. Object files are binary files that contain the machine code and data that correspond to the assembly code that you (or a compiler) wrote. Several object files can be linked together to form an executable file. Object files and executable files can come in several different formats. The most common one is the Executable and Linkable Format (ELF).

To create the object files which result from the second stage (assembly), you can execute either of the following:

$ mips-gcc2 -c input.c
# OR
$ mips-gcc2 -c input.s

Either command will create an object file called input.o.

Linking

To create the final executable from multiple object files, you can simply pass the object filenames to mips-gcc2 like this:

$ mips-gcc2 input1.o input2.o -o finalexe

This creates an executable called 'finalexe' that results from linking the two given object files.

Object Dump

The object files are binary files and not readable in an editor, so we need to make use of another utility to view the contents. To view the contents of an object file, you can use mips-objdump2 as follows:

$ mips-objdump2 -x -d input.o > input.o.dump

The -x option tells mips-objdump2 to print information about all of the sections and the -d option tells it to disassemble the instructions in the .text section. Open the .dump files in your favorite text editor and look them over. There will be many fields that you do not understand but you should recognize certain things. In particular, the part that begins with the label "Sections:" tells you the section name, size of the section, virtual memory address of the section (VMA), and some interesting flags (CONTENTS, ALLOC, etc.).

Additionally, mips-readelf2 can print out some of the information in the object file in a more human-friendly format. You can do this using:

$ mips-readelf2 -a input.o > input.o.readelf

The file command

The name of a file usually gives you a good idea of its type. For example, a file named input.c is usually a C program, while a file named input.s is usually an assembly program. But what about a file named input.o? The Unix command file can report the type of a file. For example, the following command tells you the type of input.o:

$ file input.o

Exercises

Exercise 1: Assembling and Linking

Part 1: Assembling

When object files are created, the absolute addresses of functions and data are not yet known. Instead, the leftmost column of the disassembly shows relative addresses. Create the file stack.o with the command:

$ mips-gcc2 -c stack.c

Then create an object dump of the resulting stack.o by:

$ mips-objdump2 -x -d stack.o > stack.o.dump

Inspect the file stack.o.dump in your favorite text editor. From class, we know that all pseudo-instructions are replaced in the assembly step, which we just completed. However, you should notice something interesting about pseudo-instructions in the mips-objdump2 output. What is objdump doing? (Hint: decode one of the instructions by hand, for example the move pseudo-instruction.)

What are the addresses of the functions IsEmpty, Push, and Pop in stack.o? Are these addresses relative or absolute addresses? Why?

Create teststack.o and an object dump of it by following the same steps as above. Look at teststack.o.dump. The main function calls several functions that are defined in the stack.o object file. What instruction does it use to call the functions found in stack.o? What is the significance of the line following each of these function calls (the line containing the word R_MIPS_26)? (Hint: it is used in the linking process.)

Part 2: Linking

Finally, let's go ahead and link our program into a single executable. As indicated previously, we can execute the following to achieve this:

$ mips-gcc2 stack.o teststack.o -o linked.o

linked.o is the resulting file after linking the two files mentioned above. Because linked.o does not have any unresolved references, we call it an executable. Like the other object files, opening it in a text editor is not very revealing. Instead, dump it using mips-objdump2 and open the resulting file. Note that there are MANY symbols in this file that we did not define. These are all linked in by default and you may ignore these symbols.

What are the addresses of the functions IsEmpty, Push, and Pop in the executable? Are these addresses relative or absolute addresses? How do you know?

What is the address of the structure ourStack in the final executable? What section (.data, .text, or another section) was the global variable ourStack placed in?

Check-off

From this point on in the lab, we will be working primarily in MARS. If you are in the 200SDH lab and are ssh'd into a hive machine, be sure that you are running MARS locally (not on the hive machine).

Exercise 2: Floating Point

This exercise uses fp.s. This is likely the longest MIPS code you have seen thus far, but don't fret! Most of it is just print formatting. You actually only need to change two numbers at the top of the code (lines 3 and 5).

But as usual, we encourage you to poke around and see what's going on in this code. We're dealing with floating point arithmetic, which uses a whole different set of instructions (ending with .s for single-precision) and registers (found in the "Coproc1" tab on the right).

Part 1: Roundoff

Find ANY positive floating point value x for which x + 1.0 = x due to roundoff. Verify your result in fp.s. Now find the SMALLEST positive floating point value x for which x + 1.0 = x. Determine the stored exponent and fraction for x (either on the computer or on paper).
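If you want to sanity-check a candidate x before editing fp.s, a minimal C sketch like the one below can help. Everything in it is our own illustration (the doubling search and the printout are not part of the lab files), and fp.s remains the place where you must actually verify your answer.

#include <stdio.h>

/* Doubling search for a positive float x with x + 1.0f == x.
 * This is a host-side sanity check only; the lab asks you to verify
 * your value in fp.s using the MIPS single-precision instructions. */
int main(void) {
    float x = 1.0f;
    while ((float)(x + 1.0f) != x) {
        x *= 2.0f;   /* stepping through powers of two keeps the fraction field simple */
    }
    printf("found x = %.1f (here x + 1.0f == x)\n", x);
    /* This finds the smallest power of two with the property, which is not
     * automatically the smallest float overall -- that argument is left to you. */
    return 0;
}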

Part 2: Associativity

Using what you have learned from the last part, determine three positive floating point values (not necessarily distinct) such that adding these numbers in a different order can yield a different result. Write some code in fp.s to show this (print out 2 different sums computed from the same 3 floating point numbers). This shows that for three floating point numbers a, b, and c, a+b+c does not necessarily equal c+b+a.

(Hint: Experiment with adding up different amounts of the x value you determined in part 1, and the value 1.0).
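As a concrete illustration of the hint, here is a small host-side C sketch (again our own illustration, not part of fp.s) that adds the same three values in two different orders; your actual check-off code still needs to be written in fp.s.

#include <stdio.h>

/* Demonstrates that single-precision addition is not associative.
 * x is found with the same doubling search as in the Part 1 sketch. */
int main(void) {
    float x = 1.0f;
    while ((float)(x + 1.0f) != x) {
        x *= 2.0f;
    }

    float t = x + 1.0f;               /* assigning to a float rounds to single precision */
    float sum1 = t + 1.0f;            /* each 1.0f is absorbed by x separately */
    float sum2 = (1.0f + 1.0f) + x;   /* the two 1.0f's combine to 2.0f first */

    printf("sum1 = %.1f\nsum2 = %.1f\n", sum1, sum2);   /* the two sums differ */
    return 0;
}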

Check-off

Exercise 3: Cache Visualization

Caches are typically one of the hardest topics for students in 61C to grasp at first. This exercise will use some cool cache visualization tools in MARS to get you more familiar with cache behavior and performance terminology, with the help of the files cache.s and wackycache.s. At this point, read through cache.s to get a rough idea of what the program does.

To get started with each of the scenarios below:

  1. Open the designated file in MARS. (cache.s or wackycache.s)
  2. In the code for cache.s or wackycache.s, set the appropriate Program Parameters as indicated at the beginning of each scenario. (by changing the immediates of the commented li instructions in main)
  3. Run Tools-->Data Cache Simulator.
  4. Set the appropriate Cache Parameters as indicated at the beginning of each scenario.
  5. Enable the Runtime Log and then click "Connect to MIPS".
  6. Run Tools-->Memory Reference Visualizer and click "Connect to MIPS".
  7. As you execute code in MARS, any DATA memory access (load or store) will show up in these tools (instruction fetches are not shown).

The Data Cache Simulator will show the state of your data cache and the Memory Reference Visualization will show you what parts of memory you are accessing and how many times. Please remember that these run independently of your code, so if you reset your code, you should also reset the cache and memory!

If you run the code all at once, you will get the final state of the cache and hit rate. You will probably benefit the most from setting breakpoints at each memory access to see exactly where the hits and misses are coming from.

Simulate the following scenarios and record the final cache hit rates. Try to reason out what the hit rate will be BEFORE running the code. After running each simulation, make sure you understand WHY you see what you see (the TAs will be asking)!

DO NOT HESITATE to ask questions if you feel confused! This is perfectly normal and the staff is there to help you out!

Good questions to ask yourself as you do these exercises:

Scenario 1: (using cache.s)

Cache Parameters:

Program Parameters:

Questions:

  1. What combination of parameters is producing the hit rate you observe? (Hint: Your answer should be of the form: "Because parameter A == parameter B")
  2. What is our hit rate if we increase Rep Count arbitrarily? Why?
  3. How could we modify our program parameters to maximize our hit rate?

Scenario 2: (using cache.s)

Cache Parameters:

Program Parameters:

Questions:

  1. Explain the hit rate in terms of the parameters of the cache and the program.
  2. What happens to our hit rate as Rep Count goes to infinity? Why?
  3. Suppose we have a program that uses a very large array and during each Rep, we apply a different operator to the elements of our array (e.g. if Rep Count = 1024, we apply 1024 different operations to each of the array elements). How can we restructure our program to achieve a hit rate like that achieved in this scenario? (Assume that the number of operations we apply to each element is very large and that the result for each element can be computed independently of the other elements.) What is this technique called? (Hint) A sketch of the baseline access pattern described here appears right after this list.
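To make the setup in question 3 concrete, here is a hedged C sketch of the kind of program it describes. All names in it (ARRAY_LEN, REP_COUNT, op) are made up for illustration, and the restructuring the question asks about is deliberately not shown.

#include <stdio.h>

/* Baseline version of the program described in question 3. */
#define ARRAY_LEN (1 << 20)   /* "very large": much bigger than the cache */
#define REP_COUNT 64          /* stands in for the many operations per element */

static float data[ARRAY_LEN];

/* a stand-in per-rep operator; each element's result is independent */
static float op(int rep, float x) { return x + (float)rep; }

int main(void) {
    /* Rep-major order: each rep sweeps the entire array, so by the time
     * the next rep returns to element 0, that block has long been evicted. */
    for (int rep = 0; rep < REP_COUNT; rep++) {
        for (int i = 0; i < ARRAY_LEN; i++) {
            data[i] = op(rep, data[i]);
        }
    }
    printf("data[0] = %.1f\n", data[0]);   /* keep the work from being optimized away */
    return 0;
}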

Scenario 3:

For the final scenario, we'll be using a new file: wackycache.s. Open it up in MARS and read through the pseudocode to make sure you understand what's happening.

Cache Parameters:

Program Parameters:

Questions:

  1. Run the simulation a few times. Note that the hit rate is non-deterministic. Why is this the case? Explain in terms of the cache parameters and our access pattern. ("The program is accessing random indices" is not a sufficient answer.)
  2. Which cache parameter can you modify in order to get a constant hit rate? Record the parameter and its value (and be prepared to show your TA a few runs of the simulation). How does this parameter allow us to get a constant hit rate?

Check-off