Due Sunday, July 21, 2013 @ 11:59pm
Goals
This assignment covers floating point, caches, AMAT, performance, and CALL.
Submission
Put all your answers in hw4.txt.
Submit your solution by creating a directory named hw4 that contains the file hw4.txt. (File names are case-sensitive, and the submission program will not accept your submission if your file names differ at all from those specified.) From within that directory, type submit hw4. Partners are not allowed on this assignment.
Exercises
Problem 1: Floating Point - 4 pts
For the following questions, we will be referring to the IEEE 32-bit floating point representation, except with a 5-bit exponent (bias of 2^5/2 - 1 = 15).
- Convert -73.75 to this representation. Write your answer in hexadecimal.
- What's the smallest positive integer (a value with no fractional part) that this representation CANNOT represent? Leave your answer in decimal/scientific notation.
- What's the smallest positive value that this representation can represent? Leave your answer as the closest power of 2.
- What exponent (in decimal) must the float have to be considered a denorm?
- What about a NaN?
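As a reference point for the field layout, here is a minimal sketch that splits a standard IEEE 754 single-precision value (8-bit exponent, 23-bit fraction) into its sign, exponent, and fraction fields, and computes the bias for a given exponent width. The names float_fields, exponent_bias, and split_float are hypothetical helpers, and the sketch assumes that float is a 32-bit IEEE 754 type on your machine. It does not implement the modified 5-bit-exponent format used in this problem; adapting the field widths is left to you.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical helper type: the three fields of a standard
 * IEEE 754 single-precision value (1 / 8 / 23 bits). */
typedef struct {
    unsigned sign;      /* 1 bit                      */
    unsigned exponent;  /* 8 bits, stored with a bias */
    uint32_t fraction;  /* 23 bits, implicit leading 1 not stored */
} float_fields;

/* Bias for an exponent field of exp_bits bits: 2^(exp_bits - 1) - 1. */
static unsigned exponent_bias(unsigned exp_bits) {
    return (1u << (exp_bits - 1)) - 1u;
}

/* Split a float into its raw fields.
 * Assumes float is a 32-bit IEEE 754 single-precision value. */
static float_fields split_float(float x) {
    uint32_t bits;
    float_fields f;
    memcpy(&bits, &x, sizeof bits);   /* reinterpret without aliasing issues */
    f.sign     = bits >> 31;
    f.exponent = (bits >> 23) & 0xFFu;
    f.fraction = bits & 0x7FFFFFu;
    return f;
}
```

For example, -6.5 is -1.101 (binary) x 2^2, so split_float(-6.5f) yields sign 1, exponent 2 + 127 = 129, and fraction 0x500000.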
Problem 2: CALL - 3 pts
- As briefly as possible, explain why absolute addressing occurs in the linking stage.
- If we edit one file of a program, under CALL do all other files need to go through the compiler and assembler again as well? Why or why not?
- Our assembly file, 61cRocks.s contains a function label, func:, and a data label, str:. If the label...
- func is used in a beq instruction, which table(s) does it show up in?
- func is used in a j instruction, which table(s) does it show up in?
- str is used in a la instruction, which table(s) does it show up in?
Problem 3: Performance and AMAT - 4 pts
- Given that your program must execute within 1 ms, and that it has 100,000 instructions, and your processor has a clock rate of 2 GHz, what must be the maximum average CPI of your program?
- Two systems with identical architectures run the same set of instructions. System A has a clock rate of 1.1 GHz and an average CPI of 1.5, while System B has a clock rate of 1.4 GHz with an average CPI of 1.2. Which system is faster, and by how much?
- On our system, we have a two-level cache. Our L1 cache has a local hit rate of 90% and a hit time of one cycle. Our L2 cache, however, has a local hit rate of 80%, with a hit time of 10 cycles and a miss penalty of 100 cycles to main memory.
- What is the global miss rate?
- What is the overall AMAT?
- What would be the AMAT if we didn't have an L2 cache? Assume L1$ would have a 100 cycle miss penalty to main memory.
- On a different system, we are using a single-level cache with an I$ miss rate of 4% and a D$ miss rate of 6%. The miss penalty to main memory is 80 cycles. If our processor has a CPIbase of 1.5, and 40% of instructions are loads/stores, calculate CPIstall.
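The quantities in this problem follow a handful of standard formulas, sketched below in C. All function names are hypothetical helpers, not part of the assignment. The AMAT and CPIstall formulas use one common convention (miss penalties are added on top of hit times, and every instruction makes one I$ access), so check them against the convention used in lecture before relying on them.

```c
/* Hypothetical helpers illustrating the usual performance formulas. */

/* Execution time (seconds) = instruction count * CPI / clock rate (Hz). */
static double cpu_time(double instructions, double cpi, double clock_hz) {
    return instructions * cpi / clock_hz;
}

/* Largest average CPI that still fits within a time budget (seconds). */
static double max_cpi(double budget_s, double instructions, double clock_hz) {
    return budget_s * clock_hz / instructions;
}

/* Global miss rate of a two-level hierarchy: the fraction of all
 * accesses that miss in both L1 and L2 (product of local miss rates). */
static double global_miss_rate(double l1_miss, double l2_local_miss) {
    return l1_miss * l2_local_miss;
}

/* AMAT = L1 hit time + L1 miss rate * (L2 hit time
 *        + L2 local miss rate * main-memory penalty), in cycles. */
static double amat_two_level(double l1_hit_time, double l1_miss,
                             double l2_hit_time, double l2_local_miss,
                             double mem_penalty) {
    return l1_hit_time + l1_miss * (l2_hit_time + l2_local_miss * mem_penalty);
}

/* CPI with memory stalls: every instruction is fetched through the I$,
 * and only the fraction mem_fraction (loads/stores) touches the D$. */
static double cpi_stall(double cpi_base, double icache_miss,
                        double dcache_miss, double mem_fraction,
                        double penalty) {
    return cpi_base + (icache_miss + mem_fraction * dcache_miss) * penalty;
}
```

With made-up numbers (deliberately not those of the problem): an L1 hit time of 2 cycles, an L1 miss rate of 25%, an L2 hit time of 8 cycles, an L2 local miss rate of 50%, and a 40-cycle memory penalty give amat_two_level(2, 0.25, 8, 0.5, 40) = 2 + 0.25 * (8 + 20) = 9 cycles.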
Problem 4: Caches - 6 pts
Assume we have 16-bit memory addresses. Our cache holds a total of 64 bytes, and each cache block is 8 bytes.
- If the cache is direct-mapped...
- How many bits are used for the tag?
- Which other bytes would share a block with the byte at address 0xABCD?
- The block containing the byte at 0xABCD is in the cache. What memory accesses are guaranteed to get a cache miss? You may describe what addresses, but be specific.
- If the cache is two-way set associative...
- How many bits are used for the tag?
- Which other bytes would share a block with the byte at address 0xABCD?
- The block containing the byte at 0xABCD is in the cache. What memory accesses are guaranteed to get a cache miss? You may describe what addresses, but be specific.
- Examine the following code.
Assume all array accesses are valid (array is large enough), the cache starts empty, and that our data cache is direct-mapped. v need not be block aligned.
What are the maximum and minimum data cache hit rates for a single call to shiftarray (remember to account for loads AND stores) if...
- j = 1?
- j = 8?
- j = 64?
int j = VALUE_SPECIFIED_ABOVE;

void shiftarray(char* v) {
    for (int i = 0; i < 10; i++)
        v[i] = v[i+j];
}
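When reasoning about which addresses share a block or a set, it can help to split an address into its tag, index, and offset fields mechanically. The following is a minimal sketch with hypothetical names (cache_geometry, addr_fields, make_geometry, split_address); it assumes that the cache size, block size, and associativity are all powers of two.

```c
#include <stdint.h>

/* Hypothetical helper types: field widths and the fields themselves. */
typedef struct { unsigned offset_bits, index_bits, tag_bits; } cache_geometry;
typedef struct { uint32_t tag, index, offset; } addr_fields;

/* Integer log2 for a power of two. */
static unsigned log2u(unsigned x) {
    unsigned n = 0;
    while (x > 1) { x >>= 1; n++; }
    return n;
}

/* Derive field widths. ways = 1 means direct-mapped.
 * Assumes all sizes are powers of two. */
static cache_geometry make_geometry(unsigned addr_bits, unsigned cache_bytes,
                                    unsigned block_bytes, unsigned ways) {
    cache_geometry g;
    unsigned sets = cache_bytes / (block_bytes * ways);
    g.offset_bits = log2u(block_bytes);
    g.index_bits  = log2u(sets);
    g.tag_bits    = addr_bits - g.index_bits - g.offset_bits;
    return g;
}

/* Split an address into tag / index / offset for a given geometry. */
static addr_fields split_address(uint32_t addr, cache_geometry g) {
    addr_fields f;
    f.offset = addr & ((1u << g.offset_bits) - 1u);
    f.index  = (addr >> g.offset_bits) & ((1u << g.index_bits) - 1u);
    f.tag    = addr >> (g.offset_bits + g.index_bits);
    return f;
}
```

As an illustration with a made-up geometry (not the one in this problem): a 256-byte direct-mapped cache with 16-byte blocks and 16-bit addresses has 4 offset bits, 4 index bits, and 8 tag bits, so address 0x1234 splits into tag 0x12, index 3, and offset 4. Two addresses fall in the same block exactly when their tag and index both match.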