Homework 8: Caches

TA In Charge: Omar Akkawi

Changelog:

Submitting Your Solution

Submit your solution online by 11:59pm on Wednesday 2008-04-23. Do this by creating a directory named hw8 that contains a text file named README, as well the .C files containing your programs for question 7.11 (named "row_major.c" and "col_major.c"). Submit the README file as well as the row_major.c and col_major.c files. From within the hw8 directory, type submit hw8.

Problems

P&H Exercises 7.9, 7.10, 7.11, 7.17, 7.29, 7.39 (pp. 556-558)

7.9: Here is a series of address references given as word addresses: 2, 3, 11, 16, 21, 13, 64, 48, 19, 11, 3, 22, 4, 27, 6, and 11. Assuming a direct-mapped cache with 16 one-word blocks that is initially empty, label each reference in the list as a hit or a miss and show the final contents of the cache.

7.10:Using the series of references given in Exercise 7.9, show the hits and misses and the final cache contents for a direct-mapped cache with four-word blocks and a total size of 16 words.

7.11: Given the following pseudocode:

int array[10000,100000];
for each element array[i][j] {
	array[i][j] = array[i][j] * 2;
}

write two C programs that implememnt this algorithm: one should access the elements of the array in row-major order, and the other should access them in column-major order. Compare the execution times of the two programs. What does this tell you about the effects of memory layout on cache performance?

You will submit both of these programs, so they should have consistent names. The program that accesses the elements in row-major order should be named row_major.c, and the program that accesses the elements in column-major order should be named col_major.c.

Note: You can time programs on a Unix machine (like the lab machines) by using the command: time program_name
Where program_name is the executable file for your program.

7.17:Find the AMAT (Average Memory Access Time = Time for a hit + Miss rate * Miss penalty) for a processor with a 2 ns clock, a miss penalty of 20 clock cycles, a miss rate of 0.05 misses per instruction, and a cache access time (including hit detection) of 1 clock cycle. Assume that the read and write miss penalties are the same and ignore other write stalls.

7.29: Suppose a computer's address size is k bits (using byte addressing), the cache size is S bytes, the block size is B bytes, and the cache is A-way set-associative. Assume that B is a power of two, so B = 2b. Figure out what the following quantities are in terms of S, B, A, b, and k: the number of sets in the cache, the number of index bits in the address, and the number of bits needed to implement the cache (see Exercise 7.12).

7.39: Consider a virtual memory system with the following properties:

  • 40-bit virtual byte address
  • 16 KB pages
  • 36 bit physical byte address What is the total size of the page table for each process on this processor, assuming that the valid, protection, dirty, and use bits take a total of 4 bits and that all the virtual pages are in use? (Assume that disk addresses are not stored in the page table.)