In the following diagram, each blank box in the cache represents 8 bits (1 byte) of data. Our memory is byte-addressed, meaning that there is one address for each byte. Compare this to word-addressed, which means that there is one address for each word.

|        |        | Th | e C | Geo   | me     | ry of Cac                                                                                                     | hes        |   |               |   |   |       |  |
|--------|--------|----|-----|-------|--------|---------------------------------------------------------------------------------------------------------------|------------|---|---------------|---|---|-------|--|
| Main   |        |    |     | Index | Offset |                                                                                                               |            |   |               |   |   |       |  |
| Memory | Number | 1  | Men | nory  | y      | CPU                                                                                                           | Number     | 3 | 2             | 1 | 0 |       |  |
|        | 0      | 3  | 2   | 1     | 0      | Cache                                                                                                         | 0          |   |               |   |   |       |  |
| 6      | U      | 7  | 6   | 5     | 4      |                                                                                                               | 1          |   |               |   |   |       |  |
| 5      |        | 11 | 10  | 9     | 8      |                                                                                                               |            |   |               |   |   | 1     |  |
| 4      | 1      | 15 | 14  | 13    | 12     | Tag bits                                                                                                      | Index bits |   | 6 Offset bits |   |   | Total |  |
| -      |        |    |     |       |        |                                                                                                               |            |   |               |   |   | 32    |  |
| 3      | 2      | 19 | 18  | 17    | 16     |                                                                                                               |            |   |               |   |   |       |  |
| 2      | 2      | 23 | 22  | 21    | 20     | 1  word = 4  bytes = 32  bits                                                                                 |            |   |               |   |   |       |  |
| 1      | 3      | 27 | 26  | 25    | 24     | Index bits = $\log_2(\text{Number of index row})$<br>Offset bits = $\log_2(\text{Number of offsets columns})$ |            |   |               |   | , |       |  |
| 0      | 3      | 31 | 30  | 29    | 28     |                                                                                                               |            |   |               |   |   |       |  |

## **1.** Direct mapped caches

- 1. How many bytes of data can our cache hold? \_\_\_\_\_ How many words? \_\_\_\_\_
- 2. Fill in the "Tag bits, Index bits, Offset bits" with the correct T:I:O breakdown according to the diagram.
- 3. Let's say we have a 8192KiB cache with an 128B block size, what is the tag, index, and offset of 0xFEEDF00D?

| FE   |      | E    | D    | F    | 0    | 0D   |      |  |
|------|------|------|------|------|------|------|------|--|
| 1111 | 1110 | 1110 | 1101 | 1111 | 0000 | 0000 | 1101 |  |

Tag: \_\_\_\_\_ Index: \_\_\_\_\_ Offset: \_\_\_\_

4. Fill in the table below. Assume we have a write-through cache, so the number of bits per row includes only the cache data, the tag, and the valid bit.

| Address size<br>(bits) | Cache size | Block size | Tag bits | Index bits | Offset bits | Bits per row |
|------------------------|------------|------------|----------|------------|-------------|--------------|
| 16                     | 4KiB       | 4B         |          |            |             |              |
| 32                     | 32KiB      | 16B        |          |            |             |              |
| 32                     |            |            | 16       | 12         |             |              |
| 64                     | 2048KiB    |            |          | 14         |             | 1068         |

## 2. Cache hits and misses

Assume we have the following cache. Classify each of the following byte memory accesses as a cache hit (H), cache miss (M), or cache miss with replacement (R).

|              | Index  | Offset |   |   |   |   |   |   |   |  |
|--------------|--------|--------|---|---|---|---|---|---|---|--|
| CPU<br>Cache | Number | 7      | 6 | 5 | 4 | 3 | 2 | 1 | 0 |  |
|              | 0      |        |   |   |   |   |   |   |   |  |
|              | 1      |        |   |   |   |   |   |   |   |  |
|              | 2      |        |   |   |   |   |   |   |   |  |
|              | 3      |        |   |   |   |   |   |   |   |  |

1. 0×0000004 2. 0×0000005 3. 0×00000068 4. 0×000000C8 5. 0×000000DD 6. 0×00000045 7. 0×0000004 8. 0×00000C8

Self check: Of the 32 bits in each address, which bits do we use to find the *row* of the cache to use?

## 3. Analyzing C Code

```
#define NUM_INTS 8192
int A[NUM_INTS]; /** A lives at 0x100000 */
int i, total = 0;
for (i = 0; i < NUM_INTS; i += 128) { A[i] = i; } /** Line 1 */
for (i = 0; i < NUM_INTS; i += 128) { total += A[i]; } /** Line 2 */</pre>
```

Let's say you have a byte-addressed computer with a total address space of 1MiB. It features a 16KiB CPU cache with 1KiB blocks.

- 1. How many bits make up a memory address on this computer?
- 2. What is the T:I:O breakdown? \_\_\_\_\_\_ tag bits \_\_\_\_\_\_ index bits \_\_\_\_\_\_ offset bits
- 3. Calculate the cache hit rate for the line marked Line 1: \_\_\_\_\_
- 4. Calculate the cache hit rate for the line marked Line 2: \_\_\_\_\_

## 4. Average Memory Access Time

AMAT is the average (expected) time it takes for memory access. It can be calculated using this formula:

AMAT = hit time + miss rate × miss penalty

Remember that the miss penalty is the *additional* time it takes for memory access in the event of a cache miss. Therefore, a cache miss takes hit time + miss penalty time.

- 1. Suppose that you have a cache system with the following properties. What is the AMAT?
  - a) L1\$ hits in 1 cycle (local miss rate 25%)
  - b) L2\$ hits in 10 cycles (local miss rate 40%)
  - c) L3\$ hits in 50 cycles (global miss rate 6%)
  - d) Main memory hits in 100 cycles (always hits)