CS 61C: Great Ideas in Computer Architecture (Machine Structures) Lecture 30: Pipeline Parallelism 1

Instructor:
Dan Garcia
http://inst.eecs.Berkeley.edu/~cs61c/sp13



#### **Boolean Exprs for Controller**

- RegDst = add + sub
- ALUSrc = ori + lw + sw
- MemtoReg = lw
- RegWrite = add + sub + ori + lw
- MemWrite = sw
- nPCsel = beq
- Jump = jump
- ExtOp = lw + sw
- ALUctr[0] = sub + beq
- ALUctr[1] = ori

(assume ALUctr is 00 ADD, 01 SUB, 10 OR)

How do we implement this in gates?

Fall 2011 - Lecture #30
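As a rough software sketch of the controller equations, each control signal can be modeled as an OR of one-hot "this instruction is X" terms. The function name `control_signals` and the dictionary layout are assumptions made for illustration. The one-hot terms are assumed to come from decoding the opcode/funct bits, which is not modeled here.

```python
# Sketch only: models the slide's Boolean equations, not real gate-level hardware.
INSTRUCTIONS = ("add", "sub", "ori", "lw", "sw", "beq", "jump")


def control_signals(instr: str) -> dict:
    """Return the control signals for one instruction name (hypothetical helper)."""
    if instr not in INSTRUCTIONS:
        raise ValueError(f"unsupported instruction: {instr}")

    # One-hot decode: exactly one of these is True.
    add, sub, ori, lw, sw, beq, jump = (name == instr for name in INSTRUCTIONS)

    return {
        "RegDst": add or sub,
        "ALUSrc": ori or lw or sw,
        "MemtoReg": lw,
        "RegWrite": add or sub or ori or lw,
        "MemWrite": sw,
        "nPCsel": beq,
        "Jump": jump,
        "ExtOp": lw or sw,
        # ALUctr[1] = ori, ALUctr[0] = sub + beq  (00 ADD, 01 SUB, 10 OR)
        "ALUctr": (int(ori) << 1) | int(sub or beq),
    }


if __name__ == "__main__":
    for name in INSTRUCTIONS:
        print(name, control_signals(name))
```

Each `or` corresponds to one OR gate input in the equations above, so the sketch is mainly a way to check the table of signals by hand.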





#### Review: Single-cycle Processor

Five steps to design a processor:

1. Analyze instruction set → datapath requirements
2. Select set of datapath components & establish clock methodology
3. Assemble datapath meeting the requirements
4. Analyze implementation of each instruction to determine setting of control points that effects the register transfer
5. Assemble the control logic
   - Formulate Logic Equations
   - Design Circuits

### Single Cycle Performance

- Assume the time for each action is:
  - 100ps for register read or write; 200ps for other events
- What is the clock rate?

| Instr    | Instr fetch | Register read | ALU op | Memory access | Register write | Total time |
|----------|-------------|---------------|--------|---------------|----------------|------------|
| lw       | 200ps       | 100ps         | 200ps  | 200ps         | 100ps          | 800ps      |
| sw       | 200ps       | 100ps         | 200ps  | 200ps         |                | 700ps      |
| R-format | 200ps       | 100ps         | 200ps  |               | 100ps          | 600ps      |
| beq      | 200ps       | 100ps         | 200ps  |               |                | 500ps      |

- What can we do to improve clock rate?
- Will this improve performance as well?
  - We want an increased clock rate to mean faster programs
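The "clock rate" question can be worked out directly from the table: in a single-cycle design the clock period has to cover the slowest instruction. The sketch below recomputes the totals from the per-step times. The dictionary and function names are illustrative assumptions.

```python
# Per-step times from the slide, in picoseconds.
STEP_PS = {"fetch": 200, "reg_read": 100, "alu": 200, "mem": 200, "reg_write": 100}

# Which steps each instruction class uses (matches the table rows).
STEPS_USED = {
    "lw": ["fetch", "reg_read", "alu", "mem", "reg_write"],
    "sw": ["fetch", "reg_read", "alu", "mem"],
    "R-format": ["fetch", "reg_read", "alu", "reg_write"],
    "beq": ["fetch", "reg_read", "alu"],
}


def total_time_ps(instr: str) -> int:
    """Sum the step times an instruction needs."""
    return sum(STEP_PS[step] for step in STEPS_USED[instr])


def single_cycle_clock():
    """Return (period in ps, frequency in GHz) for a single-cycle design."""
    period_ps = max(total_time_ps(instr) for instr in STEPS_USED)
    freq_ghz = 1000 / period_ps  # 1 / (period_ps * 1e-12 s), expressed in GHz
    return period_ps, freq_ghz


if __name__ == "__main__":
    period, freq = single_cycle_clock()
    print(f"period = {period} ps, clock rate = {freq} GHz")
```

With these numbers the period is set by `lw` at 800ps, which gives a clock rate of 1.25 GHz; every other instruction wastes part of the cycle.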


### Gotta Do Laundry

- Ann, Brian, Cathy, Dave each have one load of clothes to wash, dry, fold, and put away
  - Washer takes 30 minutes
  - Dryer takes 30 minutes
  - "Folder" takes 30 minutes
  - "Stasher" takes 30 minutes to put clothes into drawers
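To see what the laundry analogy is building toward, here is a small sketch comparing sequential and pipelined laundry for the four loads and four 30-minute stages above. It assumes an ideal pipeline, where a new load enters the washer as soon as the previous one leaves it. The function names are illustrative.

```python
def sequential_minutes(loads: int, stages: int, stage_minutes: int) -> int:
    """Each load finishes all stages before the next load starts."""
    return loads * stages * stage_minutes


def pipelined_minutes(loads: int, stages: int, stage_minutes: int) -> int:
    """A new load starts every stage_minutes; the last load still needs all stages."""
    return (stages + loads - 1) * stage_minutes


if __name__ == "__main__":
    loads, stages, stage_minutes = 4, 4, 30  # Ann, Brian, Cathy, Dave
    seq = sequential_minutes(loads, stages, stage_minutes)
    pipe = pipelined_minutes(loads, stages, stage_minutes)
    print(f"sequential: {seq / 60} hours, pipelined: {pipe / 60} hours")
```

Under these assumptions sequential laundry takes 8 hours and pipelined laundry takes 3.5 hours, even though each individual load still takes 2 hours.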











### Steps in Executing MIPS

- 1) IFtch: Instruction Fetch, Increment PC
- 2) Dcd: Instruction Decode, Read Registers
- 3) Exec:
  - Mem-ref: Calculate Address
  - Arith-log: Perform Operation
- 4) Mem:
  - Load: Read Data from Memory
  - Store: Write Data to Memory
- 5) WB: Write Data Back to Register
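The five steps above can be sketched as a pipeline timing chart, where instruction `i` enters IFtch in cycle `i` and advances one stage per cycle. This is an idealized model: it assumes every instruction passes through all five stages and that there are no hazards or stalls. The function names are hypothetical.

```python
STAGES = ["IFtch", "Dcd", "Exec", "Mem", "WB"]


def pipeline_schedule(num_instructions: int) -> list:
    """Return one row per instruction; each row lists the stage held in each cycle."""
    total_cycles = num_instructions + len(STAGES) - 1
    schedule = []
    for i in range(num_instructions):
        row = []
        for cycle in range(total_cycles):
            stage_index = cycle - i  # instruction i starts in cycle i
            if 0 <= stage_index < len(STAGES):
                row.append(STAGES[stage_index])
            else:
                row.append("")  # not yet fetched, or already finished
        schedule.append(row)
    return schedule


def format_schedule(schedule: list) -> str:
    """Render the schedule as fixed-width text, one instruction per line."""
    lines = []
    for i, row in enumerate(schedule):
        cells = " ".join(f"{cell:<5}" for cell in row)
        lines.append(f"I{i}: {cells}".rstrip())
    return "\n".join(lines)


if __name__ == "__main__":
    print(format_schedule(pipeline_schedule(4)))
```

Reading down any column of the printed chart shows which stages are busy in that cycle; once the pipeline is full, all five stages work on different instructions at once.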



















### So, in conclusion

- You now know how to implement the control logic for the single-cycle CPU.
  - (actually, you already knew it!)
- Pipelining improves performance by increasing instruction throughput: it exploits instruction-level parallelism (ILP)
  - Executes multiple instructions in parallel
  - Each instruction has the same latency (no single instruction finishes sooner)
- Next: hazards in pipelining:
  - Structural, data, control
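The throughput claim can be made concrete with a back-of-the-envelope sketch. It assumes a five-stage pipeline clocked at 200ps (the slowest step in the earlier timing table), an 800ps single-cycle clock, and no hazards. These figures and the function names are assumptions for illustration, not measurements.

```python
SINGLE_CYCLE_PS = 800    # assumed: period set by the slowest instruction (lw)
PIPELINE_STAGE_PS = 200  # assumed: period set by the slowest stage
NUM_STAGES = 5


def single_cycle_time_ps(n: int) -> int:
    """Total time for n instructions, one full cycle each."""
    return n * SINGLE_CYCLE_PS


def pipelined_time_ps(n: int) -> int:
    """Total time for n instructions: fill the pipeline, then one finishes per cycle."""
    return (n + NUM_STAGES - 1) * PIPELINE_STAGE_PS


def speedup(n: int) -> float:
    return single_cycle_time_ps(n) / pipelined_time_ps(n)


if __name__ == "__main__":
    for n in (1, 3, 1_000_000):
        print(n, single_cycle_time_ps(n), pipelined_time_ps(n), round(speedup(n), 3))
```

Under these assumptions a single instruction is no faster (1000ps pipelined versus 800ps single-cycle), but for long instruction streams the speedup approaches 4x, which is the sense in which pipelining improves throughput rather than latency.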