



### **Review: Problems for Computers**

- Limits to pipelining: Hazards prevent next instruction from executing during its designated clock cycle
  - Structural hazards: HW cannot support this combination of instructions (single person to fold and put clothes away)
  - Control hazards: Branches & other instructions that change the PC stall the pipeline until the hazard is resolved; "bubbles" in the pipeline
  - <u>Data hazards</u>: Instruction depends on result of prior instruction still in the pipeline (missing sock)



A Carle, Summer 2005 © UCB

# Review: Cf. Branch Delay vs. Load Delay

- Load Delay occurs only if necessary (dependent instructions).
- Branch Delay always happens (part of the ISA).
- Why not have Branch Delay interlocked?
  - Answer: Interlocks only work if you can detect hazard ahead of time. By the time we detect a branch, we already need its value ... hence no interlock is possible!



CS 61C L19 Pipelining II


#### **FYI: Historical Trivia**

- First MIPS design did not interlock and stall on load-use data hazard
- Real reason behind the name MIPS:
   <u>M</u>icroprocessor without
   <u>I</u>nterlocked
   <u>P</u>ipeline
   <u>S</u>tages
  - Word Play on acronym for Millions of Instructions Per Second, also called MIPS
  - Load/Use → Wrong Answer!



# **Outline**

- Pipeline Control
- Forwarding Control
- Hazard Control










































### In particular:

- ALUinput ← (ALUResult, MemResult)
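The forwarding choice for an ALU operand can be sketched as a small selection function. This is a minimal sketch, not the lecture's datapath: `Producer`, `select_alu_input`, and the argument names are hypothetical, and it assumes ALUResult belongs to the instruction one ahead in the pipe and MemResult to the instruction two ahead, so the newer value wins.

```python
from typing import NamedTuple, Optional


class Producer(NamedTuple):
    """An instruction further down the pipe that may write a register (hypothetical model)."""
    dest: Optional[int]  # destination register number, None if nothing is written
    value: int           # result held in that stage's pipeline register


def select_alu_input(src_reg: int, regfile_value: int,
                     alu_result: Producer, mem_result: Producer) -> int:
    """Pick the ALU operand: newest in-flight value first, else the register file."""
    if src_reg == 0:                 # $0 always reads as zero in MIPS, never forward to it
        return 0
    if alu_result.dest == src_reg:   # assumed: producer is one instruction ahead
        return alu_result.value
    if mem_result.dest == src_reg:   # assumed: producer is two instructions ahead
        return mem_result.value
    return regfile_value             # no hazard: the register file value is current
```

Checking ALUResult before MemResult matters when both in-flight instructions write the same register: the younger write is the one the program order says should be seen.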



















# **Data Hazard: Loads (3/4)**

- The instruction slot after a load is called the "load delay slot".
- If that instruction uses the result of the load, the hardware interlock stalls it for one cycle.
- If the compiler puts an unrelated instruction in that slot, there is no stall.
- Letting the hardware stall the instruction in the delay slot is equivalent to putting a nop in the slot (except that the nop uses more code space).
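To make the load delay slot concrete, here is a minimal sketch that counts interlock stalls in a short instruction sequence. The `Inst` record and `load_use_stalls` are hypothetical names, and the sketch assumes full forwarding, so the only remaining stall is an instruction that reads a load's destination in the very next slot.

```python
from typing import NamedTuple, Optional, Sequence, Tuple


class Inst(NamedTuple):
    """Simplified instruction: opcode, destination register, source registers."""
    op: str
    dest: Optional[int]
    srcs: Tuple[int, ...]


def load_use_stalls(program: Sequence[Inst]) -> int:
    """Count one-cycle stalls: an instruction in a load delay slot that uses the loaded reg."""
    stalls = 0
    for prev, cur in zip(program, program[1:]):
        if prev.op == "lw" and prev.dest not in (None, 0) and prev.dest in cur.srcs:
            stalls += 1
    return stalls


# Register numbers follow the MIPS convention ($t0 = 8, $t1 = 9, ...).
lw = Inst("lw", 8, (9,))          # lw  $t0, 0($t1)
add = Inst("add", 10, (8, 11))    # add $t2, $t0, $t3   (uses the loaded $t0)
sub = Inst("sub", 12, (13, 14))   # sub $t4, $t5, $t6   (unrelated)

naive = [lw, add, sub]            # add sits in the load delay slot
scheduled = [lw, sub, add]        # compiler moved the unrelated sub into the slot
```

With these definitions, `load_use_stalls(naive)` is 1 and `load_use_stalls(scheduled)` is 0, which is the compiler reordering described above.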



# **Hazards / Stalling**

### In general:

- For each stage i that has reg inputs
  - If i's reg is being written later on in the pipe but is not ready yet
    - Stages 0 to i: Stall (Turn CEs off so no change)
    - Stage i+1: Make a bubble (do nothing)
    - Stages i+2 onward: As usual

### In particular:

- ALUinput ← (MemResult)
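The general stalling rule can be written out per stage. This is a minimal sketch under assumed conventions: stages are numbered from 0 at fetch, and `stage_controls` and the action strings are hypothetical names chosen for illustration.

```python
from typing import List


def stage_controls(num_stages: int, stalled_stage: int) -> List[str]:
    """Return the action for every stage when stage `stalled_stage` must wait for a reg."""
    actions = []
    for s in range(num_stages):
        if s <= stalled_stage:
            actions.append("stall")    # clock enables off, the stage keeps its instruction
        elif s == stalled_stage + 1:
            actions.append("bubble")   # nothing moves forward into it, so it does nothing
        else:
            actions.append("normal")   # instructions further down keep draining as usual
    return actions
```

For a five-stage pipe with the hazard detected in stage 2, `stage_controls(5, 2)` gives `['stall', 'stall', 'stall', 'bubble', 'normal']`.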



### **Hazards / Stalling**

#### **Alternative Approach:**

- Detect non-forwarding hazards in decode
  - Possible since our hazards are formal (they can be determined from the instructions' register fields).
    - Not always the case.
  - Stalling then becomes:
    - Issue nop to EX stage
    - Turn off nextPC update (refetch same inst)
    - Turn off InstReg update (re-decode same inst)
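The decode-time approach reduces to one combinational check that drives three control signals. The sketch below is an illustration only: `DecodeCtl`, `decode_stall`, and the signal names are hypothetical, and it assumes the one non-forwardable hazard is a load in EX whose destination is read by the instruction in decode.

```python
from typing import NamedTuple, Optional, Tuple


class DecodeCtl(NamedTuple):
    """Control outputs of the decode-stage hazard check (hypothetical signal names)."""
    pc_write: bool       # False: nextPC update off, the same inst is fetched again
    instreg_write: bool  # False: InstReg update off, the same inst is decoded again
    ex_gets_nop: bool    # True: a nop is issued to the EX stage this cycle


def decode_stall(ex_is_load: bool, ex_dest: Optional[int],
                 id_srcs: Tuple[int, ...]) -> DecodeCtl:
    """Detect a load-use hazard in decode and derive the stall controls from it."""
    hazard = bool(ex_is_load and ex_dest not in (None, 0) and ex_dest in id_srcs)
    return DecodeCtl(pc_write=not hazard, instreg_write=not hazard, ex_gets_nop=hazard)
```

The outputs depend only on the current contents of the pipeline registers, so no extra state is needed to remember that a stall is in progress.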






# **Stall Logic**

- Stall-on-issue is used quite a bit
  - More complex processors: many cases that stall on issue.
  - More complex processors: cases that can't be detected at decode
    - E.g. value needed from mem is not in cache → proc must stall multiple cycles




### By the way ...

- Notice that our forwarding and stall logic is stateless!
- Big Idea: Keep it simple!
  - Option 1: Store the old fetched inst in a reg ("stall\_temp") and keep a state reg that says whether to use stall\_temp or the value coming off inst mem.
  - Option 2: Re-fetch the old value by turning off the PC update (no extra state needed).
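Option 2 can be sketched as a fetch loop in which a stall simply skips the PC increment. This is a toy model with hypothetical names (`fetch_stream`, `stall_cycles`); it assumes the PC is an index into a list of instructions rather than a byte address.

```python
from typing import List, Sequence, Set


def fetch_stream(imem: Sequence[str], stall_cycles: Set[int]) -> List[str]:
    """Return the instruction fetched each cycle; a stalled cycle leaves the PC unchanged."""
    pc = 0
    cycle = 0
    fetched = []
    while pc < len(imem):
        fetched.append(imem[pc])       # the value coming off inst mem this cycle
        if cycle not in stall_cycles:  # PC update is turned off during a stall
            pc += 1
        cycle += 1
    return fetched
```

For example, `fetch_stream(["lw", "add", "sub"], {1})` returns `['lw', 'add', 'add', 'sub']`: the stalled `add` is fetched twice, with no extra register holding the old instruction.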


