

#### **Review**

- N-bit adder-subtractor is built from N 1-bit adders with XOR gates on the input
  - XOR serves as a conditional inverter
- CPU design involves Datapath, Control
  - Datapath in MIPS involves 5 CPU stages
  - 1) Instruction Fetch
  - 2) Instruction Decode & Register Read
  - 3) ALU (Execute)
  - 4) Memory
  - 5) Register Write
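
The adder-subtractor recap above can be sketched in a few lines of Python. This is only an illustration, not the lecture's circuit: the names `full_adder` and `add_sub`, the 8-bit default width, and the choice to feed the SUB signal into the first carry-in (the usual way to complete the two's-complement negation) are assumptions added here.

```python
def full_adder(a: int, b: int, carry_in: int) -> tuple[int, int]:
    """1-bit full adder: returns (sum_bit, carry_out)."""
    sum_bit = a ^ b ^ carry_in
    carry_out = (a & b) | (carry_in & (a ^ b))
    return sum_bit, carry_out


def add_sub(a: int, b: int, sub: int, n_bits: int = 8) -> int:
    """Chain n_bits 1-bit adders; sub=0 adds, sub=1 subtracts (hypothetical helper)."""
    carry = sub  # assumption: SUB also drives the first carry-in (the "+1" of negation)
    result = 0
    for i in range(n_bits):
        a_bit = (a >> i) & 1
        # XOR as conditional inverter: passes b's bit when sub=0, flips it when sub=1
        b_bit = ((b >> i) & 1) ^ sub
        sum_bit, carry = full_adder(a_bit, b_bit, carry)
        result |= sum_bit << i
    return result  # the final carry-out is dropped, so results wrap modulo 2**n_bits
```

With the default width, `add_sub(5, 3, 0)` adds and `add_sub(5, 3, 1)` subtracts; a negative difference comes back in two's-complement form.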






# For each instruction, how do we control the flow of information through the datapath?

- Multiple-cycle CPU: only one stage of an instruction per clock cycle.
  - The clock is made as long as the slowest stage.
  - [Figure: stage timeline; surviving labels: 1. Instruction Fetch, 2. Decode/Register Read]
- Several significant advantages over single-cycle execution: unused stages in a particular instruction can be skipped, OR instructions can be pipelined (overlapped).
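
To make the clocking rule concrete, here is a small sketch under made-up numbers. The stage delays, the table of which stages each instruction uses, and the function names are all hypothetical; the lecture gives no figures. It only shows that the clock period is set by the slowest stage and that an instruction which skips stages takes fewer cycles.

```python
# Hypothetical stage delays in picoseconds (not from the lecture).
STAGE_DELAY_PS = {"fetch": 200, "decode": 100, "execute": 200, "memory": 200, "write": 100}

# Assumed stage usage per instruction, for illustration only.
STAGES_USED = {
    "load": ["fetch", "decode", "execute", "memory", "write"],
    "addu": ["fetch", "decode", "execute", "write"],  # skips the memory stage
    "beq": ["fetch", "decode", "execute"],            # skips memory and register write
}


def multicycle_clock_ps() -> int:
    """The clock is made as long as the slowest stage."""
    return max(STAGE_DELAY_PS.values())


def multicycle_latency_ps(instr: str) -> int:
    """One stage per clock cycle, counting only the stages the instruction uses."""
    return len(STAGES_USED[instr]) * multicycle_clock_ps()
```

Under these assumed delays the clock period is 200 ps, and `addu` finishes one cycle sooner than `load` because it skips the memory stage.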

#### **How to Design a Processor: step-by-step**

1. Analyze instruction set architecture (ISA) ⇒ datapath requirements
   - meaning of each instruction is given by the register transfers
   - datapath must include storage element for ISA registers
   - datapath must support each register transfer
2. Select set of datapath components and establish clocking methodology
3. Assemble datapath meeting requirements
4. Analyze implementation of each instruction to determine setting of control points that effects the register transfer
5. Assemble the control logic (hard part!)





# **Register Transfer Language**

- RTL gives the meaning of the instructions
- All start by fetching the instruction:

$\{op, rs, rt, rd, shamt, funct\} \leftarrow MEM[PC]$

$\{op, rs, rt, Imm16\} \leftarrow MEM[PC]$

| Instruction | Register Transfers |
| --- | --- |
| ADDU | $R[rd] \leftarrow R[rs] + R[rt];\ PC \leftarrow PC + 4$ |
| SUBU | $R[rd] \leftarrow R[rs] - R[rt];\ PC \leftarrow PC + 4$ |
| ORI | $R[rt] \leftarrow R[rs] \mid zero\_ext(Imm16);\ PC \leftarrow PC + 4$ |
| LOAD | $R[rt] \leftarrow MEM[R[rs] + sign\_ext(Imm16)];\ PC \leftarrow PC + 4$ |
| STORE | $MEM[R[rs] + sign\_ext(Imm16)] \leftarrow R[rt];\ PC \leftarrow PC + 4$ |
| BEQ | if $(R[rs] == R[rt])$ then $PC \leftarrow PC + 4 + (sign\_ext(Imm16) \parallel 00)$ else $PC \leftarrow PC + 4$ |
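
The register transfers above translate almost line for line into code. The following is a rough sketch, not a full MIPS model: the function name `execute`, the use of a Python list for `R` and a dict keyed by byte address for `MEM`, and the 32-bit wraparound mask are assumptions made for illustration. Register 0 is not forced to zero here, and instructions arrive already decoded.

```python
MASK32 = 0xFFFFFFFF


def sign_ext(imm16: int) -> int:
    """Interpret the low 16 bits as a signed two's-complement value."""
    imm16 &= 0xFFFF
    return imm16 - 0x10000 if imm16 & 0x8000 else imm16


def zero_ext(imm16: int) -> int:
    """Keep the low 16 bits; the upper bits are zero."""
    return imm16 & 0xFFFF


def execute(op, R, MEM, PC, rs=0, rt=0, rd=0, imm16=0):
    """Apply one instruction's register transfers and return the new PC."""
    if op == "ADDU":
        R[rd] = (R[rs] + R[rt]) & MASK32
    elif op == "SUBU":
        R[rd] = (R[rs] - R[rt]) & MASK32
    elif op == "ORI":
        R[rt] = R[rs] | zero_ext(imm16)
    elif op == "LOAD":
        R[rt] = MEM.get((R[rs] + sign_ext(imm16)) & MASK32, 0)
    elif op == "STORE":
        MEM[(R[rs] + sign_ext(imm16)) & MASK32] = R[rt]
    elif op == "BEQ":
        if R[rs] == R[rt]:
            # sign_ext(Imm16) || 00 is the immediate shifted left by two bits
            return (PC + 4 + (sign_ext(imm16) << 2)) & MASK32
    else:
        raise ValueError(f"unknown op {op!r}")
    return (PC + 4) & MASK32
```

Every path except a taken branch ends with the same PC + 4 update, which is why the fetch logic can be shared across instructions.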

Beamer, Summer 2007 © UCB

# **Step 1: Requirements of the Instruction Set**

- Memory (MEM)
  - instructions & data (will use one for each)
- Registers (R: 32 x 32)
  - read RS
  - read RT
  - write RT or RD
- PC
- Extender (sign/zero extend)
- Add/Sub/OR unit for operation on register(s) or extended immediate
- Add 4 or extended immediate to PC
- Compare registers?


# **Step 2: Components of the Datapath**

- Combinational Elements
- Storage Elements
  - Clocking methodology





#### **ALU Needs for MIPS-lite + Rest of MIPS**

Addition, subtraction, logical OR, ==:

```
ADDU R[rd] = R[rs] + R[rt]; ...
SUBU R[rd] = R[rs] - R[rt]; ...
ORI  R[rt] = R[rs] | zero_ext(Imm16) ...
BEQ  if ( R[rs] == R[rt] ) ...
```

- Test to see if output == 0 for any ALU operation gives == test. How?
- P&H also adds AND, Set Less Than (1 if A < B, 0 otherwise)
- ALU follows chap 5
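
One way to answer the "How?" above: subtract the two operands and check whether the result is zero. The sketch below assumes a 32-bit ALU with hypothetical operation names (`"add"`, `"sub"`, `"or"`, `"and"`, `"slt"`) and treats Set Less Than as a signed comparison; none of these names come from the lecture.

```python
MASK32 = 0xFFFFFFFF


def to_signed(x: int) -> int:
    """Reinterpret a 32-bit pattern as a signed two's-complement integer."""
    return x - (1 << 32) if x & 0x80000000 else x


def alu(op: str, a: int, b: int) -> tuple[int, bool]:
    """Return (result, zero), where zero is True when the result is all 0 bits."""
    if op == "add":
        result = (a + b) & MASK32
    elif op == "sub":
        result = (a - b) & MASK32
    elif op == "or":
        result = (a | b) & MASK32
    elif op == "and":
        result = a & b & MASK32
    elif op == "slt":
        result = 1 if to_signed(a) < to_signed(b) else 0  # assumed signed compare
    else:
        raise ValueError(f"unknown ALU op {op!r}")
    return result, result == 0


def registers_equal(a: int, b: int) -> bool:
    """The == test for BEQ: a - b is zero exactly when a and b are equal."""
    _, zero = alu("sub", a, b)
    return zero
```

Because the zero output is computed for every operation, the equality test costs no extra ALU function, only a subtract plus the zero detector.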

#### What Hardware Is Needed? (1/2)

- PC: a register which keeps track of the memory address of the next instruction
- General Purpose Registers
  - used in Stages 2 (Read) and 5 (Write)
  - MIPS has 32 of these
- Memory
  - used in Stages 1 (Fetch) and 4 (R/W)
  - cache system makes these two stages as fast as the others, on average



# What Hardware Is Needed? (2/2)

- ALU
  - used in Stage 3
  - something that performs all necessary functions: arithmetic, logicals, etc.
  - we'll design details later
- Miscellaneous Registers
  - In implementations with only one stage per clock cycle, registers are inserted between stages to hold intermediate data and control signals as they travel from stage to stage.
  - Note: Register is a general purpose term meaning something that stores bits. Not all registers are in the "register file".

# **Storage Element: Idealized Memory**

- Memory (idealized)
  - One input bus: Data In
  - One output bus: Data Out
  - [Figure: memory block with ports Write Enable, Address, Data In, Data Out, Clk]
- Memory word is selected by:
  - Address selects the word to put on Data Out
  - Write Enable = 1: address selects the memory word to be written via the Data In bus
- Clock input (CLK)
  - The CLK input is a factor ONLY during write operation
  - During read operation, behaves as a combinational logic block:
    - Address valid ⇒ Data Out valid after "access time."
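
The read/write asymmetry described above can be modeled with two methods: a read that depends only on the address, and a write that takes effect only when a clock edge arrives with Write Enable set. The class name `IdealMemory`, the method names, and the dict-backed storage (unwritten words read as 0) are assumptions for this sketch; "access time" is not modeled.

```python
class IdealMemory:
    """Software model of the idealized memory; all names are hypothetical."""

    def __init__(self):
        self._words = {}  # address -> stored word

    def read(self, address: int) -> int:
        """Combinational read: Data Out follows Address, no clock involved."""
        return self._words.get(address, 0)

    def clock_edge(self, address: int, data_in: int, write_enable: bool) -> None:
        """Called once per clock edge; storage changes only if Write Enable is 1."""
        if write_enable:
            self._words[address] = data_in
```

Calling `clock_edge` with `write_enable=False` leaves the contents untouched, which matches the statement that CLK matters only during a write.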



#### **Storage Element: Register (Building Block)**

- Similar to D Flip Flop except
  - N-bit input and output
  - Write Enable input
- Write Enable:
  - negated (or deasserted) (0): Data Out will not change
  - asserted (1): Data Out will become Data In on positive edge of clock
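
A short behavioral sketch of this building block, assuming a hypothetical `Register` class whose `clock_edge` method stands in for the positive clock edge. The initial value of 0 and the default width of 32 bits are illustration choices, not part of the slide.

```python
class Register:
    """N-bit register with Write Enable (hypothetical behavioral model)."""

    def __init__(self, n_bits: int = 32):
        self._mask = (1 << n_bits) - 1
        self.data_out = 0  # assumed power-on value

    def clock_edge(self, data_in: int, write_enable: bool) -> None:
        """Model one positive clock edge."""
        if write_enable:
            # asserted (1): Data Out becomes Data In, truncated to N bits
            self.data_out = data_in & self._mask
        # deasserted (0): Data Out keeps its old value
```

Between clock edges `data_out` never changes, regardless of what is presented on the input.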




#### **Storage Element: Register File**

- Register File consists of 32 registers:
  - Two 32-bit output busses: busA and busB
  - One 32-bit input bus: busW
- Register is selected by:
  - RA (number) selects the register to put on busA (data)
  - RB (number) selects the register to put on busB (data)
  - RW (number) selects the register to be written via busW (data) when Write Enable is 1
- Clock input (clk)
  - The clk input is a factor ONLY during write operation
  - During read operation, behaves as a combinational logic block:
    - RA or RB valid ⇒ busA or busB valid after "access time."
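
The register file's two read ports and one clocked write port can be sketched as follows. The class and method names are hypothetical, registers start at 0 by assumption, and the MIPS convention that register 0 always reads as zero is deliberately not modeled because the slide does not mention it.

```python
class RegisterFile:
    """32 x 32-bit register file with ports RA, RB, RW (hypothetical model)."""

    def __init__(self, count: int = 32):
        self._regs = [0] * count

    def read(self, ra: int, rb: int) -> tuple[int, int]:
        """Combinational read: returns (busA, busB) for register numbers RA and RB."""
        return self._regs[ra], self._regs[rb]

    def clock_edge(self, rw: int, bus_w: int, write_enable: bool) -> None:
        """On a clock edge, register RW takes busW only when Write Enable is 1."""
        if write_enable:
            self._regs[rw] = bus_w & 0xFFFFFFFF
```

Reads can happen at any time and return the current contents, while a write becomes visible only after `clock_edge` has been called with `write_enable=True`.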

#### **Administrivia**

- Assignments
  - HW5 due tonight
  - HW6 due 7/29
- Midterm
  - Grading standards up
  - If you wish to have a problem regraded
    - Staple your reasons to the front of the exam
    - Return your exam to your TA
- Scott is now holding regular OH on Fridays 11-12 in 329 Soda




## Step 3: Assemble DataPath meeting requirements

- Register Transfer Requirements ⇒ Datapath Assembly
- Instruction Fetch
- Read Operands and Execute Operation

CS61C L19 CPU Design : Designing a Single-Cycle CPU


# 3a: Overview of the Instruction Fetch Unit

- The common RTL operations
  - Fetch the Instruction: mem[PC]
  - Update the program counter:
    - Sequential Code: PC ← PC + 4
    - Branch and Jump: PC ← "something else"
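
The two common operations of the fetch unit can be sketched together. This is only an illustration: the function name `fetch`, the dict used as instruction memory, and the `branch_taken` flag are assumptions, and the branch target uses the BEQ register transfer (PC + 4 plus the sign-extended immediate shifted left by two). Jumps are left out because their target computation is not covered here.

```python
MASK32 = 0xFFFFFFFF


def sign_ext(imm16: int) -> int:
    """Interpret the low 16 bits as a signed two's-complement value."""
    imm16 &= 0xFFFF
    return imm16 - 0x10000 if imm16 & 0x8000 else imm16


def fetch(imem: dict, pc: int, branch_taken: bool = False, imm16: int = 0):
    """Return (instruction, next_pc) for one pass through the fetch unit."""
    instruction = imem[pc]  # Fetch the Instruction: mem[PC]
    next_pc = pc + 4        # Sequential Code: PC <- PC + 4
    if branch_taken:
        next_pc += sign_ext(imm16) << 2  # taken branch: add the word offset in bytes
    return instruction, next_pc & MASK32
```

For example, with `imem = {0: "i0", 4: "i1"}`, calling `fetch(imem, 0)` returns the instruction at address 0 together with the sequential next PC.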




#### 3b: Add & Subtract

- R[rd] = R[rs] op R[rt]   Ex.: addU rd,rs,rt
- Ra, Rb, and Rw come from instruction's Rs, Rt, and Rd fields

| op | rs | rt | rd | shamt | funct |
| --- | --- | --- | --- | --- | --- |
| 6 bits | 5 bits | 5 bits | 5 bits | 5 bits | 6 bits |

- ALUCtr and RegWr: control logic after decoding the instruction
- Already defined the register file & ALU

[Figure: register file of 32 32-bit registers with RegWr, 5-bit selects Rw, Ra, Rb (from Rd, Rs, Rt), input busW, outputs busA and busB; ALU output labeled Result]
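
The add/subtract datapath can be traced in software as a rough sketch. The function names and the list used as a register file are hypothetical. The bit positions follow the field widths in the format above, read from bit 31 downward, and the funct codes 0x21 (addu) and 0x23 (subu) come from the standard MIPS encoding rather than from this lecture.

```python
def decode_r_type(word: int) -> dict:
    """Split a 32-bit R-type instruction into its six fields."""
    return {
        "op": (word >> 26) & 0x3F,     # 6 bits
        "rs": (word >> 21) & 0x1F,     # 5 bits
        "rt": (word >> 16) & 0x1F,     # 5 bits
        "rd": (word >> 11) & 0x1F,     # 5 bits
        "shamt": (word >> 6) & 0x1F,   # 5 bits
        "funct": word & 0x3F,          # 6 bits
    }


def r_type_step(word: int, regs: list) -> None:
    """Run one addu/subu through the register file and ALU, updating regs in place."""
    f = decode_r_type(word)
    bus_a, bus_b = regs[f["rs"]], regs[f["rt"]]        # Ra = Rs, Rb = Rt
    alu_ctr = {0x21: "add", 0x23: "sub"}[f["funct"]]   # ALUCtr from decoding funct
    result = bus_a + bus_b if alu_ctr == "add" else bus_a - bus_b
    reg_wr = True                                      # RegWr is asserted for both
    if reg_wr:
        regs[f["rd"]] = result & 0xFFFFFFFF            # Rw = Rd, busW = Result
```

As a usage example, the word `0x00221821` encodes `addu $3, $1, $2`, so running it through `r_type_step` writes the sum of registers 1 and 2 into register 3.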























#### **Peer Instruction**

A. For the CPU designed so far, the Controller only needs to look at opcode/funct and Equal

B. Adding jal would only require changing the Instruction Fetch block

C. Making our single-cycle CPU multi-cycle will be easy

Answer choices (ABC): 0: FFF, 1: FFT, 2: FTF, 3: FTT, 4: TFF, 5: TFT, 6: TTF, 7: TTT

#### **How to Design a Processor: step-by-step**

1. Analyze instruction set architecture (ISA) ⇒ datapath requirements
   - meaning of each instruction is given by the register transfers
   - datapath must include storage element for ISA registers
   - datapath must support each register transfer
2. Select set of datapath components and establish clocking methodology
3. Assemble datapath meeting requirements
4. Analyze implementation of each instruction to determine setting of control points that effects the register transfer


