

EECS 151/251A
Spring 2020
Digital Design and
Integrated Circuits

Instructor:
John Wawrzynek

Lecture 12: Timing part 2

#### **Announcements**

- Virtual Front Row for today 2/25:
  - Naomi Sagan
  - Peter Trost
  - Jiefeng Chen
  - Rajiv Govindjee
  - James Shi
- Questions/comments used in class participation points.
- Homework assignment 5 posted, due Monday. Explanations are important!

# Modeling Gate Delay

#### Inverter Transient Response

#### With:

resistive approximation for FETs, high-to-low (HL)

$$V(t) = V_0 e^{-t/RC}$$
$$t_{1/2} = \ln(2) \times RC$$

$$t_{pHL} = f(R_{on}C_L) = 0.69 R_n C_L$$
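As a quick numeric check of the 0.69 RC rule, the sketch below evaluates the high-to-low delay and confirms that the output has fallen to half its starting value at that time. The resistance, capacitance, and voltage values are illustrative assumptions, not process data from the lecture.

```python
import math

def t_phl(r_on, c_load):
    """High-to-low delay: time for V(t) = V0 * exp(-t / RC) to fall to V0 / 2."""
    return math.log(2) * r_on * c_load

def v_out(t, v0, r_on, c_load):
    """Output voltage of the discharging RC node at time t."""
    return v0 * math.exp(-t / (r_on * c_load))

# Illustrative values only (assumptions, not process data).
R_ON = 10e3    # ohms
C_L = 1e-15    # farads
V0 = 1.0       # volts

delay = t_phl(R_ON, C_L)
print(f"t_pHL = {delay * 1e12:.2f} ps")
print(f"V(t_pHL) = {v_out(delay, V0, R_ON, C_L):.3f} V")
```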

#### The Switch – Dynamic Model (Simplified)



## Switch Sizing

What happens if we make a MOSFET W times larger (wider)?



#### Switch Parasitic Model

#### The pull-down switch (NMOS)



Minimum-size switch



Sizing the transistor (factor *W*)

We assume transistors of minimal length (or at least constant length). R's and C's are given per unit width.

#### pFET Switch Parasitic Model

For traditional CMOS processes, pFETs are ~twice as resistive as nFETs (hole mobility is about 1/2 that of electrons).

#### The pull-up switch (PMOS)



Minimum-size switch

Sized for symmetry

General sizing

#### "Balanced" Inverter Parasitic Model



### Inverter with Load Capacitance



$$t_{p} = 0.69(R_{N}/W)(C_{int} + C_{L})$$

$$= 0.69(R_{N}/W)(3W\gamma C_{G} + C_{L})$$

Factor out $3W\gamma C_{G}$ and substitute $C_{in} = 3WC_{G}$:

$$= 0.69(3\gamma R_{N}C_{G})\left(1 + \frac{C_{L}}{\gamma C_{in}}\right)$$

$$= t_{p0}\left(1 + \frac{C_{L}}{\gamma C_{in}}\right) = t_{p0}(1 + f/\gamma)$$

f = fanout = ratio of load capacitance ( $C_L$ ) to input capacitance ( $C_{in}$ )

## Inverter Delay Model

Delay increases linearly with fanout, f. For f=0, delay is the intrinsic inverter delay $t_{inv}=t_{p0}$

$$t_p = t_{p0}(1 + f/\gamma)$$



Question: how does transistor sizing (W) impact delay?
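One way to explore this question numerically is sketched below, using the balanced-inverter expression $t_p = 0.69(R_N/W)(3W\gamma C_G + C_L)$. The values of `r_n`, `c_g`, and the loads are illustrative assumptions. With a fixed load, widening the inverter reduces delay toward $t_{p0}$; with a fixed fanout, W cancels out.

```python
def delay_from_rc(w, c_load, r_n=10e3, c_g=1e-15, gamma=1.0):
    """t_p = 0.69 (R_N / W)(3 W gamma C_G + C_L) for a balanced inverter of size W."""
    c_int = 3 * w * gamma * c_g
    return 0.69 * (r_n / w) * (c_int + c_load)

def delay_from_fanout(f, r_n=10e3, c_g=1e-15, gamma=1.0):
    """t_p = t_p0 (1 + f / gamma), with t_p0 = 0.69 * 3 * gamma * R_N * C_G."""
    t_p0 = 0.69 * 3 * gamma * r_n * c_g
    return t_p0 * (1 + f / gamma)

# Fixed load (assumed 12 fF): a wider inverter is faster.
fixed_load = [delay_from_rc(w, c_load=12e-15) for w in (1, 2, 4, 8)]

# Fixed fanout f = 4 (load is 4x the gate's own input capacitance 3*W*C_G).
fixed_fanout = [delay_from_rc(w, c_load=4 * 3 * w * 1e-15) for w in (1, 2, 4, 8)]

print([f"{t * 1e12:.1f} ps" for t in fixed_load])
print([f"{t * 1e12:.1f} ps" for t in fixed_fanout])
```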

### 2-input NAND Gate

- Let's derive the formula for gate propagation delay as a function of fanout, f (as with the inverter)
- We derive the equations based on the input connected to the transistor closest to the output (A), assuming the B input has been set to 1 (for a long time)
- So we can fairly compare to the inverter, size the transistors so that the capacitance of each input is equivalent to the input capacitance of the inverter
- Assume that the resistance of the pFET is twice that of the nFET (Rp = 2Rn) if the pFET and nFET have the same width
- Size the transistors so that the rise and fall times are equivalent
- For the 2 transistors in series, ignore the capacitance at their shared node



## 2-input NAND Gate

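#### Solution setup:

- To keep the pull-up and pull-down delays the same, Wp = Wn: the two series nFETs have twice the resistance of one, which matches a single pFET of the same width because Rp = 2Rn.
- Remember, the inverter had input C of 3C<sub>G</sub>, with Wp = 2Wn.
- Therefore, here Wn = Wp = 3/2 times the inverter's nFET width (each input still sees 3C<sub>G</sub>), so each transistor's R changes by 2/3 (shown in figure).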
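A sketch of the resulting NAND derivation, following the same steps as the inverter. It assumes the output node carries the drains of both pFETs and of the top nFET, each of width 3W/2, and writes $C_D = \gamma W C_G$ for the drain capacitance of a width-W transistor (notation assumed here). The pull-down path is two nFETs of resistance $2R_N/(3W)$ in series, and a single pFET pull-up has the same total value, $4R_N/(3W)$:

$$C_{int} = 3 \times (3/2)C_D = (9/2)C_D$$

$$t_p = 0.69 \left(\frac{4R_N}{3W}\right) (C_{int} + C_L)$$

$$= 0.69 \left(\frac{R_N}{W}\right) \left(6 \gamma W C_G + \frac{4C_L}{3}\right)$$

$$= 0.69 \cdot 3R_N \gamma C_G \left[2 + \frac{4C_L}{3\gamma C_{IN}}\right] = t_{p0} \left(2 + \frac{4f}{3\gamma}\right)$$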






# 2-input NOR Gate

$$C_{int} = 2 \times (3/5)C_D + (12/5)C_D = (18/5)C_D$$

$$t_p = 0.69 \left(\frac{5R_N}{3W}\right) (C_{int} + C_L)$$

$$= 0.69 \left(\frac{5R_N}{3W}\right) \left(\frac{18}{5} \gamma W C_G + C_L\right)$$

$$= 0.69 \left(\frac{R_N}{W}\right) \left(6 \gamma W C_G + \frac{5C_L}{3}\right)$$

$$= 0.69 \left(\frac{R_N}{W}\right) 3 \gamma W C_G \left(2 + \frac{5C_L}{3(3\gamma W C_G)}\right)$$

$$= 0.69 \cdot 3R_N \gamma C_G \left[2 + \frac{5C_L}{3\gamma C_{IN}}\right]$$

$$= t_{p0} \left(2 + \frac{5f}{3\gamma}\right)$$

## Gate Delay Summary



The y-intercepts (intrinsic delay) for NAND and NOR are both twice that of the inverter. The NAND line has a gradient 4/3 that of the inverter (steeper); for NOR it is 5/3 (steepest).
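The comparison can be tabulated with a small sketch. The coefficients come from the delay formulas in this lecture; the choice of $\gamma = 1$ and $t_{p0} = 1$ (delay in units of $t_{p0}$) is an assumption made only for illustration.

```python
# (intrinsic term, fanout slope) in t_p = t_p0 * (intrinsic + slope * f / gamma)
GATES = {
    "inverter": (1.0, 1.0),
    "NAND2": (2.0, 4.0 / 3.0),
    "NOR2": (2.0, 5.0 / 3.0),
}

def gate_delay(gate, f, gamma=1.0, t_p0=1.0):
    """Propagation delay of the named gate at fanout f."""
    intrinsic, slope = GATES[gate]
    return t_p0 * (intrinsic + slope * f / gamma)

for f in (0, 1, 2, 4):
    row = ", ".join(f"{g}: {gate_delay(g, f):.2f}" for g in GATES)
    print(f"f = {f} -> {row}")
```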

What about gates with more than 2 inputs?

#### Look at 4-input NAND:

$$t_p = t_{p0} \left( 4 + \frac{2f}{\gamma} \right)$$

The slope is $2/\gamma$, twice that of the inverter.
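The same sizing steps can be sketched for an n-input NAND. This generalization is an extrapolation of the lecture's 2-input and 4-input results, under the same assumptions: Rp = 2Rn at equal width, equal rise and fall resistance, each input capacitance equal to the inverter's $3WC_G$, and internal series-node capacitance ignored. The function name and the normalized units are hypothetical.

```python
def nand_delay_coeffs(n):
    """Return (intrinsic, slope) so that t_p = t_p0 * (intrinsic + slope * f / gamma).

    Widths are in units of the inverter nFET width W.
    """
    w_p = 6.0 / (n + 2)      # from w_n + w_p = 3 with w_n = n * w_p / 2
    w_n = n * w_p / 2.0      # n series nFETs: n * R_N / w_n equals 2 * R_N / w_p
    r = 2.0 / w_p            # drive resistance in units of R_N / W
    c_out = n * w_p + w_n    # n pFET drains + top nFET drain, units of gamma*W*C_G
    # t_p0 = 0.69 * 3 * gamma * R_N * C_G, hence the division by 3.
    return r * c_out / 3.0, r

for n in (1, 2, 3, 4):
    intrinsic, slope = nand_delay_coeffs(n)
    print(f"NAND{n}: t_p = t_p0 * ({intrinsic:.2f} + {slope:.2f} f / gamma)")
```

For n = 1 this reduces to the inverter, and for n = 2 and n = 4 it reproduces the coefficients quoted in the lecture.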

# Adding Wires to Gate Delay

Wires have finite resistance, so have distributed R and C:

With r = resistance per unit length and c = capacitance per unit length, $\Delta t \propto rcL^2 \cong rc + 2rc + 3rc + \dots$





Wire propagation delay is around half of what it would be if R and C were "lumped": $t_p = 0.38(rL \cdot cL) = 0.38rcL^2$
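The quadratic dependence on length can be checked with a short sketch. The per-length values below are illustrative assumptions. The `ladder_sum` helper (a hypothetical name) adds up the rc + 2rc + 3rc + ... terms for a wire cut into equal lumped segments, which tends to about half of the fully lumped product $rL \cdot cL$ as the number of segments grows.

```python
def wire_delay(r, c, length):
    """Distributed-RC wire delay, t_p = 0.38 * r * c * L^2."""
    return 0.38 * r * c * length ** 2

def ladder_sum(r, c, length, n_segments):
    """Sum rc + 2rc + 3rc + ... for the wire cut into n equal lumped segments."""
    seg = length / n_segments
    return sum(k * (r * seg) * (c * seg) for k in range(1, n_segments + 1))

# Illustrative values only (assumptions): ohms per um and farads per um.
R_PER_UM = 0.1
C_PER_UM = 0.2e-15

for length_um in (500.0, 1000.0, 2000.0):
    t = wire_delay(R_PER_UM, C_PER_UM, length_um)
    print(f"L = {length_um:.0f} um -> t_p = {t * 1e12:.2f} ps")

lumped = R_PER_UM * C_PER_UM * 1000.0 ** 2
print(f"ladder / lumped = {ladder_sum(R_PER_UM, C_PER_UM, 1000.0, 1000) / lumped:.4f}")
```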

# Gate Driving Long Wire and Other Gates



$$R_w = r_w L, \quad C_w = c_w L$$

$$t_p = 0.69R_{dr}C_{int} + 0.69R_{dr}C_w + 0.38R_wC_w + 0.69R_{dr}C_{fan} + 0.69R_wC_{fan}$$
$$= 0.69R_{dr}(C_{int} + C_{fan}) + 0.69(R_{dr}c_w + r_wC_{fan})L + 0.38r_wc_wL^2$$
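A minimal sketch of this delay model, written both term by term and grouped by powers of L, where the linear coefficient uses the per-unit-length wire values ($R_{dr}c_w + r_wC_{fan}$). All numeric values are illustrative assumptions.

```python
def delay_term_by_term(r_dr, c_int, c_fan, r_w, c_w, length):
    """Sum the five RC terms with R_w = r_w * L and C_w = c_w * L."""
    big_r_w = r_w * length
    big_c_w = c_w * length
    return (0.69 * r_dr * c_int + 0.69 * r_dr * big_c_w + 0.38 * big_r_w * big_c_w
            + 0.69 * r_dr * c_fan + 0.69 * big_r_w * c_fan)

def delay_by_powers_of_l(r_dr, c_int, c_fan, r_w, c_w, length):
    """Same delay grouped as constant + linear + quadratic terms in L."""
    return (0.69 * r_dr * (c_int + c_fan)
            + 0.69 * (r_dr * c_w + r_w * c_fan) * length
            + 0.38 * r_w * c_w * length ** 2)

# Illustrative values only (assumptions).
params = dict(r_dr=5e3, c_int=3e-15, c_fan=6e-15, r_w=0.1, c_w=0.2e-15)

for length_um in (100.0, 1000.0, 5000.0):
    a = delay_term_by_term(length=length_um, **params)
    b = delay_by_powers_of_l(length=length_um, **params)
    print(f"L = {length_um:.0f} um -> {a * 1e12:.1f} ps (grouped: {b * 1e12:.1f} ps)")
```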

# **Driving Large Loads**

Large fanout nets: clocks, resets, memory bit lines, off-chip

Relatively small driver results in long rise time (and thus large gate delay)

Strategy: staged buffers



- How to design for optimal performance (least delay)?
- Total delay is minimized when every stage has equal delay.

# **Driving Large Loads**



- For some given $C_L$:
  - How many stages are needed to minimize delay?
  - How to size the inverters?
- We get the smallest delay if we build one very big inverter
  - So big that delay is set only by self-loading



- Not an interesting solution. Why?
  - Something has to drive this inverter (a big inverter has a large input capacitance!) ...

# **Delay Optimization**

[Figure: inverter chain, In to Out]

- First assume given:
  - A <u>fixed</u> number of inverters
  - The size\* of the first inverter
  - The size of the load that needs to be driven
- What is the minimal delay of the inverter chain?

<sup>\*</sup> Note: when we talk about inverter (or gate) "size", we refer to the width of the transistors making up the circuit.



Delay for the *j-th* inverter stage:

$$t_{p,j} = t_{p0} \left( 1 + \frac{C_{g,j+1}}{\gamma C_{g,j}} \right) = t_{p0} (1 + f_j/\gamma)$$

Total delay of the chain:

$$t_p = \sum_{j=1}^{N} t_{p,j} = t_{p0} \sum_{j=1}^{N} \left( 1 + \frac{C_{g,j+1}}{\gamma C_{g,j}} \right), \qquad C_{g,N+1} = C_L$$

#### Optimum Delay and Number of Stages

- Each inverter should be sized up by the same factor f with respect to the preceding inverter
- Therefore each stage has the same delay
- Given $C_{g,1}$ and $C_L$:

$$f = \sqrt[N]{C_L/C_{g,1}} = \sqrt[N]{F}$$

- Where *F* represents the overall fan-out of the circuit
- Therefore the minimal delay through the chain is:

$$t_p = N \cdot t_{p0} (1 + \sqrt[N]{F}/\gamma)$$
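The sizing rule can be sketched as follows: compute the per-stage factor from F and N, build the chain, and evaluate the total delay from the per-stage formula. The function names, the unit first-stage capacitance, and $\gamma = 1$ are assumptions for illustration. A deliberately uneven chain is included to show that it is slower than the geometric one.

```python
def size_chain(c_g1, c_load, n_stages):
    """Return (f, input capacitances) for a geometrically tapered inverter chain."""
    overall_fanout = c_load / c_g1
    f = overall_fanout ** (1.0 / n_stages)
    sizes = [c_g1 * f ** j for j in range(n_stages)]
    return f, sizes

def chain_delay(sizes, c_load, gamma=1.0, t_p0=1.0):
    """Total delay: sum over stages of t_p0 * (1 + C_{g,j+1} / (gamma * C_{g,j}))."""
    caps = list(sizes) + [c_load]
    return t_p0 * sum(1.0 + caps[j + 1] / (gamma * caps[j]) for j in range(len(sizes)))

f, sizes = size_chain(c_g1=1.0, c_load=64.0, n_stages=3)
print(f"f = {f:.3f}, sizes = {[round(s, 3) for s in sizes]}")
print(f"geometric chain delay = {chain_delay(sizes, 64.0):.3f} t_p0")
print(f"uneven chain delay    = {chain_delay([1.0, 2.0, 16.0], 64.0):.3f} t_p0")
```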

#### **Example**



 $C_L/C_1$  has to be evenly distributed across N=3 stages:

$$f = \sqrt[N]{C_L/C_{g,1}} = \sqrt[N]{F} = \sqrt[3]{8} = 2$$
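Worked out under the assumption that the load is $C_L = 8C_1$ (which is what F = 8 implies), each stage is twice the size of the previous one:

$$C_{g,1} = C_1, \quad C_{g,2} = 2C_1, \quad C_{g,3} = 4C_1, \quad C_L = 8C_1$$

$$t_p = 3 \cdot t_{p0} (1 + 2/\gamma)$$

For $\gamma = 1$ this gives $t_p = 9\,t_{p0}$.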

## **Delay Optimization**

- Now assume given:
  - The size of the first inverter
  - The size of the load that needs to be driven
- Minimize delay by finding optimal number and sizes of inverters
- So, need to find N that minimizes:

$$t_p = N \cdot t_{p0} (1 + \sqrt[N]{F}/\gamma), \quad F = C_L/C_{g,1}$$

## Finding optimal fanout per stage

$$t_p = N \cdot t_{p0} (1 + \sqrt[N]{F}/\gamma), \quad F = C_L/C_{g,1}$$

Differentiate with respect to N and set to 0:

$$\gamma + \sqrt[N]{F} - \frac{\sqrt[N]{F} \ln F}{N} = 0$$

Substituting $f = \sqrt[N]{F}$ (so $\ln f = \ln F / N$):

$$\Rightarrow f = e^{(1+\gamma/f)}$$

Closed form only if $\gamma = 0 \Rightarrow N = \ln(F),\ f = e$
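For $\gamma > 0$ the equation $f = e^{(1+\gamma/f)}$ has to be solved numerically. The sketch below uses plain fixed-point iteration starting from e; this solver choice and the function names are assumptions, and any root finder would do. The stage count it returns is generally not an integer and would be rounded in a real design.

```python
import math

def optimal_fanout(gamma, tol=1e-12, max_iter=200):
    """Solve f = exp(1 + gamma / f) by fixed-point iteration, starting at f = e."""
    f = math.e
    for _ in range(max_iter):
        f_next = math.exp(1.0 + gamma / f)
        if abs(f_next - f) < tol:
            return f_next
        f = f_next
    return f

def optimal_stage_count(overall_fanout, gamma):
    """Continuous optimum N = ln(F) / ln(f) for the optimal per-stage fanout f."""
    return math.log(overall_fanout) / math.log(optimal_fanout(gamma))

for gamma in (0.0, 0.5, 1.0, 2.0):
    print(f"gamma = {gamma}: f_opt = {optimal_fanout(gamma):.3f}")
```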

#### Optimum Effective Fanout f

Optimum f for a given process is determined by $\gamma$



## In Practice: Plot of Total Delay



[Hodges, p.281]

- Why the shape?
- Curves very flat for f > 2
  - Simplest/most common choice: f = 4
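The flatness can be seen from the chain-delay formula itself. Treating N as continuous ($N = \ln F / \ln f$, an idealization since real chains need an integer number of stages), the total delay is $t_{p0} \ln F \cdot (1 + f/\gamma)/\ln f$. The sketch below evaluates the f-dependent factor for a few fanouts, assuming $\gamma = 1$.

```python
import math

def relative_chain_delay(f, gamma=1.0):
    """Chain delay divided by t_p0 * ln(F), with N = ln(F) / ln(f) taken as continuous."""
    return (1.0 + f / gamma) / math.log(f)

samples = [1.5, 2.0, 3.0, 3.6, 4.0, 5.0, 6.0]
best = min(relative_chain_delay(f) for f in samples)

for f in samples:
    ratio = relative_chain_delay(f) / best
    print(f"f = {f}: {100.0 * (ratio - 1.0):.1f}% above the best sampled value")
```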

#### **Transistor Sizing in Logic Circuits**

- Similar optimization challenges exist within all combinational logic blocks. How do we size transistors to minimize the delay of a given circuit?
- ASIC standard cell libraries include cells with various output drive strengths (transistor sizes)
- Tools will automatically choose the proper size and/or add buffers to minimize critical path logic delay
- Hand methods exist for minimizing logic path delay:





#### **End of lecture 12**