# ESE534: Computer Organization

Day 15: March 14, 2012
Interconnect 2: Wiring
Requirements and Implications

Penn ESE534 Spring2012 -- DeHon



# Previously

- Identified need for Interconnect
- Seen that interconnect can be expensive
- Identified need to understand/exploit structure in our interconnect design


# Today

- Wiring Requirements
- Rent's Rule
  - A model of structure
- Implications


### Wires and VLSI

- Simple VLSI model
  - Gates have fixed size (A<sub>gate</sub>)
  - Wires have finite spacing (W<sub>wire</sub>)
  - Have a small, finite number of wiring layers
    - E.g.
      - one for horizontal wiring
      - one for vertical wiring
  - Assume wires can run over gates


### Visually: Wires and VLSI




### Preclass 1

- How many 40F×40F gates fit in a 25,000F×25,000F region?
- How many wires can go in and out?
- Ratio?
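One way to work these numbers is sketched below. The gate and region sizes come from the question; the wire pitch `wire_pitch_f` and `layers_per_direction` are not stated on this slide, so the defaults are placeholder assumptions used only for illustration.

```python
def region_counts(side_f, gate_side_f=40, wire_pitch_f=4, layers_per_direction=1):
    """Gates inside a square region and wires that can cross its perimeter.

    All lengths are in units of F (feature size).
    wire_pitch_f and layers_per_direction are ASSUMED values, not from the slides.
    """
    gates_per_side = side_f // gate_side_f
    gates = gates_per_side ** 2
    # Wires can cross each of the 4 edges: one wire per pitch per layer.
    wires = 4 * (side_f // wire_pitch_f) * layers_per_direction
    return gates, wires, gates / wires


gates, wires, ratio = region_counts(25_000)
print(gates, wires, ratio)
```

With a different assumed pitch the wire count changes, but the gate count depends only on the sizes given in the question.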


### Important Consequence

- A set of wires crossing a line takes up space:

$$W = (N \times W_{wire}) / N_{layers}$$


### Thompson's Argument

- The minimum area of a VLSI component is bounded by the larger of:
  - The area to hold all the gates
    - $A_{chip} \ge N \times A_{gate}$
  - The area required by the wiring
    - $A_{chip} \ge N_{horizontal} W_{wire} \times N_{vertical} W_{wire}$
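The two bounds can be combined in a few lines. This is a minimal sketch; the function name and the numeric values (40F×40F gates, 4F wire pitch) are illustrative assumptions, not values from the slides.

```python
def min_chip_area(n_gates, a_gate, n_horizontal, n_vertical, w_wire):
    """Lower bound on chip area: the larger of gate area and wiring area."""
    gate_area = n_gates * a_gate
    wire_area = (n_horizontal * w_wire) * (n_vertical * w_wire)
    return max(gate_area, wire_area)


# Hypothetical numbers: A_gate = 1600 F^2, W_wire = 4 F (both assumed).
print(min_chip_area(1000, 1600, 100, 100, 4))    # gate area is the larger bound
print(min_chip_area(1000, 1600, 1000, 1000, 4))  # wiring area is the larger bound
```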


# How many wires?

- We can get a lower bound on the total number of horizontal (vertical) wires by considering the bisection of the computational graph:
  - Cut the graph of gates in half
  - Minimize connections between halves
  - Count number of connections in cut
  - Gives a lower bound on number of wires
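For very small graphs the cut-and-count procedure can be done exhaustively. The sketch below is an illustration only: `min_bisection` is a hypothetical helper, it assumes an even number of nodes, and its brute-force search is far too slow for real designs, where heuristic partitioners are used instead.

```python
from itertools import combinations


def min_bisection(nodes, edges):
    """Smallest number of edges cut by any split into two equal halves.

    Exhaustive search over all halves; only practical for tiny graphs.
    """
    nodes = list(nodes)
    half = len(nodes) // 2
    best = None
    for left in combinations(nodes, half):
        left = set(left)
        # An edge is cut when its endpoints fall on different sides.
        cut = sum(1 for u, v in edges if (u in left) != (v in left))
        if best is None or cut < best:
            best = cut
    return best


# An 8-node ring: splitting it into two contiguous arcs cuts 2 edges.
ring = [(i, (i + 1) % 8) for i in range(8)]
print(min_bisection(range(8), ring))
```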




### **Next Question**

- In general, if we:
  - Cut the design in half
  - Minimize the cut wires
- How many wires will be in the bisection?

[Figure: design split into two halves of N/2 gates each, with the cutsize marked between them]


# **Arbitrary Graph**

- Graph with N nodes
- Cut in half
  - N/2 gates on each side
- Worst-case?
  - Every gate output on each side is used somewhere on the other side
  - Cut contains N wires


# **Arbitrary Graph**

- For a random graph
  - Something proportional to this is likely
- That is:
  - Given a random graph with N nodes
  - The number of wires in the bisection is likely to be: $c \times N$


# Particular Computational Graphs

- Some important computations have exactly this property
  - FFT (Fast Fourier Transform)
  - Sorting




### **FFT**

- Can implement with N/2 nodes
  - Group row together
- Any bisection will cut N/2 wire bundles
  - True for any reordering




# Assembling what we know

- $A_{chip} \ge N \times A_{gate}$
- $A_{chip} \ge N_{horizontal} W_{wire} \times N_{vertical} W_{wire}$
- $N_{horizontal} = c \times N$
- $N_{vertical} = c \times N$
  - [bound true recursively in graph]
- $A_{chip} \ge cN W_{wire} \times cN W_{wire}$


# Assembling ...

- $A_{chip} \ge N \times A_{gate}$
- $A_{chip} \ge cN W_{wire} \times cN W_{wire}$
- $A_{chip} \ge (cN W_{wire})^2$
- $A_{chip} \ge N^2 \times C'$, where $C' = (c W_{wire})^2$


### Result

- $A_{chip} \ge N \times A_{gate}$
- $A_{chip} \ge N^2 \times C'$
- Wire area grows faster than gate area
- Wire area grows with the square of gate area
- For sufficiently large N,
  - Wire area dominates gate area
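Setting the two bounds equal, $N \times A_{gate} = N^2 \times C'$ with $C' = (c W_{wire})^2$, gives the gate count at which wiring starts to dominate. The sketch below evaluates that crossover; the constants used are assumed for illustration only.

```python
def crossover_gate_count(a_gate, c, w_wire):
    """N above which the wiring bound N^2 * C' exceeds the gate bound N * A_gate."""
    c_prime = (c * w_wire) ** 2
    return a_gate / c_prime


# Assumed illustrative values: A_gate = 1600 F^2, W_wire = 4 F, c = 0.5.
print(crossover_gate_count(a_gate=1600, c=0.5, w_wire=4))
```

A smaller bisection constant `c` pushes the crossover to larger N, which is one reason the constant matters as well as the exponent.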


### Preclass 2

- How does the ratio change for a 100,000F×100,000F region?


### Intuitive Version

- Consider a region of a chip
- Gate capacity in the region goes as area (s²)
- Wiring capacity into the region goes as perimeter (4s)
- Perimeter grows more slowly than area
  - Wire capacity saturates before gate capacity
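The area-versus-perimeter argument can be checked numerically. In this sketch the 40F gate side matches the preclass questions, while the 4F wire pitch and the single wiring layer per direction are assumptions made only for illustration.

```python
def gates_per_perimeter_wire(side_f, gate_side_f=40, wire_pitch_f=4):
    """Ratio of gate capacity (area) to wire capacity (perimeter) for a square.

    wire_pitch_f is an ASSUMED value; one wiring layer per direction is assumed.
    """
    gates = (side_f / gate_side_f) ** 2   # grows as s^2
    wires = 4 * side_f / wire_pitch_f     # grows as 4s
    return gates / wires


for side in (25_000, 50_000, 100_000):
    print(side, gates_per_perimeter_wire(side))
```

Whatever pitch is assumed, the ratio grows linearly with the side length, so each wire crossing the boundary must serve more and more gates as the region grows.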


### Result

- $A_{chip} \ge N^2 \times C'$
- Wire area grows with the square of gate area
- Troubling:
  - To **double** the size of our computation
  - Must **quadruple** the size of our chip!


### So what?

What do we do with this observation?


### First Observation

- Not all designs have a bisection this large
- What is typical?


# **Favorite Design Elements**

- What are your favorite computing design elements?
- What are the bisection bandwidths for these elements?






### Architecture ⇔ Structure

- Typical architecture trick:
  - exploit expected problem structure
- What structure do we have?
- Impact on resources required?


### **Bisection Bandwidth**

- Bisection bandwidth of design
  - → lower bound on wire crossings
  - Important, first-order property of a design
  - Measure to characterize
    - Rather than assume worst case
- Design with more locality
  - → lower bisection bandwidth
- Enough?




# **Characterizing Locality**

- Single cut does not capture locality within halves
- Cut again



# Regularizing Growth

- How do bisection bandwidths shrink (grow) at different levels of bisection hierarchy?
- Basic assumption: Geometric
  - $1$
  - $1/\alpha$
  - $1/\alpha^2$






# Rent's Rule

- In the world of circuit design, an empirical relationship to capture the IO of a region as a function of its gate count N:

$$IO = c N^p$$

- 0≤p≤1
- p characterizes interconnect richness
- Typical: 0.5≤p≤0.7
- "High-Speed" Logic p=0.67
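A minimal sketch of Rent's Rule as a function follows. The values of `c` and `p` are assumed for illustration; in practice they are fit to a particular design. The last lines show that halving the gate count divides IO by the constant factor $2^p$, which is one way to read a geometric change in bandwidth per level of bisection.

```python
def rent_io(n, c, p):
    """Rent's Rule estimate of IO for a region of n gates: IO = c * n**p."""
    return c * n ** p


# Assumed illustrative parameters, not measured from any design.
c, p = 4.0, 0.67

n = 4096
while n >= 256:
    print(n, round(rent_io(n, c, p), 1))
    n //= 2

# Ratio of IO between a region and one of its halves is 2**p at every level.
ratio = rent_io(2048, c, p) / rent_io(1024, c, p)
print(ratio, 2 ** p)
```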


# Rent and Locality

- Rent and IO quantifying locality
  - local consumption
  - local fanout



### As a function of Bisection

- $A_{chip} \ge N \times A_{gate}$
- $A_{chip} \ge N_{horizontal} W_{wire} \times N_{vertical} W_{wire}$
- $N_{horizontal} = N_{vertical} = IO = cN^p$
- $A_{chip} \ge (c N^p W_{wire})^2 \propto N^{2p}$
- If p<0.5

$$A_{chip} \propto N$$

- If p>0.5

$$A_{chip} \propto N^{2p}$$

### In terms of Rent's Rule

- If p<0.5, $A_{chip} \propto N$
- If p>0.5, $A_{chip} \propto N^{2p}$
- Typical designs have p>0.5
  - → interconnect dominates
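The two regimes can be seen by taking the larger of the gate bound and the Rent-based wiring bound and estimating how fast it grows with N. This is a sketch under assumed constants (`a_gate`, `c`, `w_wire` are illustrative, not from the slides); only the exponents are the point.

```python
import math


def chip_area_bound(n, p, a_gate=1600.0, c=1.0, w_wire=4.0):
    """Larger of the gate-area bound and the Rent-based wiring bound.

    a_gate, c, and w_wire are ASSUMED illustrative values.
    """
    gate_area = n * a_gate
    wire_area = (c * n ** p * w_wire) ** 2
    return max(gate_area, wire_area)


def growth_exponent(p, n=10**9):
    """Numerical estimate of the exponent k in A ~ N^k at large n."""
    return math.log2(chip_area_bound(2 * n, p) / chip_area_bound(n, p))


for p in (0.3, 0.5, 0.7):
    print(p, round(growth_exponent(p), 3))
```

For p below 0.5 the estimate settles at 1 (area tracks gate count); for p above 0.5 it approaches 2p.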


### What does this tell us about design?

- Recursive bandwidth requirements in network
  - lower bound on resource requirements
- N.B. necessary but not sufficient condition on network design
  - I.e. design must also be able to use the wires


# Capacity Impact

- Rent: $IO = C \times N^p$
- p>0.5
- $A = C \times N^{2p}$
- $N = (A/C)^{1/(2p)}$
- Logical Area ∝ $(1/S)^2$
- $N' = \left(\frac{(1/S)^2 A}{C}\right)^{1/(2p)}$
- $N' = (A/C)^{1/(2p)} \times \left((1/S)^2\right)^{1/(2p)}$
- $N' = N \times \left((1/S)^2\right)^{1/(2p)}$
- $N' = N \times (1/S)^{1/p}$


- Sanity Check
  - p=1
    - $N' = N/S$
  - p~0.5
    - $N' \sim N/S^2$
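The final line of the derivation, $N' = N \times (1/S)^{1/p}$, and the sanity check can be evaluated directly. This is a minimal sketch; the starting gate count and the scaling factor S = 0.5 are arbitrary example values.

```python
def scaled_capacity(n, s, p):
    """Gate capacity after logical area grows by (1/s)**2, in the p > 0.5 regime.

    Implements N' = N * (1/s)**(1/p).
    """
    return n * (1.0 / s) ** (1.0 / p)


n = 1_000_000
s = 0.5  # example: feature size halves
print(scaled_capacity(n, s, p=1.0))    # matches N / S
print(scaled_capacity(n, s, p=0.5))    # matches N / S^2
print(scaled_capacity(n, s, p=0.67))   # in between
```

For a typical p between 0.5 and 1, the usable gate count grows more slowly than the raw area would suggest.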

# What does this tell us about design?

- Interconnect lengths
  - Intuition
    - if p>0.5, everything cannot be nearest neighbor
    - as p grows, so do wire distances



Can think of p as dimensionality: p=1-1/d
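The relation p = 1 - 1/d mirrors the fact that a d-dimensional block holding N nodes has a boundary proportional to $N^{(d-1)/d}$. A small sketch of the conversion in both directions follows; the function names are hypothetical.

```python
def rent_exponent(d):
    """Rent exponent associated with a d-dimensional layout: p = 1 - 1/d."""
    return 1.0 - 1.0 / d


def dimension(p):
    """Inverse relation: d = 1 / (1 - p), defined for p < 1."""
    return 1.0 / (1.0 - p)


print(rent_exponent(2))   # a 2D mesh
print(rent_exponent(3))   # a 3D mesh
print(dimension(0.67))    # close to 3
```

Read this way, a design with p above 0.5 wants more than the two dimensions a planar chip provides, which is why its wires cannot all be nearest neighbor.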


# Preclass 5

- 24,000F side, 40F×40F gates
- Wire length?











# Preclass 6

- How many gates reachable with 800F of wiring?
- How many gates reachable with 1600F of wiring?
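One way to count reachable gates is sketched below. It assumes gates sit on a square grid with a 40F pitch (the gate size used in the other preclass questions), that wires follow Manhattan routes, and that wires can run over gates; under those assumptions there are 4k grid positions at Manhattan distance exactly k.

```python
def gates_within(wire_length_f, gate_pitch_f=40):
    """Gates reachable from one gate with at most wire_length_f of wiring.

    Assumes a square grid of gates with pitch gate_pitch_f and Manhattan
    routing. The starting gate itself is not counted.
    """
    d = wire_length_f // gate_pitch_f  # reach measured in gate pitches
    # 4*k positions lie at Manhattan distance exactly k from the start.
    return sum(4 * k for k in range(1, d + 1))


print(gates_within(800))
print(gates_within(1600))
```

Doubling the wire length roughly quadruples the number of reachable gates, since the reachable region is a two-dimensional diamond.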



### Preclass 7

- Depth 20 circuit, 2-input gates
  - Maximum number of gates?
    - Topology?
  - Minimum distance?
  - Lower bound maximum wire length?
- Depth 24 circuit
  - Lower bound maximum length?
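One way to reason about these questions is sketched below, under stated assumptions: the largest circuit is taken to be a full tree of 2-input gates feeding one output, gates sit on a square grid, and distances are Manhattan distances in gate pitches. Some gate must then lie at least a minimum radius from the output gate, and because its path to the output uses fewer than `depth` wires, at least one wire spans radius / depth pitches or more. The helper names are hypothetical.

```python
def max_gates(depth, fanin=2):
    """Most gates in a circuit of the given depth with one output: a full tree."""
    return sum(fanin ** level for level in range(depth))


def min_radius(n_gates):
    """Smallest Manhattan radius (in gate pitches) whose diamond holds n_gates.

    A diamond of radius r contains 2*r*(r+1) + 1 grid positions.
    """
    r = 0
    while 2 * r * (r + 1) + 1 < n_gates:
        r += 1
    return r


def max_wire_lower_bound(depth, fanin=2):
    """Conservative lower bound on the longest wire, in gate pitches."""
    return min_radius(max_gates(depth, fanin)) / depth


for depth in (20, 24):
    print(depth, max_gates(depth), max_wire_lower_bound(depth))
```

Multiplying the result by the gate pitch (40F in the earlier questions) converts it to units of F.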


# "Closeness"

- Try placing "everything" close

| Manhattan Distance | Places | Fanin |
|---|---|---|
| 1 | 4 | 4 |
| 2 | 8 | 16 |
| 3 | 12 | 64 |
| n | 4n | 4<sup>n</sup> |

### Rent's Rule Caveats

- Modern "systems" on a chip are likely to contain subcomponents of varying Rent complexity
- Less I/O at certain "natural" boundaries
- System close
  - Does Rent's Rule apply to a workstation, PC, MP3 player, or smart phone?


# Area/Wire Length

- Bad news
  - Area ~ $\Omega(N^{2p})$
    - faster than N
  - Avg. Wire Length ~ $\Omega(N^{p-0.5})$
    - grows with N
- Can designers/CAD control p (locality) once they appreciate its effects?
- I.e., maybe this cost changes design style/criteria so we mitigate effects?
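To get a feel for the wire-length term, the sketch below evaluates $N^{p-0.5}$ with constant factors dropped and compares a design before and after a 4× increase in gate count. The gate counts and the values of p are arbitrary examples.

```python
def relative_wire_length(n, p):
    """Growth term N**(p - 0.5) for average wire length (constants dropped)."""
    return n ** (p - 0.5)


for p in (0.5, 0.6, 0.7):
    growth = relative_wire_length(4_000_000, p) / relative_wire_length(1_000_000, p)
    print(p, round(growth, 3))
```

At p = 0.5 the term does not grow at all, which is what makes lowering p through better locality attractive.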


### What Rent didn't tell us

- Bisection bandwidth purely geometrical
- No constraint for delay
  - I.e. a partition may leave critical path weaving between halves


# Critical Path and Bisection

- Minimum cut may cross critical path multiple times.
- Minimizing long wires in critical path → increase cut size.

# **Original Memo**

- Recent Issue (Winter 2010, v2n1) of IEEE Solid-State Circuits Magazine
- Retrospect on IBM 1401 and E. F. Rent
  - Including original memos
- Linked Supplemental Reading




FIGURE 5: Single- and double-width SMS cards from the IBM 1401 Processing Unit

### Admin

- HW7 due Monday
- Reading for Monday on web


# Big Ideas [MSB Ideas]

- Rent's rule characterizes locality
- Fixed wire layers:
  - → Area growth $\Omega(N^{2p})$
  - → Wire Length $\Omega(N^{p-0.5})$
- p>0.5 → interconnect growing faster than compute elements
  - expect interconnect to dominate other resources
