#### Integrated Circuit Design

Iain McNally

≈ 12 lectures

Koushik Maharatna

≈ 12 lectures


#### Integrated Circuit Design

Iain McNally

#### • Content

- Introduction
- Overview of Technologies
- Layout
- Design Rules and Abstraction
- Cell Design and Euler Paths
- System Design using Standard Cells
- Pass Transistor Circuits
- Storage
- PLAs
- Wider View

#### CMOS VLSI Design

A Circuits and Systems Perspective

Neil Weste & David Harris, Addison-Wesley 2004

#### • Notes & Resources

http://users.ecs.soton.ac.uk/bim/notes/icd


#### History

| Year | Milestone | Details |
|------|-----------|---------|
| 1947 | First Transistor | John Bardeen, Walter Brattain, and William Shockley (Bell Labs) |
| 1952 | Integrated Circuits Proposed | Geoffrey Dummer (Royal Radar Establishment) - prototype failed |
| 1958 | First Integrated Circuit | Jack Kilby (Texas Instruments) - co-inventor |
| 1959 | First Planar Integrated Circuit | Robert Noyce (Fairchild) - co-inventor |
| 1961 | First Commercial ICs | Simple logic functions from TI and Fairchild |
| 1965 | Moore's Law | Gordon Moore (Fairchild) observes the trends in integration |

History

Moore's Law

Predicts exponential growth in the number of components per chip.

#### 1965 - 1975 Doubling Every Year

In 1965 Gordon Moore observed that the number of components per chip had doubled every year since 1959 and predicted that the trend would continue through to 1975.

Moore describes his initial growth predictions as "ridiculously precise".

#### 1975 - 2010 Doubling Every Two Years

In 1975 Moore revised growth predictions to doubling every two years.

Growth would now depend only on process improvements rather than on more efficient packing of components.

In 2000 he predicted that the growth would continue at the same rate for another 10-15 years before slowing due to physical limits.
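
As a rough illustration of what the two growth rates imply, the sketch below compounds a doubling every year up to 1975 and a doubling every two years after that. The function name and the starting count of 64 components are assumptions chosen for illustration, not figures from these notes.

```python
def projected_components(year, base_count=64):
    """Project components per chip from a 1965 baseline.

    base_count is a hypothetical starting figure, not a value from the notes.
    """
    if year < 1965:
        raise ValueError("the model starts in 1965")
    yearly_doublings = min(year, 1975) - 1965      # 1965-1975: doubling every year
    biennial_doublings = max(year - 1975, 0) / 2   # after 1975: doubling every two years
    return base_count * 2 ** yearly_doublings * 2 ** biennial_doublings


# Growth factor over each period, independent of the assumed baseline.
print(projected_components(1975) / projected_components(1965))  # 1024.0
print(projected_components(2005) / projected_components(1975))  # 32768.0
```

Ten years of annual doubling gives a factor of about a thousand, while thirty years at the slower rate gives a factor of about thirty thousand.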


#### History

Moore's Law: a Self-fulfilling Prophecy

The whole industry uses the Moore's Law curve to plan new fabrication facilities.

- Slower - wasted investment. Must keep up with the Joneses<sup>2</sup>.
- Faster - too costly. Cost of capital equipment to build ICs doubles approximately every 4 years.

<sup>2</sup>or the Intels




<sup>1</sup>Intel was founded by Gordon Moore and Robert Noyce from Fairchild

Timeline figure: 1947 Point Contact Transistor; 1961 Fairchild Bipolar RTL RS Flip-Flop (4 Transistors).

- 1965 Moore's Law (Mk I): the number of transistors has doubled every year and will continue to do so until 1975.
- 1975 Moore's Law (Mk II): the number of transistors will double every two years.




#### Overview of Technologies




#### RTL Inverter and NOR gate



#### Overview of Technologies



- TTL gives faster switching than RTL at the expense of greater complexity<sup>2</sup>. The characteristic multi-emitter transistor reduces the overall component count.
- ECL is a very high speed, high power, non-saturating technology.

<sup>2</sup>Most TTL families are more complex than the basic version shown here

#### Overview of Technologies

#### NMOS - a VLSI technology.



- Circuit function determined by series/parallel combination of devices.
- Depletion transistor acts as non-linear load resistor.
  - Resistance increases as the enhancement device turns on, thus reducing power consumption.
- The low output voltage is determined by the size ratio of the devices.


#### Digital CMOS Circuits

#### Alternative representations for CMOS transistors



Various shorthands are used for simplifying CMOS circuit diagrams.

- In general substrate connections are not drawn where they connect to Vdd (PMOS) and Gnd (NMOS).
- All CMOS devices are enhancement mode.
- Transistors act as simple digitally controlled switches.


#### Overview of Technologies



- An active PMOS device complements the NMOS device giving:
  - rail to rail output swing.
  - negligible static power consumption.

#### Digital CMOS Circuits

#### Static CMOS complementary gates



• For any set of inputs there will exist either a path to Vdd or a path to Gnd.

#### Digital CMOS Circuits

#### Compound Gates



- All compound gates are inverting.
- Realisable functions are arbitrary AND/OR expressions with inverted output.
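
The relationship between the series/parallel pull-down network, its complementary pull-up network and the inverted output can be checked with a small switch-level sketch. The tuple representation, the function names and the example gate Z = NOT(A.B + C) are assumptions made for illustration.

```python
from itertools import product


def variables(net):
    """Collect the input names used in a series/parallel network."""
    if isinstance(net, str):
        return {net}
    return set().union(*(variables(part) for part in net[1:]))


def conducts(net, inputs, on_level):
    """True if the network conducts; a device is on when its gate equals on_level."""
    if isinstance(net, str):
        return inputs[net] == on_level
    kind, *parts = net
    states = [conducts(part, inputs, on_level) for part in parts]
    return all(states) if kind == "series" else any(states)


def dual(net):
    """Swap series and parallel to obtain the complementary network."""
    if isinstance(net, str):
        return net
    kind, *parts = net
    return ("parallel" if kind == "series" else "series", *map(dual, parts))


def truth_table(pull_down):
    pull_up = dual(pull_down)
    names = sorted(variables(pull_down))
    table = {}
    for values in product((0, 1), repeat=len(names)):
        inputs = dict(zip(names, values))
        to_gnd = conducts(pull_down, inputs, on_level=1)  # NMOS conducts on a 1
        to_vdd = conducts(pull_up, inputs, on_level=0)    # PMOS conducts on a 0
        if to_gnd == to_vdd:
            raise ValueError(f"not complementary for {inputs}")
        table[values] = int(to_vdd)
    return table


# Pull-down network for Z = NOT(A.B + C): (A in series with B) in parallel with C.
pull_down = ("parallel", ("series", "A", "B"), "C")
for (a, b, c), z in truth_table(pull_down).items():
    print(a, b, c, "->", z)
```

Because the pull-up network is built as the dual of the pull-down network, exactly one of the two conducts for every input combination, and the output is always the inverse of the AND/OR expression.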


#### Digital CMOS Circuits





- Bipolar Transistors with Resistors - MSI/LSI
  - RTL - NOR
  - TTL - NAND
  - ECL - OR/NOR
- MOS Transistors (no resistors) - VLSI
  - NMOS
  - CMOS - No static power!
  - Both allow construction of NOR, NAND & Compound gates (always inverting)

#### Components for Digital IC Design




#### Components for Digital IC Design

#### Diodes and Bipolar Transistors

NPN Transistor





• Two n-type implants.

#### Components for Digital IC Design

#### Simple NMOS Transistor

- Active Area mask defines extent of Thick Oxide.
- Polysilicon mask also controls extent of Thin Oxide (alias Gate Oxide).
- N-type implant has no extra mask.
  - It is blocked by thick oxide and by polysilicon.
  - The implant is *Self Aligned*.
- Substrate connection is to bottom of wafer.
  - All substrates to ground.
- Gate connection not above transistor area.
  - Design Rule.



#### CMOS Process




#### Components for Digital IC Design

#### NMOS Transistor

Where it is not suitable for substrate connections to be shared, a more complex process is used.

- Five masks must be used to define the transistor:
  - P Well
  - Active Area
  - Polysilicon
  - N+ implant
  - P+ implant
- P Well, for isolation.
- Top *substrate* connection.
- P+/N+ implants produce good *ohmic* contacts.

#### CMOS Process

#### CMOS Inverter

- The process described here is an *N Well process* since it has only an N Well. P Well and Twin Tub processes also exist.
- Note that the P-N junction between chip substrate and N Well will remain reverse biased. Thus the transistors remain isolated.

- N implant defines NMOS source/drain and PMOS substrate contact.
- P implant defines PMOS source/drain and NMOS substrate contact.


















Features may be determined by a number of masks e.g. NMOS source drain: ActiveArea AND NOT(NWell OR Poly OR PImplant)
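
The mask expression above maps directly onto set operations. In the sketch below each mask is modelled as a set of grid cells; the `rect` helper, the coordinates and the mask shapes are hypothetical and chosen only to make the Boolean derivation concrete.

```python
def rect(x0, y0, x1, y1):
    """Return the set of unit grid cells covered by a rectangle (hypothetical helper)."""
    return {(x, y) for x in range(x0, x1) for y in range(y0, y1)}


# Hypothetical masks for one NMOS device (bottom) and one PMOS device (top).
active_area = rect(0, 2, 10, 5) | rect(0, 9, 10, 12)
n_well = rect(0, 8, 10, 14)
p_implant = rect(0, 8, 10, 13)
poly = rect(4, 0, 6, 14)

# NMOS source/drain: ActiveArea AND NOT(NWell OR Poly OR PImplant)
nmos_source_drain = active_area - (n_well | poly | p_implant)

# The transistor channels are where polysilicon crosses active area.
channels = active_area & poly

print(len(nmos_source_drain), len(channels))  # 24 12
```

Only the lower active strip survives, split into two regions either side of the polysilicon, which is the self-aligned source and drain of the NMOS device.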

#### Photolithography

Figure: photolithography sequence using a contact/proximity mask (glass mask with chrome mask pattern, positive photoresist, SiO<sub>2</sub> layer):

1. Silicon Wafer
2. Oxide Growth
3. Photoresist Deposition
4. Photoresist Exposure
5. Photoresist Development
6. Oxide Etch
7. Photoresist Strip


#### Design Rules

To prevent chip failure, designs must conform to design rules:



• Multi-layer rules
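
A minimal sketch of how simple single-layer rules might be checked automatically, assuming rectangles given as `(x0, y0, x1, y1)` and two hypothetical rule values. Real design rule sets are process specific and considerably richer than this.

```python
from itertools import combinations
from math import hypot

MIN_WIDTH = 3    # hypothetical rule, in arbitrary grid units
MIN_SPACING = 3  # hypothetical rule, in arbitrary grid units


def width(rect):
    x0, y0, x1, y1 = rect
    return min(x1 - x0, y1 - y0)


def spacing(a, b):
    """Euclidean gap between two axis-aligned rectangles (0 if they touch or overlap)."""
    dx = max(b[0] - a[2], a[0] - b[2], 0)
    dy = max(b[1] - a[3], a[1] - b[3], 0)
    return hypot(dx, dy)


def check_layer(rects):
    errors = []
    for name, rect in rects.items():
        if width(rect) < MIN_WIDTH:
            errors.append(("width", name))
    for (name_a, a), (name_b, b) in combinations(rects.items(), 2):
        # Touching or overlapping shapes are treated as one merged shape.
        if 0 < spacing(a, b) < MIN_SPACING:
            errors.append(("spacing", name_a, name_b))
    return errors


metal = {"m1": (0, 0, 10, 3), "m2": (0, 5, 10, 8), "m3": (0, 12, 10, 14)}
print(check_layer(metal))  # [('width', 'm3'), ('spacing', 'm1', 'm2')]
```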




#### Mask Making



• Optical reduction allows narrower line widths.

#### Derivation of Design Rules



#### Abstraction - Stick Diagrams



Stick diagrams give us many of the benefits of abstraction:

- Much easier/faster than full mask specification.
- Process independent (valid for any CMOS process).
- Easy to change.

while avoiding some of the problems:

• Optimized layout may be generated much more easily from a stick diagram than from transistor or gate level designs.<sup>1</sup>

<sup>1</sup>note that all IC designs must end at the mask level.


#### Abstraction



- Mask Level Design
  - Laborious. Technology/Process dependent.
  - Design rules may change during a design!
- Transistor Level Design
  - Process independent, Technology dependent.
- Gate Level Design
  - Process/Technology independent.




#### Digital CMOS Design

#### Stick Diagrams

- Explore your Design Space.
  - Implications of crossovers.
  - Number of contacts.
  - Arrangement of devices and connections.
- Process independent layout.
- Easy to expand to a full layout for a particular process.

#### Sticks and CAD - Symbolic Capture

#### Sticks and CAD - Magic



- Log style design (sticks with width).
  - DRC errors are flagged immediately.
  - Again contacts are automatically selected as required.
- On-line DRC leads to rapid generation of correct designs.
- Symbolic capture style compaction is available if desired.



A logical approach to gate layout.

• All complementary gates may be designed using a single row of n-transistors above or below a single row of p-transistors, aligned at common gate connections.




#### Digital CMOS Design

#### Euler Path

- For the majority of these gates we can find an arrangement of transistors such that we can butt adjoining transistors.
  - Careful selection of transistor ordering.
  - Careful orientation of transistor source and drain.
- Referred to as line of diffusion.



#### Digital CMOS Design

#### Finding an Euler Path

Computer Algorithms

• It is relatively easy for a computer to consider all possible arrangements of transistors in search of a suitable Euler path. This is not so easy for the human designer.

#### One Human Algorithm

- Find a path which passes through all n-transistors exactly once.
- Express the path in terms of the gate connections.
- Is it possible to follow a similarly labelled path through the p-transistors?
  - Yes: you've succeeded.
  - No: try again (you may like to try a p path first this time).
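
The human algorithm above can also be sketched as a brute-force search, which is practical for the handful of transistors in a single gate. Each network is described as a dictionary mapping a gate label to the two circuit nodes its transistor connects. The node names and the example gate, Z = NOT(A.B + C), are assumptions made for illustration.

```python
from itertools import permutations


def is_trail(order, edges):
    """True if the transistors, visited in this order, form one continuous walk."""
    for start in edges[order[0]]:
        node = start
        for gate in order:
            a, b = edges[gate]
            if node == a:
                node = b
            elif node == b:
                node = a
            else:
                break          # this transistor does not touch the current node
        else:
            return True        # walked through every transistor without a break
    return False


def common_euler_paths(n_edges, p_edges):
    """Gate orderings that are Euler paths in both the n and the p network."""
    return [
        order
        for order in permutations(sorted(n_edges))
        if is_trail(order, n_edges) and is_trail(order, p_edges)
    ]


# Z = NOT(A.B + C): n-network has A in series with B, in parallel with C.
n_network = {"A": ("Z", "X"), "B": ("X", "GND"), "C": ("Z", "GND")}
# p-network is the dual: A in parallel with B, in series with C.
p_network = {"A": ("VDD", "Y"), "B": ("VDD", "Y"), "C": ("Y", "Z")}

for path in common_euler_paths(n_network, p_network):
    print(" ".join(path))
```

An empty result means that no single line of diffusion serves both networks with this gate ordering, so the search corresponds to the "try again" branch failing for every candidate.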


#### Digital CMOS Design



Here there are four possible Euler paths.



#### Digital CMOS Design



No possible path through n-transistors!

#### Digital CMOS Design



1. Find Euler path
2. Label poly columns
3. Route power nodes
4. Route output node
5. Route remaining nodes
6. Add taps<sup>1</sup> for PMOS and NMOS

A combined contact and tap,  $\blacksquare$ , may be used only where a power contact exists at the end of a line of diffusion. Where this is not the case a simple tap, -, should be used.

<sup>1</sup>1 tap is good for about 6 transistors – insufficient taps may leave a chip vulnerable to latch-up

#### Digital CMOS Design





No possible path through p-transistors. No re-arrangement will create a solution!
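
Whether any Euler path can exist in a single network follows from Euler's classical result: a connected graph has such a path only if zero or two of its nodes have an odd number of connections. The sketch below applies that test. The example networks and node names are hypothetical, and the graph is assumed to be connected.

```python
from collections import Counter


def odd_degree_nodes(edges):
    """Nodes touched by an odd number of transistors."""
    degree = Counter()
    for a, b in edges.values():
        degree[a] += 1
        degree[b] += 1
    return sorted(node for node, count in degree.items() if count % 2)


def may_have_euler_path(edges):
    """Necessary condition for a connected network: 0 or 2 odd-degree nodes."""
    return len(odd_degree_nodes(edges)) in (0, 2)


# Three transistors meeting at one node: four odd-degree nodes, so no path.
star = {"A": ("X", "N1"), "B": ("X", "N2"), "C": ("X", "N3")}
# Three transistors forming a loop: every node has even degree.
loop = {"A": ("Z", "X"), "B": ("X", "GND"), "C": ("Z", "GND")}

print(odd_degree_nodes(star), may_have_euler_path(star))
print(odd_degree_nodes(loop), may_have_euler_path(loop))
```

When the test fails the network needs more than one line of diffusion, whatever ordering of transistors is tried.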





Investigation of Euler paths leads to more efficient layout.


#### Digital CMOS Design

Multiple gates

- Gates should all be of same height.
  - Power and ground rails will then line up when butted.
- All gate inputs and outputs are available at top and bottom.
  - All routing is external to cells.
  - Preserves the benefits of hierarchy.
- Interconnect is via two conductor routing.
  - In this case Polysilicon vertically and Metal horizontally.

#### Digital CMOS Design



Most modern VLSI processes support two or more metal layers. The norm is to use only metal for inter-cell routing, usually Metal1 horizontally (and for power rails) and Metal2 vertically (and for cell inputs and outputs).

#### Standard Cell Design

Many ICs are designed using the standard cell method.

#### • Cell Library Creation

A cell library, containing commonly used logic gates, is created for a process. This is often carried out by or on behalf of the foundry.

#### • ASIC<sup>1</sup> Design

The ASIC designer must design a circuit using the logic gates available in the library.

The ASIC designer usually has no access to the full layout of the standard cells and doesn't create any new cells for the library.

Layout work performed by the ASIC designer is divided into two stages:

- Placement
- Routing

<sup>1</sup>Application Specific Integrated Circuit

#### Placement & Routing



Cells are placed in one or several equal length lines with inter-digitated power and ground rails.


#### Placement & Routing





In the routing channels between the cells we route metal1 horizontally and metal2 vertically.

#### Placement & Routing

#### Two conductor routing

- This logical approach means that we should never have to worry about signals crossing.
  - This makes life considerably easier for a computer (or even a human) to complete the routing.
- We must only ensure that two signals will not meet in the same horizontal or vertical channel.
- Computer algorithms can be used to ensure placement of cells such that wires are short.<sup>2</sup>
- Further computer algorithms can be used to optimize the routing itself.

<sup>2</sup>In VLSI circuits we often find that inter-cell wiring occupies more area than the cells themselves.
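
One way to keep two signals from meeting in the same horizontal channel is a greedy, left-edge style track assignment, sketched below. It is a classic textbook approach and is not claimed to be what any particular routing tool does. The net names and column spans are hypothetical, and vertical constraints between nets are ignored.

```python
def assign_tracks(nets):
    """Assign each net's horizontal segment to a track so that no two overlap.

    nets maps a net name to the (left, right) columns its segment spans.
    """
    track_ends = []   # rightmost column already occupied on each track
    assignment = {}
    for name, (left, right) in sorted(nets.items(), key=lambda item: item[1][0]):
        for track, last_right in enumerate(track_ends):
            if left > last_right:          # segment fits after the previous one
                track_ends[track] = right
                assignment[name] = track
                break
        else:
            track_ends.append(right)       # no existing track is free: open a new one
            assignment[name] = len(track_ends) - 1
    return assignment


nets = {"a": (0, 3), "b": (1, 5), "c": (4, 8), "d": (6, 9)}
print(assign_tracks(nets))  # {'a': 0, 'b': 1, 'c': 0, 'd': 1}
```

Four nets share two horizontal tracks here, and the number of tracks needed sets the height of the routing channel.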


#### Standard Cell Design

#### More Metal Layers

With three or more metal layers it is possible to take a different approach. The simplest example uses three metal layers.

- Standard Cells: use only metal1 except for I/O which is in metal2.
- Two Conductor Routing: uses metal2 and metal3.



#### Standard Cell Design



With this approach we can route safely over the cell to the specified pins leading to much smaller gaps between cell rows.


#### Standard Cell Design

Alternative Placement Style



By flipping every second row it may be possible to eliminate gaps between rows. N-wells are merged and power or ground rails are shared.

This approach is normally associated with sparse rows and non channel based routing algorithms.











#### Static CMOS Complementary Gates



#### • Static

After the appropriate propagation delay the output becomes valid and remains valid.<sup>1</sup>

• Complementary

For any set of inputs there will exist either a path to Vdd or a path to GND.

Where this condition is not met we have either a high impedance output or a conflict in which the strongest path succeeds. Static CMOS **Non-complementary** gates make use of these possibilities.

<sup>1</sup>cf. Dynamic logic which uses circuit capacitance to store state for a short time.


#### Pass Transistor Circuits

- Transmission Gate
  - For static circuits we would normally use a CMOS transmission gate:



- balanced n and p pass transistors
- faster pull-up
- slower pull-down


#### Pass Transistor Circuits

Pass Transistor

- Provides very compact circuits.
- Good transmission of logic '0'.
- Poor transmission of logic '1'.
  - slow rise time
  - degradation of logic value

The pass transistor is used in many dynamic CMOS circuits<sup>2</sup>.
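
The degradation of a logic '1' through an n-channel pass transistor can be illustrated with a very simple steady-state model, in which the output cannot rise above the gate voltage minus one threshold. The supply and threshold values are hypothetical, and the body effect and all timing are ignored.

```python
VDD = 3.3  # hypothetical supply voltage
VT = 0.7   # hypothetical threshold magnitude for both device types


def nmos_pass(v_in, v_gate=VDD):
    """Steady-state output of an NMOS pass transistor with its gate at v_gate."""
    return min(v_in, v_gate - VT)


def pmos_pass(v_in, v_gate=0.0):
    """Steady-state output of a PMOS pass transistor with its gate at v_gate."""
    return max(v_in, v_gate + VT)


print(nmos_pass(0.0), nmos_pass(VDD))  # a good '0', a degraded '1'
print(pmos_pass(0.0), pmos_pass(VDD))  # a degraded '0', a good '1'
```

Under this model each device type passes one logic level fully and the other only to within a threshold of the rail.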

#### Pass Transistor Circuits

• Transmission Gate Layout



Note that these circuits are not fully complementary<sup>3</sup>, hence they do not immediately lend themselves to a *line of diffusion* implementation.

<sup>2</sup>where pull-up is performed by an alternative method

<sup>3</sup>since there are sets of inputs for which the output is neither pulled low nor high

#### Pass Transistor Circuits

• Transmission Gate Multiplexor




#### Bus Distributed Multiplexing



Ideal for signals with many drivers from different modules.


#### Pass Transistor Circuits





- distributed multiplexing<sup>4</sup>
- only one inverter required per bank of transmission gates
- greatly simplifies global wiring
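
A behavioural sketch of a bus shared by several transmission gate or tristate drivers is given below. It reports the driven value, a floating bus or a conflict; the function name and the use of `"Z"` and `"X"` for those two states are conventions assumed here.

```python
def resolve_bus(drivers):
    """Resolve a bus from (enable, value) pairs, one pair per driver.

    Returns the driven value, "Z" if no driver is enabled (floating bus),
    or "X" if enabled drivers disagree (conflict).
    """
    driven = {value for enable, value in drivers if enable}
    if not driven:
        return "Z"
    if len(driven) > 1:
        return "X"
    return driven.pop()


print(resolve_bus([(0, 1), (1, 0), (0, 1)]))  # exactly one driver enabled
print(resolve_bus([(0, 1), (0, 0), (0, 1)]))  # nothing enabled: bus floats
print(resolve_bus([(1, 1), (1, 0), (0, 1)]))  # two drivers fight
```

In a well-behaved distributed multiplexor exactly one enable is active at any time, so only the first case should occur.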

#### Bus Distributed Multiplexing



- Separate circuit for each function
- Connected via distributed multiplexor

<sup>4</sup>internal chip bus should never be allowed to float high impedance

<sup>5</sup>Note that transmission gates have no drive capability in themselves. Here a good drive is ensured by providing buffers.

#### Bus Distributed Multiplexing



• Multiplexing is not distributed

• Multiplexor implementation may use transmission gates

#### Pass Transistor Circuits

• Tristate Inverter



- Alternatively the transmission gate may be incorporated into the gate.
  - one connection is removed - easier to lay out
  - also easier to simulate!

#### Pass Transistor Circuits

#### • Tristate Inverter



- Any gate may have a tri-state output by combining it with a transmission gate.

#### Pass Transistor Circuits

• Tristate Inverter Layout



#### Pass Transistor Circuits

• Tristate Inverter Bus Driver



- a tristate inverting buffer is often used to drive high capacitance bus signals
- transistors may be sized as required





#### Latches and Flip-Flops

• CMOS transmission gate latch



A simple transparent latch can be built around a transmission gate multiplexor.

- transparent when load is high
- latched when load is low
- two inverters are required since the transmission gate cannot drive itself

#### Latches and Flip-Flops

• Transmission gate latch layout



- a compact layout is possible using 2 layer metal

#### Latches and Flip-Flops

- A simpler layout may be achieved using tristate inverters.
- This design requires two additional transistors but may well be more compact.


#### Latches and Flip-Flops

• For use in simple synchronous circuits we use a pair of latches in a master slave configuration.



- this avoids the race condition in which a transparent latch drives a second transparent latch operating on the same clock phase.
- the circuit behaves as a rising edge triggered D type flip-flop.
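
The master slave behaviour can be sketched with two behavioural latches. The class names are hypothetical, and it is assumed that the master is transparent while the clock is low and the slave while it is high, which is what gives the rising edge behaviour described above.

```python
class Latch:
    """Transparent latch: follows d while load is high, holds when load is low."""

    def __init__(self):
        self.q = 0

    def update(self, d, load):
        if load:
            self.q = d
        return self.q


class MasterSlaveFlipFlop:
    """Two latches on opposite clock phases (assumed: master open when clk is low)."""

    def __init__(self):
        self.master = Latch()
        self.slave = Latch()

    def update(self, d, clk):
        m = self.master.update(d, load=not clk)
        return self.slave.update(m, load=clk)


ff = MasterSlaveFlipFlop()
stimulus = [(1, 0), (1, 1), (0, 1), (0, 0), (0, 1)]  # (d, clk) pairs
outputs = [ff.update(d, clk) for d, clk in stimulus]
print(outputs)  # [0, 1, 1, 1, 0]
```

The output changes only at the two steps where the clock goes from 0 to 1, and the change of `d` while the clock is high has no effect.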

#### Latches and Flip-Flops

• Transmission gate implementation



• Tristate inverter implementation



#### Latches and Flip-Flops

• Layout of master slave D type.



- very compact using alternative configuration.


#### Latches and Flip-Flops

• Alternative configuration







#### Latches and Flip-Flops

- simpler clock distribution

• For the same functionality we could use an edge triggered D type:



#### Register File

Where we have large amounts of storage the use of individual latches can lead to space saving.



• Load signals must be glitch free with tightly controlled timing.

• Edge Triggered D-type prevents a race condition ( $Reg1 \leftarrow Reg1 + Reg2$ ).





- Euler path analysis is applied creatively to these multi-gate cells - gates are often linked via the common gnd/pwr node.
- Final layouts will be more complex where clock buffers, reset circuitry and metal 2 i/o are included.

#### PLA structures

Programmable Logic Array structures provide a logical and compact method of implementing multiple SOP (Sum of Products) or POS expressions.



Most PLA structures employ pseudo-NMOS NOR gates using a P-channel device in place of the NMOS depletion load.
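
A sum of products can be built from NOR gates alone by feeding complemented literals into a first NOR plane and inverting the output of a second. The sketch below assumes that NOR-NOR arrangement with input and output inverters. The data layout and the half adder example are hypothetical.

```python
def nor(signals):
    return int(not any(signals))


def evaluate_pla(inputs, product_terms, outputs):
    """Evaluate a NOR-NOR PLA.

    product_terms: list of {input name: required value} dictionaries.
    outputs: {output name: list of product term indices to be summed}.
    """
    # First plane: a NOR row yields the product when fed each literal's complement.
    terms = [
        nor(inputs[name] ^ required for name, required in term.items())
        for term in product_terms
    ]
    # Second plane: NOR of the selected terms, then an output inverter gives the OR.
    return {
        name: 1 - nor(terms[index] for index in rows)
        for name, rows in outputs.items()
    }


# Half adder: sum = A.B' + A'.B, carry = A.B
product_terms = [{"A": 1, "B": 0}, {"A": 0, "B": 1}, {"A": 1, "B": 1}]
outputs = {"sum": [0, 1], "carry": [2]}

for a in (0, 1):
    for b in (0, 1):
        print(a, b, evaluate_pla({"A": a, "B": b}, product_terms, outputs))
```

Each product term is generated once and can be shared by several outputs, which is where the regular structure saves area.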

#### PLAs, ROMs and RAMs





- Unlike complementary CMOS circuits, these gates will dissipate power under static conditions (since the P device is always on).
- The P and N channel devices must be ratioed in order to create the required low output voltage.
- This ratioing results in a slower gate, although there is a trade-off between gate speed and static power dissipation.
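
The effect of ratioing can be illustrated by treating both devices as fixed resistors, which is a crude approximation since real transistors are non-linear. The resistance values and the supply voltage are hypothetical.

```python
def pseudo_nmos_low_output(r_pull_down, r_pull_up, vdd=3.3):
    """Low output voltage and static current, modelling both devices as resistors."""
    v_out_low = vdd * r_pull_down / (r_pull_down + r_pull_up)
    static_current = vdd / (r_pull_down + r_pull_up)
    return v_out_low, static_current


# A weaker (higher resistance) P device against a fixed 10 kohm pull-down path.
for r_pull_up in (20e3, 40e3, 80e3):
    v_low, i_static = pseudo_nmos_low_output(10e3, r_pull_up)
    print(f"{r_pull_up / 1e3:.0f} kohm load: V_OL = {v_low:.2f} V, "
          f"static current = {i_static * 1e6:.0f} uA")
```

A weaker load gives a lower output voltage and less static current, but it also charges the output more slowly, which is the speed versus power trade-off noted above.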

#### PLAs, ROMs and RAMs



• A regular layout is employed, with columns for inputs and outputs and rows for intermediate expressions.


#### PLAs, ROMs and RAMs



• Layout is simply a matter of selecting and placing rectangular cells from a limited set.

#### PLAs, ROMs and RAMs



• Conversion to *sticks* is straightforward with opportunities for further optimization.


#### PLAs, ROMs and RAMs

#### Static RAM

- Used for high density storage on a standard CMOS process.
- Short lived conflict during write - NMOS transistors offer stronger path.
- Differential amplifiers are used for speedy read.



Standard 6 transistor static RAM cell.


#### PLAs, ROMs and RAMs



#### SRAM Structure





Cells are designed to butt together in two dimensions leading to efficient layout

PLA layout efficiency will depend on the actual function implemented (e.g. number of common product terms)

#### System Design Choices

- Programmable Logic
  - PLD
    - e.g. PAL 22V10, ICT PEEL22CV10, Lattice ispGAL22V10
  - Field Programmable Gate Array (FPGA)
    - e.g. Xilinx XC4013, Altera Cyclone EP1C12
- Semi-Custom Design
  - Mask Programmable Gate Array
    - e.g. ECS CMOS Gate Array, Altera HardCopy II structured ASICs
  - Standard Cell Design
    - e.g. Alcatel Mietec MTC45000 0.35 µm cell library
- Full Custom Design

#### Programmable Logic



- One time use - Fuse programmable.
- Reprogrammable - UV/Electrically Erasable.


#### System Design Choices

- Programmable Logic (START HERE)
  - Best possible design turnaround time
  - Cheapest for prototyping
  - Best time to market
  - Minimum skill required
- Semi-Custom Design
- Full Custom Design
  - Cheapest for mass production
  - Fastest
  - Lowest Power
  - Highest Density<sup>1</sup>
  - Most skill required

#### Field Programmable Gate Array – Xilinx XC4000



- Configurable Logic Blocks & I/O Blocks<sup>2</sup>
- Programmable Interconnect

<sup>1</sup>optimization limited by speed/power/area trade off

<sup>2</sup>Xilinx XC4013 has 576 (24 × 24) CLBs and up to 192 (4 × 48) user I/O pins.

#### Field Programmable Gate Array – Altera Cyclone



- Logic Array Blocks, M4K Ram Blocks & I/O Elements<sup>3</sup>
- Programmable Interconnect

<sup>3</sup>Altera Cyclone EP1C12 has 12060 Logic Elements (arranged as 1206 Logic Array Blocks) and up to 249 user I/O pins.

#### Field Programmable Gate Array – Altera Cyclone LE



#### Field Programmable Gate Array - Xilinx XC4000 CLB



#### Mask Programmable Gate Array



#### Mask Programmable Gate Array



- Customize Metal and Contact Window masks only.

#### Standard Cell Design

• Logic Functions



- Auto Generated Macro Blocks
  - PLA
  - ROM
  - RAM
- System Level Blocks
  - Microprocessor core<sup>4</sup>

<sup>4</sup>Will support System On Chip applications.

#### Full Custom

All design styles need full custom designers

- to design the base programmable logic chips
- to design building blocks for semi-custom

Where large ASICs use full custom techniques they are likely to be used alongside semi-custom techniques.

e.g. Hand-held computer game chip

- Full custom bitslice datapath hand crafted for optimum area efficiency and low power consumption
- Standard cell controller
- Macro block RAM, ROM


A large ASIC (especially SoC) may mix Semi-Custom and Full Custom