Integrated Circuit
Intrinsic Reliability

Dennis Eaton
Agilent Technologies
dennis_eaton@agilent.com

IEEE Solid-State Circuits Society
February 16, 2005
Outline

• Why reliability is important
• Basic reliability and statistics concepts
• Silicon intrinsic wearout mechanisms
  – Hot Carrier Injection (HCI)
  – Gate Oxide Integrity (GOI)
  – Negative Bias Temperature Instability (NBTI)
  – Electromigration (EM)
  – Stress Migration (SM)
    • Description
    • Testing and specifications
    • Mitigating through circuit design practices
What is Reliability?

Quality is:
- Meeting customer expectations
- Conformance to specification

Reliability is:
- Quality over time
Why Reliability is Important

• Failures are costly—money and time
  – To customer
    • Downtime
    • Repair or replacement expense
    • May lose data
    • May affect many people (e.g. shut down an airport)
  – To producer
    • Warranty costs
    • Losing customers
    • Company reputation
    • Fixing problems
Customer Expectations

• Very few failures in the warranty period
  < 500 parts per million (ppm) failure rate in first year

• Very few failures during the useful life of the product
  < 50 failures per 1,000,000,000 (10^9) device hours (50 FITs)
  If there were 1 million parts in the field for 10 years, this would be 4,380 failures

• Failures due to wearout mechanisms only after useful life of product
  Useful life typically 5 years or 10 years
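The FIT arithmetic above can be checked with a few lines. This is a minimal sketch, assuming a constant hazard rate and a small failure fraction so that failures scale linearly with device hours; the function name is illustrative only.

```python
HOURS_PER_YEAR = 8760  # 365 days x 24 h; leap days ignored


def expected_failures(fit_rate, parts, years):
    """Expected failures for a constant failure rate given in FITs.

    1 FIT = 1 failure per 1e9 device hours.
    """
    device_hours = parts * years * HOURS_PER_YEAR
    return fit_rate * device_hours / 1e9


# 50 FITs, 1 million parts in the field for 10 years
print(expected_failures(50, 1_000_000, 10))  # 4380.0
```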
Statistical Parameters

Parameters of failure distributions

**Probability density function**, pdf: \( f(t) \)

**Cumulative distribution function**, CDF: \( F(t) = \int_0^t f(t')\,dt' \)

**Reliability** = \( 1-F(t) \)

**Hazard rate**, \( h(t) = f(t)/[1-F(t)] \)

The hazard rate is the pdf, \( f(t) \), divided by the fraction of parts still surviving, \( 1-F(t) \). \( h(t) \) is the instantaneous failure rate of a part, given that the part has lasted to time \( t \)

\( h(t) \) is constant in the middle of the **bathtub curve**

\( h(t) \) is decreasing during the early life portion

\( h(t) \) is increasing during the wearout portion
Failure Rate vs. Operating Time
Bathtub Curve

- **Early Life**
  - Mfg. Defects
  - $h(t)$ decreasing

- **Useful Life**
  - Random defects
  - $h(t)$ constant
  - device degradation occurs throughout operation
  - FIT rate is this failure level

- **Wearout**
  - Intrinsic mechanisms
  - $h(t)$ increasing
  - wearout occurs here
Failure Statistics

Devices on accelerated tests or in the field fail at different times, following a statistical distribution

- Most wearout data fit Lognormal or Weibull
  - Typically want to find the time at which 0.1% of the parts have failed, i.e. \( F(t) = 0.001 \)
    - called \( t_{0.1} \)

- Plot the failure data on appropriate scale to obtain a straight line
  - Determine lognormal sigma or Weibull beta
  - Find \( t_{0.1} \)
Normally Distributed Data Plotted on a Linear Scale

[arbitrary scale for f(t) and h(t)]

Linear plot of normally distributed data: h(t), f(t), F(t)
Mean = 100  sigma = 40
Z vs Time on a Linear Scale for Normally Distributed Data

Linear plot of the standard normal variable (Z) $\mu = 100, \sigma = 40$

$Z = (t - \mu)/\sigma$. Z represents the number of $\sigma$’s away from the mean. For example $Z = 1$ corresponds to $t=140$ ($\mu + 1\sigma$)
Normally Distributed Data Plotted on Normal Probability Paper

mean = 100, sigma = 40
[Figure: CDF on normal probability paper. x axis: time, t (linear scale); y axis: unreliability, F(t) (normal probability scale). The CDF is a straight line.]
Weibull Distribution

Flexible distribution that fits a variety of data

\[ f(t) = \frac{m}{c} \left(\frac{t}{c}\right)^{m-1} \exp\left[-\left(\frac{t}{c}\right)^m\right] \]

\[ F(t) = 1 - \exp\left[-\left(\frac{t}{c}\right)^m\right] \]

\[ h(t) = \frac{m}{c} \left(\frac{t}{c}\right)^{m-1} \]

- \( m \) = shape parameter or slope (also called \( \beta \))
- \( c \) = characteristic time (also called \( \eta \))
  - \( c \) is the time at which 63.2% of the sample has failed

if \( m < 1 \) the hazard rate is decreasing
if \( m > 1 \) the hazard rate is increasing (good for modeling wearout)
if \( m = 1 \) the hazard rate is constant (exponential distribution)
if \( m = 3.6 \), Weibull approximates the normal distribution
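As a quick numerical check of these definitions, the sketch below codes the standard Weibull pdf, CDF, and hazard rate (with the negative sign inside the exponential) and confirms that 63.2% of parts have failed at t = c. The function names are illustrative; m = 2.5 and c = 100 are the values plotted on the next slide.

```python
import math


def weibull_pdf(t, m, c):
    """f(t) = (m/c) (t/c)^(m-1) exp[-(t/c)^m]"""
    return (m / c) * (t / c) ** (m - 1) * math.exp(-((t / c) ** m))


def weibull_cdf(t, m, c):
    """F(t) = 1 - exp[-(t/c)^m]"""
    return 1.0 - math.exp(-((t / c) ** m))


def weibull_hazard(t, m, c):
    """h(t) = f(t) / [1 - F(t)] = (m/c) (t/c)^(m-1)"""
    return (m / c) * (t / c) ** (m - 1)


m, c = 2.5, 100.0
print(round(weibull_cdf(c, m, c), 3))  # 0.632, the fraction failed at t = c
# m > 1, so the hazard rate grows with time (wearout behaviour)
print(weibull_hazard(50.0, m, c) < weibull_hazard(150.0, m, c))  # True
```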
Weibull Distributed Data Plotted on a Linear Scale

[f(t) and h(t) not to scale]

Linear-linear plot of Weibull distributed data  h(t), f(t), F(t)

\[ m = 2.5 \quad c = 100 \]
Weibull distributed data plotted on Weibull probability paper \( c = 100, m = 2.5 \)

Natural-log scale in \( x \); Weibull probability scale in \( y \). The CDF is a straight line
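The straight-line property of Weibull probability paper comes from ln(-ln(1-F)) = m ln(t) - m ln(c), so a line fit gives the slope m and the characteristic time c. The sketch below is one way to do this numerically. It assumes Benard's median-rank approximation for the plotting positions, which is a common choice but is not stated in the slides, and the failure times are synthetic, generated from m = 2.5 and c = 100.

```python
import math


def median_rank(i, n):
    """Benard's approximation for the i-th of n ordered failures (assumption)."""
    return (i - 0.3) / (n + 0.4)


def weibull_quantile(F, m, c):
    """Time at which a fraction F has failed."""
    return c * (-math.log(1.0 - F)) ** (1.0 / m)


def fit_weibull(times):
    """Least-squares line through (ln t, ln(-ln(1-F))); returns (m, c)."""
    ts = sorted(times)
    n = len(ts)
    xs = [math.log(t) for t in ts]
    ys = [math.log(-math.log(1.0 - median_rank(i, n))) for i in range(1, n + 1)]
    xbar, ybar = sum(xs) / n, sum(ys) / n
    sxy = sum((x - xbar) * (y - ybar) for x, y in zip(xs, ys))
    sxx = sum((x - xbar) ** 2 for x in xs)
    m = sxy / sxx                      # slope = shape parameter
    intercept = ybar - m * xbar        # intercept = -m ln(c)
    return m, math.exp(-intercept / m)


# Synthetic failure times placed exactly on a Weibull line (not measured data)
n_parts = 20
times = [weibull_quantile(median_rank(i, n_parts), 2.5, 100.0)
         for i in range(1, n_parts + 1)]
m_fit, c_fit = fit_weibull(times)
print(m_fit, c_fit, weibull_quantile(0.001, m_fit, c_fit))  # m, c, t_0.1
```

With real test data the points scatter about the line, and the sample is usually far too small to observe 0.1% failures directly, which is why t0.1 is read off the fitted line.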
Wearout Mechanism Accelerated Testing

- Uses specialized test devices
- Many test devices are put in accelerated tests under specific stress conditions
  - Increased temperature
  - Increased voltage
  - Increased current
- **Accelerated testing compresses time**
  - 100 hours at accelerated conditions may be equivalent to 10,000 hours at use conditions
Acceleration Models

• Many wearout mechanisms have a temperature dependence of

\[
\text{Time to failure (TTF)} \sim e^{(E_a/kT)}
\]

\[
E_a = \text{activation energy (eV)}
\]

\[
k = \text{Boltzmann’s constant}
\]

\[
T = \text{absolute temperature}
\]

called the Arrhenius equation

(\( e \) is the base of the natural logarithm, approximately 2.718)
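Taking the ratio of the Arrhenius expression at use and stress temperatures gives a temperature acceleration factor, AF = exp[(Ea/k)(1/T_use - 1/T_stress)]. A minimal sketch follows; the activation energy and temperatures in the example call are hypothetical values chosen only to exercise the function.

```python
import math

K_BOLTZMANN_EV = 8.62e-5  # Boltzmann's constant in eV/K


def arrhenius_af(ea_ev, t_use_c, t_stress_c):
    """Ratio TTF(use) / TTF(stress) for TTF ~ exp(Ea / kT).

    Temperatures are given in degrees C and converted to kelvin.
    """
    t_use_k = t_use_c + 273.15
    t_stress_k = t_stress_c + 273.15
    return math.exp((ea_ev / K_BOLTZMANN_EV) * (1.0 / t_use_k - 1.0 / t_stress_k))


# Hypothetical example: Ea = 0.7 eV, 55 C use, 125 C stress
af = arrhenius_af(0.7, 55.0, 125.0)
print(f"acceleration factor: {af:.0f}")
print(f"100 stress hours correspond to about {100 * af:.0f} use hours")
```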
Intrinsic Silicon Wearout Mechanisms

- Hot Carrier Injection (HCI)
- Gate oxide integrity (GOI)
- Negative Bias Temperature Instability (NBTI)
- Electromigration (EM)
- Stress Migration (SM)
What is Hot Carrier Injection - 1

“Hot” means the carriers (electrons and holes) have high energy –
Electrons have high energy by the time they reach the drain
• Impact ionization can cause both electrons and holes to go into the gate oxide and be captured by traps in the oxide

from A. Sabnis, IRPS Tutorial, 1986
What is Hot Carrier Injection - 2

• Electrons (and holes) have enough energy to be injected into traps in the gate oxide (energetic = hot)

• Trapped electrons lower the gate oxide field, which
  – Raises threshold voltage (Vt)
  – Lowers transconductance (Gm)
  – Lowers drain current in the FET linear region (Idlin) and saturation region (Idsat)

Result is: Drive current of the FET is lowered
Effect of HCI Degradation on FETs

Both the linear and saturation current decrease

The NMOS FET degrades much more than the PMOS FET for the same amount of stressing

From J.E. Chung and P. Fang, IRPS Tutorial, 1996
The $V_t$ increases vs time on a log-log scale. The same relationship is observed for decrease in $I_{dlin}$ and $I_{dsat}$.

From Rudi Bellens, IRPS Tutorial, 1998
Characteristics of HCI Degradation

HCI degradation affects n-channel FETs (NMOS) much more than p-channel FETs (PMOS)
  – Affects high voltage I/O FETs more than core (low voltage) FETs

• HCI is a short channel effect
  – More pronounced as FET channel lengths decrease

• Hot carrier injection is very dependent on the details of silicon processing and device design
  – Source-drain implant profile
  – Gate oxide trap density
Effects of HCI on Circuits

• **Lowers the maximum operating frequency of digital circuits**
  – Mainly caused by decrease in \( I_{dsat} \)

• **Changes the characteristics of analog circuits**
  – Mainly caused by decrease in \( I_{dlin} \)
  – May affect “matched pairs” of FETs
Accelerators of HCI Degradation

- **Vds**
  - Roughly an exponential dependence on 1/Vds
    \[ \text{TTF} \sim \exp\left(\frac{A}{V_{ds}}\right) \]
    \[ \text{TTF} \sim (I_{sub})^{-m} \]
- **NMOS FET degrades fastest when $V_{gs} \approx \frac{1}{2} V_{ds}$**
  - Substrate current is maximized for this condition
  - HCI measurements typically made using this worst-case stressing condition
  - This condition occurs when the FET is switching
    - Switching rate affects HCI degradation
- **HCI has practically no temperature dependence**
Substrate Current Dependence on $V_{gs}$ & $V_{ds}$

Max. $I_{sub}$ occurs at $V_{gs} \approx \frac{1}{2} V_{ds}$; $(I_{sub})_{\text{max}}$ highly dependent on $V_{ds}$

From T. Thurgate and N. Chan, Trans Electron Dev. 1985, p. 402
HCI Dependence on Substrate Current

Simplified model for lifetime prediction

\[ \text{TTF} \sim (I_{\text{sub}})^{-m} \]

\[ \Delta X/X = 10\% \]

From Rudi Bellens, IRPS Tutorial, 1998

Dennis Eaton (c) IEEE SSC Society Feb. 16, 2005
HCI Dependence on Vds

Simplified model for lifetime prediction

\[ \text{TTF} \sim \exp\left(\frac{A}{V_{ds}}\right) \]

Only valid for conditions of max. \( I_{\text{sub}} \)

From Rudi Bellens, IRPS Tutorial, 1998
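To illustrate how the simplified Vds model is used for lifetime prediction, the sketch below solves TTF ~ exp(A/Vds) for A from two stress measurements and extrapolates to a lower use voltage. The stress voltages and lifetimes are hypothetical, not taken from the slides, and the result is a worst-case DC lifetime because the model only holds at maximum Isub.

```python
import math


def fit_hci_constant(v1, ttf1, v2, ttf2):
    """Solve TTF ~ exp(A / Vds) for A from two (Vds, TTF) stress points."""
    return math.log(ttf1 / ttf2) / (1.0 / v1 - 1.0 / v2)


def extrapolate_hci_ttf(v_use, v_ref, ttf_ref, a):
    """Scale the lifetime measured at v_ref to the use voltage v_use."""
    return ttf_ref * math.exp(a * (1.0 / v_use - 1.0 / v_ref))


# Hypothetical stress results: 100 h at Vds = 3.0 V, 10 h at Vds = 3.6 V
a = fit_hci_constant(3.0, 100.0, 3.6, 10.0)
ttf_use = extrapolate_hci_ttf(2.5, 3.0, 100.0, a)
print(f"A = {a:.1f} V, worst-case DC lifetime at 2.5 V = {ttf_use:.0f} h")
```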

How Foundries Specify HCI degradation

• **Lifetime for a certain % decrease in Idsat**
  – % decrease in Idsat, often 10%
    • DC measurement made at high Vds and worst-case HCI conditions
  – Lifetime of 10 years for AC operation
    • Typically a factor of about 50 improvement going from worst case DC to typical AC

• **Some foundries specify lifetime for a certain % decrease in Idlin**
  – More sensitive measure of HCI than Idsat
  – Idlin can degrade more than twice as fast as Idsat

• **Foundry specs seem to be getting less stringent as channel lengths get shorter**
Characteristics of HCI Data

• **Considerable lot-to-lot variation**
  – Thus some production wafer lots will have considerably less margin than those measured

• **NMOS I/O FETs may be close to the spec limit**
  – Must be particularly careful when designing with minimum length I/O NMOS FETs
How a Designer Can Handle HCI

• Know how much Idsat or Idlin degradation to expect
  – From IC fab HCI specification and data

• Use SPICE parameters in circuit simulation for degraded FET
  – Take into account duty cycle of FETs with high Vds
  – Take into account switching frequency

• If necessary, use longer channel length for FETs (particularly NMOS I/O) when they are subject to high voltage or switching frequency
Intrinsic Silicon Wearout Mechanisms

• Hot Carrier Injection (HCI)

• Gate oxide integrity (GOI)

• Negative Bias Temperature Instability (NBTI)

• Electromigration (EM)

• Stress Migration (SM)
Gate Oxide Wearout Summary

• Can be intrinsic or defect-related
  – Defective gate oxide can fail during the useful life of the part
  – The fab process is designed so that intrinsic gate oxide wearout is not significant in the first 10 years of life

• Even a non-defective gate oxide wears out in time
  – The wearout mechanisms are understood and modeled
  – Voltage and temperature accelerate the wearout

• The failure mode is increased current through the gate oxide leading to a gate short
What causes Gate Oxide Breakdown?

• Gate oxide is the thinnest film in the wafer
  – Composed of SiO₂ or silicon oxy-nitride
  – 16-25Å (1.6-2.5nm) in state-of-the-art silicon technologies

• High field across the gate oxide
  – 5-7MV/cm for current technologies (0.13μm and 90nm)
    • Over one-half the intrinsic breakdown field of the oxide

• GOX field causes physical changes in oxide film
  – Electrons and holes injected into traps in the oxide
  – Si—O bonds broken
  – Leakage occurs across the gate oxide
  – Eventually enough bonds are damaged that a direct short occurs across the oxide
Percolation Model of Gate Oxide Breakdown

- In the percolation model, traps (spheres) are generated randomly through the volume of the dielectric.

- If the spheres of two neighboring traps overlap, conduction is possible.

- Breakdown occurs when a conducting path is created from one interface to another.

- The parameters used to fit experimental data are the trap radius and the fraction of defects effective in initiating breakdown.


From J.S. Suehle and E.M. Vogel, IRPS Tutorial 2000

Effect of Gate Oxide Breakdown on Circuits

- Breakdown begins with increased leakage current across gate oxide
  - Tunneling current already significant in thin gate oxides
  - Increases gate leakage current of the die
  - Can change potential across gate oxide
    - Reduces current drive of the FET
    - May cause the circuit to operate more slowly

- Eventually a short forms between gate and substrate
  - FET ceases to function
  - Typically causes circuit failure
E Model for Gate Oxide Breakdown

Predicts that \( \ln(\text{TTF}) \) is proportional to electric field

\[
\ln(\text{Time to Failure}) \sim \frac{Q_1}{kT} + (-\gamma_E E)
\]

or

\[
\text{TTF} \sim \exp\left(\frac{Q_1}{kT}\right) \exp\left(-\gamma_E E\right)
\]

where
- \( \text{TTF} = \) time to fail
- \( Q_1 = \) activation energy required for bond breakage (also called \( E_a \))
- \( k = \) Boltzmann’s constant \((8.62 \times 10^{-5} \text{ eV/K})\)
- \( T = \) absolute temperature (K)
- \( \gamma_E = \) field acceleration parameter (cm/MV). \( \gamma_E \) has a temperature dependence, which is often ignored when extrapolating TDDB data
- \( E = \) electric field across gate oxide
1/E Model for Gate Oxide Breakdown

• The 1/E model predicts that ln(TTF) is proportional to inverse electric field

\[ \ln(\text{TTF}) \sim \frac{Q_2}{kT} + G \frac{1}{E} \]

or \( \text{TTF} \sim \exp[Q_2/kT] \exp[G/E] \)

where \( Q_2 = \) the activation energy for current-induced hole injection and capture in the gate oxide

\( G = \) electric field acceleration parameter (\( G \) also has a temperature dependence, which is usually ignored)

The other parameters have the same meanings as defined for the E model
Comparison of E and 1/E models

Study showing that the E model fits the data better at low electric fields. Both the 1/E and E models fit at high fields

J. McPherson et al., IEDM, 1998, p. 171
Factors Affecting Gate Oxide Breakdown

• **Temperature**
  – Time to failure $\sim \exp\{Q_1/[k*(\text{absolute temperature})]\}$
  – Lifetime decreases exponentially with increasing temp.

• **Electric field across oxide**
  – Time to failure $\sim \exp(-\gamma_E \cdot E)$
  – Lifetime decreases exponentially with increasing field (e.g. $V_g$)

• **Total gate oxide area on chip**
  – Time to failure $\sim [\text{gate oxide area}]^{-1/\beta}$ ($\beta=$Weibull shape parameter of gate oxide failure distribution), so lifetime decreases as the total area increases
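The three factors above can be combined into a short calculation. This is a sketch under the E model only, treating the field acceleration parameter as temperature independent, and using the standard Weibull area scaling in which lifetime falls as area to the power -1/β. All numeric inputs in the example are hypothetical.

```python
import math

K_EV = 8.62e-5  # Boltzmann's constant in eV/K


def e_model_af(gamma_cm_per_mv, e_stress, e_use, q1_ev, t_stress_c, t_use_c):
    """TTF(use) / TTF(stress) for TTF ~ exp(Q1/kT) exp(-gamma_E * E).

    Fields are in MV/cm, temperatures in degrees C.
    """
    field_term = math.exp(gamma_cm_per_mv * (e_stress - e_use))
    thermal_term = math.exp(
        (q1_ev / K_EV) * (1.0 / (t_use_c + 273.15) - 1.0 / (t_stress_c + 273.15))
    )
    return field_term * thermal_term


def scale_ttf_for_area(ttf_test, area_test, area_chip, beta):
    """Weibull area scaling: TTF ~ (area)^(-1/beta)."""
    return ttf_test * (area_chip / area_test) ** (-1.0 / beta)


# Hypothetical inputs: gamma_E = 3 cm/MV, Q1 = 0.6 eV,
# stress at 9 MV/cm and 150 C, use at 6 MV/cm and 100 C
af = e_model_af(3.0, 9.0, 6.0, 0.6, 150.0, 100.0)
ttf_use_test_area = 10.0 * af                      # 10 h measured on the test structure
ttf_use_chip = scale_ttf_for_area(ttf_use_test_area, 1.0, 1000.0, 1.5)
print(af, ttf_use_chip)
```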

Accelerated Tests for GOI

- **Time Dependent Dielectric Breakdown (TDDB) test**
  - Most thorough test (get TDDB data if available)
  - A constant voltage, constant temperature test
  - Done at a variety of temps. and voltages on packaged parts
  - A very lengthy test
  - Allows extraction of acceleration model parameters ($Q_1$, $\gamma_E$)

- **Voltage Ramp test (V-Ramp test)**
  - Fast, wafer level test
  - Done on a large sample size
  - Voltage across oxide is increased in steps until oxide failure
  - Typical test for process monitoring

- **Charge to Breakdown ($Q_{bd}$), also called J-Ramp test**
  - Fast, wafer level test
  - Current through the oxide is increased until oxide failure
TDDB Failure Data at Different Temperatures

Example of TDDB raw data taken at different temperatures but at the same electric field. (lognormal failure distribution)

Determination of Thermal Activation Energy

$\ln(t_{50})$ is plotted vs $1/T$ in order to determine the thermal activation energy ($Q_1$)

From A. Yassine et al., Electron Device Lett. 20, No. 8, 1999, p. 390
TDDB Failure Data at Different Electric Fields

Example of TDDB raw data taken at different electric fields but at the same temperature. (lognormal failure distribution)
Determination of Field Acceleration Constant

$\ln(t_{50})$ is plotted vs $E$ in order to determine the field acceleration parameter, $\gamma_E$.

From A. Yassine et al., Electron Device Lett. 20, No. 8, 1999, p. 390
Designing to Avoid Gate Oxide Wearout

- Respect the maximum voltage allowed across the gate oxide

- If you must exceed the maximum voltage
  - Obtain TDDB data from the silicon fab facility
  - Calculate the allowable duty cycle for the increased voltage
    - Take into account the total gate oxide area operated at the increased voltage
Recent News on Gate Oxide

• For 90nm, the E field across oxide is very high
  – $1/E$ model sometimes used to extrapolate TDDB data to use conditions (request supporting data)
  – A power law model is also being used:
    \[ \text{lifetime} \sim (\text{voltage})^{-r} \]
  – Weibull distribution used to analyze TTF data

• It is now becoming presumed that soft breakdown will occur during the useful lifetime
  – Transistor and inverter models being developed that take into account gate oxide leakage current
  – BSIM4 presently includes normal gate current

Intrinsic Silicon Wearout Mechanisms

• Hot Carrier Injection (HCI)
• Gate oxide integrity (GOI)
• **Negative Bias Temperature Instability (NBTI)**
• Electromigration (EM)
• Stress Migration (SM)
Negative Bias Temperature Instability Summary

• **Vt shifts to higher absolute value over time**
  – Affects p-channel FETs much more than n-channel FETs

• **Occurs under high absolute gate bias**
  – FET is in the ‘on’ state

• **Causes degradation in FET parameters**
  – $|V_t|$ becomes greater
  – Transconductance ($G_m$) becomes lower
  – $I_d$ for a given $V_{ds}$ and $V_{g}$ becomes lower

• **Accelerated by voltage and temperature**
What Causes NBTI

Oxide Charges/Interface Traps

<table>
<thead>
<tr>
<th>No.</th>
<th>Charge</th>
<th>Type</th>
<th>Location</th>
<th>Cause</th>
</tr>
</thead>
<tbody>
<tr>
<td>(1)</td>
<td>$D_{it}$ (cm⁻² eV⁻¹)</td>
<td>Interface Trapped Charge</td>
<td>SiO₂/Si Interface</td>
<td>Dangling Bonds</td>
</tr>
<tr>
<td>(2)</td>
<td>$N_f, Q_f$ (cm⁻², C/cm²)</td>
<td>Fixed Charge</td>
<td>Close to SiO₂/Si Interface</td>
<td>Si⁺ (?)</td>
</tr>
<tr>
<td>(3)</td>
<td>$N_{ot}, Q_{ot}$</td>
<td>Oxide Trapped Charge</td>
<td>In SiO₂</td>
<td>Trapped Electrons &amp; Holes</td>
</tr>
<tr>
<td>(4)</td>
<td>$N_m, Q_m$</td>
<td>Mobile Charge</td>
<td>In SiO₂</td>
<td>Na, K, Li</td>
</tr>
</tbody>
</table>

It is complicated, but is basically fixed charge and traps in the oxide and at the silicon interface, caused by a variety of defects and mechanisms.

Figure from D.K. Schroder, IEEE International Rel. Physics Symposium (IRPS) 2004, Tutorial
NBTI effect on FETs over time

\[ \Delta V_t \sim (\text{time})^p \]

where $p$ is the power law dependence. $p$ typically $<1$

S. Tsujikawa et al., IEEE International Rel. Physics Symposium (IRPS), 2003, p. 183
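Because the Vt shift follows a power law in time, a single measured point can be extrapolated to the time at which a failure criterion is reached: t_crit = t_meas · (ΔVt_crit / ΔVt_meas)^(1/p). The sketch below assumes the exponent p is known and that stress conditions stay fixed; the numbers are hypothetical.

```python
def time_to_vt_shift(dvt_meas_mv, t_meas_h, dvt_crit_mv, p):
    """Time to reach dvt_crit_mv for a shift following delta-Vt ~ time^p."""
    return t_meas_h * (dvt_crit_mv / dvt_meas_mv) ** (1.0 / p)


# Hypothetical: 10 mV shift after 1000 h, criterion 20 mV, p = 0.25
print(time_to_vt_shift(10.0, 1000.0, 20.0, 0.25))  # 16000.0
```

With p well below 1, doubling the allowed shift extends the time by much more than a factor of two, so the result is sensitive to the value of p used.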
Higher (negative) gate voltage (i.e. gate electric field) accelerates the \( V_t \) shift. Dependence is exponential in this figure.

\[
\text{Lifetime} \sim \exp[-\gamma |V_g|]
\]

D. Schroder, 2004 IRPS tutorial (from N. Kimizuka et al., IEEE VLSI Symp., 2000, p. 92) (However, S. Tsujikawa et al. (IRPS), 2003 show data indicating lifetime \( \sim V^b \))
Vt shift is accelerated by higher temperature (Arrhenius model)

Lifetime $\sim\exp[Ea/kT]$ (lifetime decreases as temperature increases)

$Ea \sim 0.1 \text{ to } > 1 \text{ eV}$  Depends on oxide details. Fluorine increases $Ea$

S. Tsujikawa et al. (IRPS), 2003, p. 183  (see also C.H. Liu et al., IEDM 2001)
Effect of Gate Oxide Thickness

$\Delta V_t \sim -C*(\text{gate oxide thickness})$

SNBTI = static NBTI;  DNBTI = dynamic NBTI (discussed later)

G. Chen et al., IEEE International Rel. Physics Symposium (IRPS) 2003, p. 196
Effects of NBTI on Devices and Circuits

• P-channel drain current decreases over time
  – Causes the circuit to run slower
    • Increases delay times
    • Circuit may no longer operate at specified frequency

• Can cause analog FETs to become mismatched
  – Changes behavior of differential circuits
Why NBTI Has Become a Concern

- **NBTI is present at device operating conditions**
  - Significant degradation can occur at 100°C, 6MV/cm
    - Typical operating conditions for 90nm technology
- **Present NBTI reliability tests not very accelerated**
  - Often 150°C, 1.1 x nominal Vdd
- **Effect greater as oxide thickness decreases**
- **Not a strong decrease in NBTI at longer channel lengths**
  - Can’t necessarily make it go away like HCI by making channel longer
NBTI Under Dynamic Circuit Conditions

Typical inverter undergoes dynamic rather than static stressing.

**Lifetime is an order of magnitude better under dynamic stress**

G. Chen et al., IEEE International Rel. Physics Symposium (IRPS) 2003, p. 196
How foundries specify NBTI degradation

- Data from foundry typically 1 data point:
  - $\Delta V_t$ given as a percentage change or millivolt change in $V_t$
    - $T$ is specified (typically 150$^\circ$C)
    - Voltage is specified
    - Time is specified
What a Designer Can Do

• Determine the amount of degradation which would occur during your product’s lifetime and design accordingly
  – Calculate the Vt of FETs at end of life; derate library performance on SPICE models at the slow corner

• Note that AC operation is an ally
  – Determine the amount of degradation under AC conditions so that you don’t have to over design
Intrinsic Silicon Wearout Mechanisms

- Hot Carrier Injection (HCI)
- Gate oxide integrity (GOI)
- Negative Bias Temperature Instability (NBTI)
- Electromigration (EM)
- Stress Migration (SM)
What is Electromigration?

Movement of metal atoms under the influence of electron flow (current) and temperature

- Occurs in metallization lines, contacts, and vias (both Al and Cu)
  - Driving force is “electron wind”—momentum transfer between electrons and metal atoms
- Accelerated with temperature and current density
- Can cause
  - increased resistance (due to void formation)
  - open circuit (due to void formation)
  - shorts between adjacent lines (due to extrusion)
- Manufacturing defects can cause premature EM failure
Electromigration Open Failures

Examples of opens in aluminum metal lines caused by electromigration

From J. McPherson, IRPS Tutorial, 1989
Electromigration Extrusion Failure

Shorted aluminum lines caused by extrusion of metal which bridges two adjacent lines

From J. McPherson, IRPS Tutorial, 1989
Model for Electromigration

The most commonly used model is Black’s model (first proposed by James Black)

It fits most EM data quite well, both Al and Cu

Time to failure = $TTF \sim J^{-n} \exp(Ea/kT)$

where

- $J =$ current density (A/cm$^2$)
- $n =$ experimentally derived current density exponent
- $Ea =$ experimentally derived activation energy (eV)
- $k =$ Boltzmann’s constant ($8.62 \times 10^{-5}$ eV/K)
- $T =$ absolute temperature (K)
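Black's model gives the acceleration factor between stress and use conditions as the product of a current-density term and a thermal term. A minimal sketch, with hypothetical values for n, Ea, and the operating points in the example call:

```python
import math

K_EV = 8.62e-5  # Boltzmann's constant in eV/K


def black_af(j_stress, j_use, n, ea_ev, t_stress_c, t_use_c):
    """TTF(use) / TTF(stress) for TTF ~ J^-n exp(Ea / kT).

    Current densities share any one unit; temperatures are in degrees C.
    """
    current_term = (j_stress / j_use) ** n
    thermal_term = math.exp(
        (ea_ev / K_EV) * (1.0 / (t_use_c + 273.15) - 1.0 / (t_stress_c + 273.15))
    )
    return current_term * thermal_term


# Hypothetical: stress at 2e6 A/cm^2 and 250 C, use at 0.5e6 A/cm^2 and 105 C
print(black_af(2e6, 0.5e6, 2.0, 0.8, 250.0, 105.0))
```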
Testing for Electromigration

- Test structure for metal lines is a long metal line contacted at each end.

- Test structure for vias and contacts is a string of vias (contacts) between adjacent metal levels.
  - The metal lines connecting the vias are wide enough that the vias fail, not the metal lines.

- Test structure considered to fail if resistance increases 10% or 20%.
Electromigration Test Details

• Test done on packaged parts
  – Ceramic packages to withstand high temperature
  – Wire bonded with Al wire on Al pads to avoid Al/Au intermetallics

• Groups of parts are stressed at different J’s and T’s
  – Constant temperature for each group (usually 180-350°C)
  – Current densities (J) typically $1 \times 10^6$ – $5 \times 10^6 \ A/cm^2$ (DC)
    • J determined from cross-sectional area of line or via
  – Constant monitoring of device resistance
  – Duration of up to several thousand hours

• Test is performed on contacts and representative metal levels and via levels
Electromigration Data

• A failure in an EM test is defined to be a certain percentage increase in resistance of the test structure
  – 10% and 20% resistance increase are typical failure criteria

• Electromigration failures are found to follow a lognormal distribution
  – Time to failure is plotted on the x axis on a log scale and normal probability is on the y axis
  – For lognormally distributed data, this plot is a straight line
  – From the best line through the data, the median time to failure (t50) and the sigma are determined.
    • The sigma obtained is the lognormal sigma
Example of EM Failure Data

EM failures follow a lognormal distribution. Plot of normal probability vs. log(time) is a straight line.
A. Oates, IRPS tutorial, 1994
Extracting n from EM TTF vs J data

The slope of log(t50) vs log(J) for data taken at a constant temperature gives the current density exponent, n. The slope of this graph is -n.
The slope of log(t50) vs 1/T(°K) for data taken at a constant current density gives the activation energy, Ea.

From J. Towner, IRPS 1985, p. 81
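The two slope extractions can be sketched with an ordinary least-squares fit. The code below uses natural logarithms, so the slope of ln(t50) vs 1/T equals Ea/k; with base-10 logs the slope would need an extra factor of ln(10). The data are synthetic, generated from Black's equation with n = 2 and Ea = 0.8 eV purely to show that the fit recovers them.

```python
import math

K_EV = 8.62e-5  # Boltzmann's constant in eV/K


def slope(xs, ys):
    """Least-squares slope of y against x."""
    n = len(xs)
    xbar, ybar = sum(xs) / n, sum(ys) / n
    sxy = sum((x - xbar) * (y - ybar) for x, y in zip(xs, ys))
    sxx = sum((x - xbar) ** 2 for x in xs)
    return sxy / sxx


def current_exponent(js, t50s):
    """n from t50 vs J at constant temperature (slope of the log-log plot is -n)."""
    return -slope([math.log(j) for j in js], [math.log(t) for t in t50s])


def activation_energy(temps_k, t50s):
    """Ea in eV from t50 vs T at constant J (slope of ln t50 vs 1/T is Ea/k)."""
    return K_EV * slope([1.0 / t for t in temps_k], [math.log(t) for t in t50s])


def synthetic_t50(j, temp_k, n=2.0, ea=0.8, a=1e3):
    """Black's equation with made-up constants, used only to generate test points."""
    return a * j ** (-n) * math.exp(ea / (K_EV * temp_k))


js = [1e6, 2e6, 5e6]
temps = [450.0, 500.0, 550.0, 600.0]
print(current_exponent(js, [synthetic_t50(j, 500.0) for j in js]))
print(activation_energy(temps, [synthetic_t50(1e6, t) for t in temps]))
```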
Getting from $t_{50}$ to $t_{0.1}$

Suppose you know $t_{50}$ and want to know $t_{0.1}$

- For a normal distribution, the value lying $Z$ standard deviations from the mean is $x(Z) = \mu + Z*\sigma$
- For a lognormal distribution, $\ln(t)$ is normally distributed, so $t(Z) = \exp(\mu + Z*\sigma) = \exp(\mu) \cdot \exp(Z*\sigma)$
- But $\exp(\mu)$ is $t_{50}$
- $F(Z)$ is the CDF at $Z$ standard deviations from the mean
- $F(Z) = 0.001$ (i.e. 0.1%) at $Z = -3.09$ (i.e. $3.09\sigma$ below the mean)

Therefore $t_{0.1} = t_{50} \cdot \exp(-3.09\sigma)$
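The same conversion can be done for any failure fraction by looking up Z from the inverse normal CDF. A small sketch using the Python standard library; the t50 and sigma in the example call are hypothetical.

```python
import math
from statistics import NormalDist


def lognormal_time_at_fraction(t50, sigma, fraction=0.001):
    """Time at which `fraction` of a lognormal population has failed.

    t50 is the median time to failure; sigma is the lognormal sigma.
    """
    z = NormalDist().inv_cdf(fraction)  # about -3.09 for fraction = 0.001
    return t50 * math.exp(z * sigma)


# Hypothetical: t50 = 1000 hours, lognormal sigma = 0.5
print(lognormal_time_at_fraction(1000.0, 0.5))
```

Because sigma sits in the exponent, a small change in the lognormal sigma moves t0.1 substantially even when t50 is unchanged.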
EM specifications stated in several ways

- Maximum current density (MegaAmps/cm$^2$)
- Maximum current per width of metal line (mA/$\mu$m)
- Maximum current per contact or via (mA)

You may have to convert from one form to another
Converting from mA/μm to A/cm²

• The thickness of the metal lines for a particular metallization level is given in the design rules for the process
  – Suppose the metal thickness is \( d \) (in μm)
  – Suppose the EM spec is given as \( C \) mA/μm (\( C \) is the allowed current per μm of metal line width)
  – You want to find the current density spec \( J \), in A/cm²

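Current density is current divided by the cross-sectional area (width × thickness), so the current per unit width is divided by the thickness:

\[
J = \frac{C \text{ mA/μm}}{d \text{ μm}} = \frac{C}{d} \text{ mA/μm}^2
\]

\[
J = \frac{C}{d} \text{ mA/μm}^2 \times 10^{-3} \text{ A/mA} \times 10^8 \text{ μm}^2/\text{cm}^2
\]

\[
J = 10^5 \times \frac{C}{d} \text{ A/cm}^2
\]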
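A short helper makes the unit conversion explicit. It relies on current density being current divided by cross-sectional area (line width × metal thickness), so the per-width limit is divided by the thickness. The function name and the example values are illustrative.

```python
def ma_per_um_to_a_per_cm2(c_ma_per_um, thickness_um):
    """Convert a current-per-width limit (mA/um) to a current density (A/cm^2).

    J = C / d in mA/um^2, then 1e-3 A/mA and 1e8 um^2/cm^2 give a factor of 1e5.
    """
    return 1e5 * c_ma_per_um / thickness_um


# Hypothetical: 1 mA/um limit on a 0.5 um thick metal level
print(ma_per_um_to_a_per_cm2(1.0, 0.5))  # 200000.0 A/cm^2, i.e. 0.2 MA/cm^2
```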
Foundry EM Test Results

- EM test results presented for maximum Tj either as
  1. Maximum current density at which t0.1 is x years
     - x is typically 10 years or 11.41 years (100,000 hours)
  2. t0.1 at the maximum current density allowed by the design rules

You may have to convert from one form to another
Converting EM results from “Max. J allowed for useful life” to “lifetime at max spec”

- EM results are usually reported in one of two ways
  - A. The t0.1% at the maximum DC current density allowed by the design rules at the maximum Tj
    - Example: For the metal 1 level, the maximum current density specified by the design rules is 0.5MA/cm². The t0.1% is 155.9 years
  - B. The maximum DC current density allowable at the max. Tj which will result in 0.1% failures at the specified lifetime
    - Example: The EM lifetime for t0.1% is 10 years. The maximum allowed current density which meets this criterion is 2.3MA/cm²

- If the EM results are given as in B and you want them as stated in A, how do you convert?
Converting EM results - 1

• **Definitions**
  - $t_a$ = the actual EM lifetime to 0.1% fail
  - $t_s$ = the specified EM lifetime to 0.1% fail
  - $J_s$ = Maximum specified allowed current density in design rules
  - $J_a$ = Actual current density that will give 0.1% fails at time $t_s$
  - $A$ = Constant in Black’s equation
  - $n$ = current density exponent in Black’s equation (assumed known)
  - $E_a$ = activation energy in Black’s equation (assumed known)
  - $T$ = maximum specified $T_j$

• $t_a$ is the desired quantity. All other quantities are known.
• From Black’s equation, we have
  $$t = A \cdot J^{-n} \cdot \exp\left(\frac{E_a}{kT}\right)$$
Converting EM results - 2

For $t_s$ we thus have

$$t_s = A*J_a^{-n} \exp\left(\frac{E_a}{kT}\right)$$

For $t_a$ we have

$$t_a = A*J_s^{-n} \exp\left(\frac{E_a}{kT}\right)$$

Divide the two equations obtaining

$$\frac{t_a}{t_s} = (J_s/J_a)^{-n} = (J_a/J_s)^n \quad (E_a \text{ does not need to be known})$$

Thus

$$t_a = t_s \cdot (J_a/J_s)^n$$

Example: use the values from two slides ago and assume $n = 1.8$:

$$t_a = 10yr \cdot \left(\frac{2.3MA/cm^2}{0.5MA/cm^2}\right)^{1.8} = 155.9 \text{ years}$$
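The conversion is a one-line calculation once n is known. The sketch below reproduces the example, taking n = 1.8 as assumed on this slide; the function name is illustrative.

```python
def lifetime_at_spec_current(t_s, j_a, j_s, n):
    """t_a = t_s * (J_a / J_s)^n, from the ratio of two Black's-equation lifetimes.

    t_s: specified lifetime to 0.1% fail
    j_a: current density giving 0.1% fails at t_s
    j_s: maximum current density allowed by the design rules
    """
    return t_s * (j_a / j_s) ** n


# 10 years at 2.3 MA/cm^2, design-rule limit 0.5 MA/cm^2, n = 1.8
print(round(lifetime_at_spec_current(10.0, 2.3, 0.5, 1.8), 1))  # 155.9 years
```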
Characteristics of EM Test Results

• Considerable lot to lot variation
  – A factor of two between lots is not uncommon
    • EM is dominated by grain boundary and surface effects so is harder to control
  – Not a problem provided that lowest values have some margin above the specification
    • Can be a concern if most of the results are near the spec limit

• Vias typically have less EM margin than metal lines
  – Designers use multiple vias to reduce J in any one via

• Consistency between lots and low sigma indicate excellent metal deposition and etch control
How a Designer can Mitigate EM

• Have your product’s Tj below the maximum allowed for the technology

• Keep maximum current in lines and vias well below the spec limit

• Use multiple vias where possible to decrease current density per via
Intrinsic Silicon Wearout Mechanisms

- Hot Carrier Injection (HCI)
- Gate oxide integrity (GOI)
- Negative Bias Temperature Instability (NBTI)
- Electromigration (EM)
- Stress Migration (SM)
Example of Stress Migration

STRESS-INDUCED NOTCHING/VOIDING

Example of notches and voids in aluminum metallization caused by stress voiding

from J. W. McPherson, IRPS Tutorial, 1989
Stress Migration Mechanism

• Stress in the metal caused by residual stresses from the interlayer dielectric films
  – Films are stress free at the deposition temperature
  – Not stress free at room temperature or operating temperature

• Metal atoms and vacancies move to relieve the stress caused by CTE mismatch
  – Vacancies in the metal aggregate to form voids
  – Voids become larger, increasing the resistance of a metal line or via
  – Eventually, line or via becomes open
Characteristics of Stress Migration

• Affects both aluminum and copper metallization systems

• Accelerated with temperature up to a point
  – Not as high a temperature acceleration as other wearout mechanisms

• Occurs both while product is operating and while idle
  – Can cause the part to fail on the shelf, before it is used

• Can be alleviated by both processing and design
  – Processing steps designed to minimize residual stress
  – Redundant vias
  – Slotted metal lines
Accelerated Testing of Stress Migration

• Packaged parts or wafers put in high temperature for an extended time
  – Temperatures between 150 and 250°C
  – Typical times are 500 to 3,000 hours

• Package type may affect stress migration
  – Can affect the stress in the upper layers
  – More important for low-k dielectric
Model for Stress Migration

• Thermomechanical stress model (Texas Instruments)

\[ \text{TTF} \sim (\text{T}_o - T)^{-n} \exp(\text{E}_a/kT) \]

where \( \text{T}_o = \) stress free temperature for the metal (approx. dielectric deposition temperature)

\( n = 2-3 \)

\( \text{E}_a = \) activation energy (eV)

- 0.5-0.6 eV for grain boundary diffusion
- \( \sim 1 \) eV for intra-grain (bamboo) diffusion

The model predicts that the minimum TTF (maximum acceleration) will occur between 150-230°C
Stress Migration in Copper Vias

courtesy of Alvin Loke
Thermomechanical Model Applied to Cu Vias

Plot of model equation compared with experimental lifetime for copper vias. Because of the competing terms in the model, the acceleration factor is never very high.

Stress Migration in Copper

Cu vias are very susceptible to Stress Migration

The thermomechanical model fits SM in copper vias
(E. T. Ogawa et al., IRPS, 2002, p. 312; graph in previous slide)

Empirically determined constants: $E_a=0.74$ eV, $n=3.2$, $T_o=270^\circ C$
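The competing terms in the thermomechanical model can be seen by evaluating it over temperature with these constants: the stress term shrinks as T approaches the stress-free temperature while the diffusion term grows. The sketch below scans integer temperatures for the minimum relative TTF. The scan range and function names are choices made here, and the output is relative only, since the model's proportionality constant is not given.

```python
import math

K_EV = 8.62e-5  # Boltzmann's constant in eV/K


def sm_relative_ttf(temp_c, ea_ev=0.74, n=3.2, t0_c=270.0):
    """Relative TTF ~ (To - T)^-n exp(Ea / kT); valid only for temp_c < t0_c."""
    temp_k = temp_c + 273.15
    return (t0_c - temp_c) ** (-n) * math.exp(ea_ev / (K_EV * temp_k))


def worst_case_temperature(lo_c=50, hi_c=260):
    """Integer temperature (deg C) in [lo_c, hi_c] with the shortest relative TTF."""
    return min(range(lo_c, hi_c + 1), key=sm_relative_ttf)


t_worst = worst_case_temperature()
print(t_worst)
# Acceleration relative to a 100 C operating point stays modest
print(sm_relative_ttf(100) / sm_relative_ttf(t_worst))
```

The minimum should land inside the 150-230°C window quoted for the model, which is why stress migration bakes are run in that range rather than at the highest temperature available.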

The effect is much worse when there is a single via contacting a wide metal line below
- There is a large reservoir of vacancies in the wide line
- Vacancies migrate to the via and cause a void
- Problem enhanced by defects in the barrier liner around the copper (allows vacancy migration into via)
- Problem mitigated by using multiple vias in wide lines
How Foundries Specify Stress Migration

• Foundries typically run stress migration tests on “worst case” metal and via structures
  – Constant temperature for a predetermined duration
  – No bias

• Specification is number of failures (typically zero) out of the total sample size
  – Best if SM test is run for several thousand hours

• Foundries do supplemental testing to validate design rules
Mitigating Stress Migration in Circuit Design

• **Use multiple vias wherever possible**
  – Foundries now have design rules for number of vias

• **Particularly avoid connecting a minimum width line to a wide line above or below with a single via**
  – Foundries have design rules for the maximum width of line that may be connected by a single via
Summary

All I really needed to know about device reliability I learned in kindergarten

Follow the design rules

Design conservatively
   Recognize that devices will slow down over time

Hold your silicon suppliers accountable
   Insist on adequate data to support acceleration models

Stick together
   Have the packaging folks help reduce Tj
   Consult your friendly reliability engineers