IEEE - SCV - Reliability

Archives

Date

Type

Subject

Speaker

Place

Oct 30th, 2009 1/2 day Special Event Impact of Packaging Materials and Processes on Device Soft Error Rates. Co-sponsored by SCV Reliability, Cisco Systems, SCV CPMT. (Slides and videos) Richard Wong, Dr. Brett Clark, Brendan Dwyer-McNally, Andy Tseng, Charles Slayman, Jeff Wilkinson Cisco Systems, San Jose, CA
Oct 28th, 2009 Seminar Does Silicon 'Wear Out'?: An OEM's Perspective (Abstract) (Slides) Dr. Craig Hillman, CEO, DFR Solutions HP-Cupertino
Sept 30th, 2009 Seminar Green Reliability (Abstract) (Slides) Mike Silverman of Ops A La Carte HP-Cupertino
May 27th, 2009 Seminar Life Test Case Study for Heater Cartridge Reliability (Abstract) (Slides) Jim Krepelka HP-Cupertino

May 13th, 2009 combined meeting with CPMT

Special Event

Through Silicon Vias (TSVs): Design and Reliability (Abstract) (Slides)

Sergey Savastiouk, Ph.D.

Biltmore, 2151 Laurelwood Rd, Santa Clara, CA   408 346-4620

May 8, 2009

Special Event

Green Reliability  (Abstract) (Slides will be posted to the Ops Ala Carte page and then linked here)

Dr. Cheemin Bolinn

Dr. Alan Wood

Bryan Stallard

De Anza College, Cupertino, CA

March 25, 2009

Seminar

Modeling HDD Component Reliability, Application of Reliability Physics in Predicting Drive Failures.  (Abstract)(Slides)

Alexander Parkhomovsky, Ph.D.

HP-Cupertino

Feb 25, 2009

Seminar

Best of RAMS (Abstract)

(Click here for a few pictures from RAMS)

Panel, RAMS attendees

HP-Cupertino

Jan 14, 2009

Seminar

Design for Reliability in a Semiconductor World

(Abstract)(Slides)

Mr. George Denes from Ops A La Carte

Applied Materials,

San Jose, CA

Oct 29, 2008

Seminar

Photovoltaic Systems Reliability (Abstract)(Slides)

Dr. Jim Loman

HP-Cupertino

Oct 14, 2008

Seminar

Soft Errors – No Way To Escape (Abstract)

Dr. Helmut Puchner

National Semiconductor

Sept 24, 2008

Seminar

Solid State Drive (SSD) Reliability (Abstract)( DFR Slides) (SSD Rel Slides)

Dr. John McNulty

HP-Cupertino

May 28, 2008

Seminar

Electronic Prognostics (EP). (Abstract)(Slides)

Dr. Kenny Gross

HP-Cupertino

April 30, 2008

Seminar

Reliability Performance & Measurement of Repairable Systems. (Abstract)(Slides)

Dr. Wendai Wang

HP-Cupertino

April 24, 2008

Seated Lunch

CPMT joint session with SCV Rel.

"Sustainable Information Technology Ecosystem:  Optimizing Datacenter Power and Cooling."

(Abstract)

Chandrakant Patel

Sunnyvale, CA

 

April 9, 2008

Seated Dinner

CPMT joint session with SCV Rel.

"A New Perspective on Electronic Product Reliability: Prognostics and Health Management."

(Abstract)

Prof Michael Pecht

Sunnyvale, CA - see link for details

Mar 26, 2008

Seminar

Leading Indicators: A More Effective Method of Accelerated Life Testing (Abstract)(Slides)

Arthur Zingher

HP-Cupertino

February 27, 2008

Seminar

Best of RAMS (Abstract)

Panel, RAMS attendees

HP-Cupertino

January 23, 2008

Seminar

Best of ISTFA (Abstract)

Panel

HP-Cupertino

October 24, 2007

Seminar

Formation of a Warranty Chain Management Institute and its Applicability for Reliability Engineers (Abstract)(Slides)

Glen Griffiths, Allison Griffiths

HP-Cupertino

September 26, 2007

Seminar

Early Reliability Testing (Abstract)

Mike Silverman and Arthur Zingher

HP-Cupertino

May 30, 2007

Seminar

Designed Experiments and Reliability (Abstract)

Ed Russell

HP-Cupertino

April 25, 2007

Seminar

Mean Time to Data Loss: A Poor Choice for Assessing RAID Reliability (Abstract)

Jon Elerath

HP-Cupertino

March 28, 2007

Seminar

Design Traits of Effective Reliability Programs (Abstract) (Slides)

Fred Schenkelberg

HP-Cupertino

March 7, 2007

Seminar

Best of RAMS (Abstract)

Mike, Fred & Panel

HP-Cupertino

Jan. 31, 2007

Seminar

Best of ISTFA (Abstract)

Art & Panel

HP-Cupertino

October 25, 2006

Seminar

Trapped by MTBF? (Abstract)

Fred Schenkelberg

HP-Cupertino

Sept 27, 2006

Seminar

Lot Acceptance Test: A Viable Solution for Parts Incoming Inspection (Abstract)

Sorin Witzman

HP-Cupertino

May 31, 2006

Seminar

FA benefits, logistics, and limitations(Abstract)

Sorin Witzman and Fred Schenkelberg

HP-Cupertino

April 26, 2006

Seminar

Competitive Teardown Analysis(Abstract)

Doug Farel

HP-Cupertino

March 29, 2006

Seminar

Design for Warranty (DfW) Cost Reduction(Abstract)

Doug Farel

HP-Cupertino

March 1, 2006

Seminar

Best of RAMS(Abstract)

Mike & Fred

HP-Cupertino

Jan. 18, 2006

Seminar

Best of ISTFA(Abstract)

Art & Panel

HP-Cupertino

Oct. 26, 2005

Seminar

ESD Qualification Testing Needs to Grow Up (Abstract)

Jon Barth

HP-Cupertino

Sept. 28, 2005

Seminar

Built-In Soft Error Resilience for Robust System Design (Abstract)(Slides)

Subhasish Mitra

HP-Cupertino

Sept. 14, 2005

Seminar

Kirkendall Voids in Lead-Free Solder Joints: A Reliability Issue (Abstract)

Zequn Mei

 

May 25, 2005

Seminar

Best of ARS (Applied Reliability Symposium) (Abstract)(Slides1)(Slides2)

David Trindade, Mike Silverman, Fred Schenkelberg

HP-Cupertino

April 27, 2005

Seminar

When to use HALT and when to use ALT (Abstract)(Slides)

Mike Silverman

HP-Cupertino

March 23, 2005

Seminar

How cosmic rays cause computer downtime(Abstract)(Slides)

Ray Heald

HP-Cupertino

Feb. 23, 2005

Seminar

Best of RAMS(Abstract)

Panel

HP-Cupertino

Jan. 26, 2005

Seminar

Best of ISTFA(Abstract)

Panel

HP-Cupertino

Oct. 27, 2004

Seminar

Reliability Horror Stories(Abstract)

Jurek Zarzycki

HP-Cupertino

Sept. 29, 2004

Seminar

Are You Analyzing Reliability Data Correctly? Repairable Vs. Non-Repairable Systems: There Is a Difference(Abstract)(Slides)

David Trindade

HP-Cupertino

June 23, 2004

Seminar

Design and Analysis of Accelerated Reliability Tests(Abstract)(Slides)

Larry George

HP-Cupertino

May 26, 2004

Seminar

Cisco's High Level Failure Analysis Process(Abstract)

Dennis Pachuki

HP-Cupertino

April 28, 2004

Seminar

A Reliability Engineer's Use of Warranty Cost Information (Abstract)(Slides)

Fred Schenkelberg

HP-Cupertino

March 31, 2004

Seminar

Moving from ORT to HASA (Abstract)

Mike Silverman

HP-Cupertino

Feb 25, 2004

Seminar

Best of RAMS (Abstract)

Panel

HP-Cupertino

Jan 28, 2004

Seminar

Best of ISTFA (Abstract)

Panel

HP-Cupertino

Oct 29, 2003

Seminar

Power Supply Reliability – an Oxymoron? (Abstract) (Slides)

Dave Christiansen and Brooks Leman

HP-Cupertino

Sept 24, 2003

Seminar

To CRE or not to CRE? (Abstract)

Mike Silverman

HP-Cupertino

Aug 27, 2003

Seminar

How to make a CFO care about Reliability (Abstract) (Slides)

Alan Wood

HP-Cupertino

Jun 25, 2003

Seminar

SoC Defect Testing ( Abstract)

Samiha Mourad, Yacoub Elziq

HP-Cupertino

May 28, 2003

Seminar

Reliability of IC Packaging ( Abstract) (Slides)

Joseph Fjelstad

HP-Cupertino

Apr 30, 2003

Seminar

Reliability Evolution through Product Lifecycle Phases ( Abstract) ( Slides)

Lalit A Patel

HP-Cupertino

Mar 26, 2003

Seminar

Server Class Disk Drives: How Reliable are They? ( Abstract) ( Slides)

Jon Elerath

HP-Cupertino

Feb 26, 2003

Discussion

Best of RAMS ( Abstract)

Panel

HP-Cupertino

Jan 29, 2003

Discussion

Best of ISTFA ( Abstract)

Panel

HP-Cupertino

Jan 28-29, 2003

2-day Course

Reliability Concepts and Practices ( Abstract)

Mike Silverman

HP-Cupertino

Oct 30, 2002

Seminar

Accelerated Testing as Part of a Traditional Reliability Program ( Abstract)

Mike Silverman

HP-Cupertino

Sep 25, 2002

Seminar

Accelerated Life Testing in Micro- and Opto-Electronics: Its Objectives, Role, Attributes, Challenges, Pitfalls, Predictive Models, and Interaction with Qualification Tests ( Abstract)

E Suhir

HP-Cupertino

Aug 28, 2002

Seminar

Real-World Software Reliability Overview ( Abstract)

Alan Padula

HP-Cupertino

Jul 31, 2002

Seminar

Monitoring IC Degradation Internally ( Abstract)

Ted Lundquist

HP-Cupertino

 

Date

 October 28, 2009

Topic Does Silicon 'Wear Out'?: An OEM's Perspective
Abstract As integrated circuits continue to improve their performance through the reduction of physical dimensions, OEMs are increasingly concerned with the long-term implications. As stated recently by IEEE Spectrum,

“The notion that a transistor ages is a new concept for circuit designers,” … aging has traditionally been the bailiwick of engineers who guarantee the transistor will operate for 10 years or so…But as transistors are scaled down further and operated with thinner voltage margins, it’s becoming harder to make those guarantees… transistor aging is emerging as a circuit designer’s problem.

Unfortunately, the current industry structure tends to prevent effective communication and data gathering on this issue, as development activities are increasingly segmented and markets penalize those organizations that are open about existing and future reliability concerns.

This talk will review some of these struggles and roadblocks, discuss the typical OEM approach for handling these issues, and present a new web-based tool being developed by a consortium of industry and government organizations that is designed to provide insight into this real risk to lifetime performance. This tool employs data and formulae from semiconductor materials, circuit fundamentals, transistor behavior, circuit design and fabrication processes not only to calculate a failure rate, but also to give confidence intervals and produce a lifetime curve based on steady state and wearout behavior.

Speaker Dr. Craig Hillman, CEO, DFR Solutions
 

Date

 September 30, 2009

Topic Green Reliability
Abstract
Today, the topic of Green is discussed more and more. Every day we hear about companies "going green". 
But what does this really mean?
 
In fact, "Going Green" has many implications, from the materials being used to the type of energy being used 
and the quantity being consumed. And each aspect of "Going Green" has reliability implications. In fact, any time we change
material properties or design concepts, there are inherent reliability risks that need to be addressed.  Here is a partial list of
the high tech industries involved in the movement:
○ Solar 
○ Water Purification
○ Wind 
○ Telecommunications
○ Battery 
○ Green Cooling
○ Nuclear 
○ Server Farms
○ Electric/Gas Meters 
○ Electric Vehicles
○ Generators (methane) 
○ Fuel Cell 
Reliability is going to play an important role for all of these industries because they are trying to take the place of an
incumbent and must be as good or better:
○ Higher Reliability Demands
○ Higher Availability Demands
○ Higher Warranty Requirements
○ New Materials/New Risks
○ Pressure on reducing power
Speaker

Mike Silverman is an experienced leader in reliability improvement through analysis and testing. He has also led numerous quality system development programs. He has 22 years of reliability and quality experience, the majority in start-up companies. Mike is also an expert in accelerated reliability techniques, including HALT and HASS. He set up and ran an accelerated reliability test lab for 5 years, testing over 300 products for 100 companies in 40 different industries. Mike is founder and managing partner at Ops A La Carte, a Professional Business Operations Company that offers a broad array of expert services in support of new product development and production initiatives. Through Ops A La Carte, Mike has had extensive experience as a consultant to high-tech companies, and has consulted for over 100 companies including Cisco, Ciena, Siemens, Intuitive Surgical, Abbott Labs, and Applied Materials. He has consulted in a variety of different industries including telecommunications, networking, medical, semiconductor, semiconductor equipment, consumer electronics, and defense electronics. Mike has authored and published 7 papers on reliability techniques and has presented these around the world including China, Germany, and Canada. He has also developed and currently teaches 8 courses on reliability techniques. Mike has a BS degree in Electrical and Computer Engineering from the University of Colorado at Boulder, and is both a Certified Reliability Engineer and a course instructor through the American Society for Quality (ASQ), IEEE, Effective Training Associates, and Hobbs Engineering. Mike is a member of ASQ, IEEE, SME, ASME, PATCA, and IEEE Consulting Society.

 

Date

 May 27, 2009

Topic

 Life Test Case Study for Heater Cartridge Reliability.

Abstract

Often technical innovations come at a price. The constraints of time to market, physical size, and cost can leave designers faced with components at or beyond their specified limits. This presentation deals with just such a case, a wire wound NiChrome heater cartridge being operated beyond its thermal limit.

Early tests found that the heater cartridge could be run at over-temperature conditions for extended periods. However, since the manufacturer’s published thermal limit lined up well with comparable products from competitors, the design team took this issue very seriously. Alternate technologies were evaluated but rejected as not meeting some or all of the size, cost, and time to market constraints. The Reliability Engineering group was asked to assist by setting up a life test.

Deliverable #1: At the end of the test, the reliability goal should have associated confidence estimates spanning an 18 month normal use period.

Deliverable #2: At the end of the test, the failure modes and mechanisms should be ranked in Pareto order according to their frequency.

The life test investigation faced two significant constraints. First it needed to be completed in 5 months. Second there was only enough hardware to construct two dedicated test platforms.
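As a rough illustration of the two deliverables, the sketch below computes a lower confidence bound on reliability from a zero-failure life test and ranks observed failure modes in Pareto order. It assumes a constant failure rate (exponential life model), zero failures during the test, and a known acceleration factor. None of these is stated in the abstract, and the function names, parameters, and sample values are hypothetical.

```python
import math
from collections import Counter


def zero_failure_reliability_bound(test_unit_hours, acceleration_factor,
                                   mission_hours, confidence):
    """Lower confidence bound on reliability over mission_hours.

    Assumes an exponential life model and a test that ended with zero
    failures. test_unit_hours is the total hours summed over all units.
    """
    # Convert accelerated test time into equivalent normal-use hours.
    equivalent_hours = test_unit_hours * acceleration_factor
    # With zero failures, the upper bound on the failure rate is
    # lambda_U = -ln(1 - C) / T.
    lambda_upper = -math.log(1.0 - confidence) / equivalent_hours
    return math.exp(-lambda_upper * mission_hours)


def pareto_rank(failure_modes):
    """Return (mode, count) pairs sorted from most to least frequent."""
    return Counter(failure_modes).most_common()


if __name__ == "__main__":
    # Hypothetical numbers: two test platforms, about 5 months each.
    hours = 2 * 5 * 730
    bound = zero_failure_reliability_bound(hours, acceleration_factor=10.0,
                                           mission_hours=18 * 730,
                                           confidence=0.90)
    print(f"Lower 90% bound on 18-month reliability: {bound:.3f}")
    print(pareto_rank(["open winding", "lead fracture", "open winding"]))
```

With failures present, or with a wear-out (non-exponential) mechanism, a different model such as a Weibull fit would be needed; this sketch covers only the simplest case.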

Speaker

Jim Krepelka is part of the Life Science and Chemical Analysis Division at the Santa Clara Site of Agilent Technologies. For the last 9 years, he has coordinated both reliability test and environmental test activities for diverse products such as mass spectrometers, genetic array scanners, and laboratory robotics. Prior to his Reliability Engineering role, he spent 6 years in R&D for mass spectroscopy at the Chemical Analysis Solutions Division of Hewlett Packard. Prior to his R&D role, he spent 5 years in Manufacturing Engineering for spectrophotometry and bio-reagent products at the Scientific Instruments Division of Hewlett Packard. He received his BSEE degree from San Jose State University and a BS in Aviation Management from the University of Southern Illinois. He is a senior member of the ASQ and is an ASQ Certified Reliability Engineer (CRE).

 

Date

 May 13, 2009

Topic

 Through Silicon Vias (TSVs):  Design and Reliability

Abstract

Thinned wafers with through-silicon isolated metal vias (TSVs) open a valuable new design opportunity for both IC and package engineers. Through-silicon metal connectivity enables electrical and thermal performance advantages, back-side connectivity for two-sided semiconductor wafer and chip-level testing, as well as vertical interconnections for 3D IC stacking and micro- and opto-electronics.


The benefit of IC vertical interconnects such as TSVs has been presented in many papers. The term TSV was introduced by the presenter in 1996 and published in 2000. However, the physical design and reliability issues associated with copper through-silicon vias have not been fully resolved. The following problems present process challenges: metal voiding during filling, uniform via wall material deposition, and active IC surface connectivity to name a few.


Copper vias fabricated in a silicon wafer impose, at high temperatures, tensile stresses in silicon. The situation might be aggravated by the stress fields due to numerous vias and, if the vias are placed too close to each other, the thermally induced stresses might lead to cracking of the silicon wafer. In addition, the vias experience compressive 'hoop' stresses. These stresses could lead to the via buckling.


The presentation will address other design and reliability issues.

Speaker

Sergey Savastiouk, Ph.D. is the founder and CEO of ALLVIA, Inc., the first TSV foundry. He received his Ph.D. in EE from Moscow University and began his career as a Professor at Santa Clara University in 1993. After completing his MBA program in 1997, he founded Tru-Si Technologies, Inc., which pioneered ultra-thin (50um) wafer packaging equipment and through-silicon vias applications. Atmospheric Downstream Plasma (ADP) systems and NoTouch handling solutions have been used in production of smart cards and ultra-thin packages. In 1996, in the business plan for Tru-Si Technologies, he introduced the term "Through Silicon Vias (TSV)" which is now widely used in semiconductor and MEMS industries. He also published the term "TSV" in Solid State Magazine in January 2000. In 2004, he founded ALLVIA, Inc., a through-silicon via (TSV) specialty foundry, which has been commercializing its TSV capabilities for semiconductor, RF and MEMS industries. Dr. Savastiouk has authored numerous articles and received patents on TSV processes, equipment and applications.

 

Date

May 8, 2009

Topic

Green Reliability Event

Abstract

Today, the topic of Green is discussed more and more. Every day we hear about companies "going green". But what does this really mean?

In fact, "Going Green" has many implications, from the materials being used to the type of energy being used and the quantity being consumed. And each aspect of "Going Green" has reliability implications. In fact, any time we change material properties or design concepts, there are inherent reliability risks that need to be addressed.

Speaker and Abstract

Dr. Cheemin BoLinn, Peritus Partners: Getting the Bang - Green Agile Solutions Designed for Eco-Environmental Impact!

As significant investments in green technology continue, there are immense opportunities to demonstrate leadership in designing reliable green technologies and products. These green solutions can fulfill the need for increased power with a smaller or net zero carbon footprint.

However, robust reliability and maintenance practices will need to continue to be rapidly deployed. This focus will be required throughout the product life cycle of these green solutions, from concept design to prototype to manufacturing and throughout the ecosystem including supply chain and disposal. Even the traditional perspective of a data center has already given way to a broader ecosystem view that focuses on designing reliable energy efficiency not only within, but also among the data center, utilities, facilities, and surrounding environments. To receive optimal benefits, the "best practice" tools and processes utilized in other industries plus new agile innovations will be required for this new clean tech industry. These innovative solutions will be the bellwether that demonstrates that new energy efficiency measures need not come at the cost of reduced reliability.

Green solutions, proactively designed from concept stage, with the fundamental precept of reliability and energy conservation and reuse, have a strong business value proposition. Green reliability drives increased profits which can be a new funding source for increased R&D to fuel continuous improvement, critical for sustainable product development. Green reliable solutions have a compelling value proposition not only for the environment but for the profitable growth of businesses.

This session will examine:

1) Trends and directions in the clean tech industry for green, reliable solutions
2) Hotbed investment areas
3) Ecosystem view of energy efficiency
4) Business and environmental impact of green reliability and efficiencies
5) Value proposition for green "carbon neutral" technical designs
6) "Greening" as an economic growth driver and implications for marketing

Speaker and Abstract

Alan Wood, Sun Microsystems: The End of Redundancy - alternative methods for achieving high reliability

Minimizing power consumption has become a very important topic in the design of components, computers, and data centers. Industry analysts predict that, within the next few years, the cost of power will surpass the cost of compute equipment in the data center. Dependability research has not traditionally considered the cost or availability of power to implement the proposed techniques. If redundant equipment was needed for fault-tolerance, it was blithely assumed that the power and other facilities overhead was negligible. That assumption is no longer valid, and dependability research and practice need to change accordingly.

Many classical fault-tolerance techniques, such as voting, assume redundant compute resources. Having redundant hardware to derive duplicate results minimizes single point failures but is unlikely to be popular in a power-constrained data center. As an example, the IBM Z-series microprocessor has moved from duplication to error checking and sparing to save power. This type of dependability design shift is likely to accelerate. Active redundancy will not be affordable in many data centers. Even passive redundancy may be questioned because it takes space, and space usually equates to power, especially if the "sleep" mode for the passive components does not do a better job controlling leakage current.

At the chip level, the drive to lower voltages and power along with other factors will make logic more vulnerable to soft errors. Thus, the error rate is likely to increase as the redundancy decreases, making dependable computing a greater challenge. Techniques that detect/correct errors with minimal power consumption, such as information redundancy and self-checking logic together with some form of retry mechanism, will be favored. Power impact analyses of the proposed dependability technique may become more important than the traditional analyses of the performance and chip area impact.
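The contrast between voting on redundant results and detecting an error and retrying can be sketched in a few lines. This is only an illustration: the single parity bit stands in for "information redundancy", and the function names and the `read_fn` interface are hypothetical, not taken from the talk.

```python
from collections import Counter


def majority_vote(results):
    """Classical voting: needs several replicas computing the same result."""
    value, count = Counter(results).most_common(1)[0]
    if count * 2 <= len(results):
        raise ValueError("no strict majority among replicas")
    return value


def parity(word):
    """Even-parity bit of a non-negative integer."""
    return bin(word).count("1") % 2


def read_with_retry(read_fn, max_attempts=3):
    """Detect-and-retry: one unit plus one check bit, no replicas.

    read_fn is a hypothetical callable returning (word, stored_parity).
    """
    for _ in range(max_attempts):
        word, stored_parity = read_fn()
        if parity(word) == stored_parity:
            return word  # check passed, accept the value
    raise RuntimeError("error persisted after retries")


if __name__ == "__main__":
    print(majority_vote([7, 7, 9]))  # three replicas, one disagrees
    reads = iter([(0b1011, 0), (0b1010, 0)])  # first read has a bit flip
    print(read_with_retry(lambda: next(reads)))
```

The voter costs roughly three times the hardware and power of a single unit, while the parity check costs one extra bit plus time spent on the occasional retry, which is the power-for-recovery-time trade the abstract points toward. A single parity bit only detects an odd number of flipped bits.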

At the server level, it will be important to find different architectures and software approaches for dependability in the absence of hardware redundancy. It may be possible to consider trading recovery time for power. As flash memory is used as a partial DRAM replacement in some applications, it may be possible to consider storing checkpoints or other data in flash memory to improve dependability. More than ever, dependability research and practice will need to focus on the entire system to devise methods that do not require additional power.

At the data center level, operations may impact dependability. Equipment may be turned off to save power, causing more power and temperature cycling. Equipment with a sleep mode that uses minimal power but can quickly be activated when new tasks arrive will be valuable. Functions such as automatically throttling power when not needed and providing real-time information on power utilization will also be popular. Data center cooling may actually improve as power-saving features such as hot-aisle containment that reduce hot spots are implemented.

The dependability community is faced with the challenge of using less power while maintaining the same levels of dependability. Existing assumptions need to be altered, and new research is needed.

Speaker and Abstract

Bryan Stallard, Ops A La Carte: EDA Usage in Reliability Aspects of Green Technology

The use of Electronic Design Automation (EDA) in GREEN situations, specifically in lowering energy consumption, is a natural extension of long-existing usage in two related contexts: thermal management in complex systems, and portable-device battery-limited usage. For years semiconductor companies, and their EDA suppliers, have battled power dissipation issues, driven by the twin facts that MOS ICs burn power proportional to clock frequencies (ignoring minimal idling currents), and that as majority-carrier devices, the MOSFET elements have channel resistances that increase with the absolute temperature, ultimately increasing RC time constants until they limit both raw speed and race-condition margins.
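The frequency dependence described here is commonly written as P = a * C * V^2 * f for CMOS switching power. The sketch below assumes that textbook relation and ignores leakage (the "minimal idling currents" above); the parameter names and values are illustrative only.

```python
def dynamic_power(activity_factor, capacitance_farads, supply_volts, clock_hz):
    """Switching power in watts under the textbook CMOS model.

    activity_factor: fraction of the capacitance switched per cycle (0..1).
    Leakage (idle) power is deliberately left out.
    """
    return activity_factor * capacitance_farads * supply_volts ** 2 * clock_hz


if __name__ == "__main__":
    # Hypothetical block: 1 nF switched capacitance, 1.2 V supply.
    for f_hz in (0.5e9, 1.0e9, 2.0e9):
        watts = dynamic_power(0.2, 1e-9, 1.2, f_hz)
        print(f"{f_hz / 1e9:.1f} GHz -> {watts:.3f} W")
```

Power scales linearly with clock frequency but with the square of supply voltage in this model.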

At board levels, the issue bifurcates between commercial usages, where convective cooling is dominant, and military/aerospace use where conductive cooling is primary. Both high-altitude and space applications essentially discount convection completely. In neither group is there any appreciation of excessive power dissipation.

In portable devices, the ability to use a product without continuous recharging has long been a primary selling point for premium offerors, and a source of irritation for generic knock-off users. So, mobile devices are already biased (sic) toward energy efficiency.
Three topics are worth considering, or revisiting, in a GREEN light:
1) Conversion of existing circuitry to emphasize or improve energy-consumption performance;
2) Modification of production designs to meet RoHS or other material issues;
3) Development of ongoing engineering practices to institute or upgrade Green features to a level parallel with established customer "-ility" expectations (reliability, quality, availability, maintainability).

 

Date

Mar 25, 2009

Topic

Modeling HDD Component Reliability, Application of Reliability Physics in Predicting Drive Failures.

Abstract

Significant cost savings in high volume product development and manufacturing were achieved by addressing reliability design flaws early in the stage gate process. Critical Path Analysis (CPA), Failure Mode and Effects Analysis (FMEA), and Fault Tree Analysis were effectively used to identify critical design flaws at early stages of spindle motor product development and demonstration. Understanding of the physics of failure and performance characteristics of the product was used to develop a number of predictive reliability models and highly accelerated engineering stress tests. The models were developed from first principles as well as from design of experiments (DOE) approaches. They were used to evaluate design feasibility and to provide early feedback to the design engineers. The tests were implemented for design validation at prototype build stages as well as for the continuous identification of process excursions at the factories in ongoing reliability tests (ORT). This approach led to a significant decrease in the customer product integration defective parts per million (DPPM) and annualized failure rate (AFR) in the field, as well as to a significant reduction in design and manufacturing cost.
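For readers unfamiliar with the two metrics named above, a minimal sketch of how DPPM and AFR are commonly computed follows. It assumes a constant failure rate when converting MTBF to AFR and 8760 power-on hours per year; the function names and sample figures are hypothetical and are not Seagate data.

```python
import math

HOURS_PER_YEAR = 8760  # assumes continuous (24x7) operation


def dppm(defective_units, total_units):
    """Defective parts per million."""
    return defective_units / total_units * 1_000_000


def afr_from_mtbf(mtbf_hours, power_on_hours_per_year=HOURS_PER_YEAR):
    """Annualized failure rate as a fraction, constant failure rate assumed."""
    return 1.0 - math.exp(-power_on_hours_per_year / mtbf_hours)


if __name__ == "__main__":
    print(f"DPPM: {dppm(5, 10_000):.0f}")
    print(f"AFR:  {afr_from_mtbf(1_000_000):.2%}")
```

For large MTBF values the AFR is close to the simpler ratio of power-on hours per year to MTBF.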

Speaker

Alexander Parkhomovsky is managing the Reliability Engineering and Materials Science group at Seagate Technology Motor Design Division. He has been with Seagate in increasing responsibility roles for 6 years. Alexander Parkhomovsky developed and implemented a number of disruptive cost saving design for reliability solutions and reliability tests at Seagate Technology and Seagate volume suppliers worldwide. Dr. Parkhomovsky’s areas of expertise include Reliability Management, Design for Reliability and Reliability Physics.

Prior to joining Seagate, Alexander Parkhomovsky worked at Applied Materials, where he developed advanced process control solutions for thin solid film processes. He has authored and coauthored numerous scientific publications and is a co-inventor on 3 U.S. patents. Dr. Parkhomovsky received his Ph.D. degree in Materials Science and Engineering from the University of Minnesota and his B.S. and M.S. degrees in Chemical Engineering from the D. Mendeleyev University of Chemical Technology, Moscow, Russia.

Alexander’s contact address is alexander.parkhomovsky@seagate.com

 

Date

Feb 25, 2009

Topic

Best of RAMS

Abstract

The 55th Annual Reliability and Maintainability Symposium (RAMS) was held in Fort Worth, Texas, on January 26-29, 2009. For those of you who could not attend the symposium, we will bring the highlights to you: the best papers presented during the 4-day RAMS event. The theme of this year’s RAMS was “Reliability as a Competitive Advantage – From Theory to Practice”. Information on RAMS is available on the web at http://www.rams.org/. The panel is being organized by Mike Silverman and Fred Schenkelberg. If you are interested in helping select papers, being on the panel, leading a discussion, or contributing in another way, please e-mail us at reliability@ieee.org.

Speaker

Mike Silverman from Ops A La Carte will lead the panel discussion.

 

Date

Jan 14, 2009

Topic

Design for Reliability in the Semiconductor World

Abstract

One of the hottest phrases in reliability today is "Design for Reliability".
Very few people really understand what this means and even fewer
understand how to apply it at the semiconductor component level.
In this presentation, I will address this issue by revealing the basic
building blocks of a DfR program for semiconductors. This does
not just apply to semiconductor companies, but also applies to any
company that is using custom chips and must manage their suppliers.

As with every other type of product, semiconductors require a carefully
laid out reliability plan that addresses each phase of the product
life cycle. When we do this, we can then integrate this process
into the development process to have a very cohesive reliability
program. Below, we have described the reliability life cycle process
for semiconductor products. As with other types of Design for
Reliability programs, we must address this as a phased approach:

In the Development Phase, we use concepts such as
Built-In Reliability (BIR), Design for Reliability (DfR), and Design for
Testability (DfT). Each of these areas will be covered in this
presentation.

Speaker

Mr. George Denes from Ops A La Carte

 

Mr. Denes is an associate consultant with Ops A La Carte.
George has over 30 years of semiconductor industry experience
as a reliability and product engineer. He worked with memory
products, ASIC, analog, digital and mixed signal circuits. He
was very involved with CMOS RF products as well. He is a hands-on
engineer, who qualified and released to manufacturing many
IC products. He worked at Intel, Motorola, National Semiconductor,
Signetics/Philips, Teradyne, and at small start-up companies
(Plus Logic, Chip Express, Amalfi Semi). George designed all required
reliability testing hardware (burn-in boards, etc.) and planned and
directed all reliability qualification tests, as well as product
characterization and volume production testing. He directed all
failure analysis of IC devices with Test Engineering and FA laboratories.

 


Date

Oct 29, 2008

Topic

Photovoltaic Systems Reliability

Abstract

Solar Energy is the ultimate clean energy source. However, solar energy is still more expensive than other means of energy production. Residential and commercial systems must be highly reliable, have low maintenance costs, and last a long time, for there to be effective payback. The talk will discuss the techniques used to ensure these goals are met.

Speaker

Jim Loman is the Department Manager for Design Reliability at Space Systems / Loral in Palo Alto. In this capacity, he manages Reliability Engineering, Parts Engineering, Product Development Quality and Software Quality. Prior to this, he worked for GE for over 20 years in various positions, including manager of the Electronics Reliability and Quality Program at the GE Research Lab and Technology Leader for Solar at GE Energy. He holds a Ph.D. in Physics.

 


Date

Oct 14, 2008

Topic

Soft Errors – No Way To Escape

Abstract

Soft errors have gained a lot of attention since the mid-1990s, when
contaminated packaging material caused massive device failures.
Since then, many improvements have mitigated most of the alpha
particle related soft errors. Due to continuous technology
scaling, however, cosmic-ray-induced soft errors have gained
significance and are nowadays the major threat to modern
semiconductor devices. Most vulnerable are SRAM devices, since
they operate at high speed and with relatively low storage node
charges. We will present the fundamentals of soft errors, their
sources in common semiconductors, and the different available
mitigation techniques.

Every attendee will receive a free copy of "Soft Errors - History,
Trends and Challenges" published by J.Ziegler and H.Puchner.

Speaker

HELMUT PUCHNER received his Ph.D. degree in Electrical Engineering from
the Technical University of Vienna, Austria, in 1996. His Ph.D. thesis
adviser was Siegfried Selberherr, the world-famous scholar and MINIMOS
inventor. He joined LSI Logic in Santa Clara in 1997 as a TCAD device
development engineer. In 2002 he joined Cypress Semiconductor, where he
is currently Device Director responsible for transistor development, TCAD,
device reliability and ESD. His research interests include soft error
mitigation and power transistors, including their reliability aspects.
He has published more than 80 conference/journal articles and holds 18 US
patents.

He is a Senior IEEE Member and is currently serving on the program committees
of IEDM (MT - memory technology) and IRPS (soft errors).

 

Speaker Contact: Manuj Rathor

 


Date

Sept 24, 2008

Topic

Solid State Drive (SSD) Reliability

Abstract

This presentation will address three critical areas of solid state drive (SSD) reliability: (i) single- and multi-event upsets; (ii) the influence of decreasing dimensions on the critical failure mechanisms of integrated circuits; and (iii) high density packaging and solder joint stability. Event upsets will be addressed through a review of existing industry data and identification of trends indicating potential risks in correcting these events. Emerging research data and models will be presented covering the dominant integrated circuit failure mechanisms (time-dependent dielectric breakdown, hot carrier injection, negative bias temperature instability, electromigration, and soft errors due to alpha particle-induced damage), illustrating the sensitivity of each mechanism to feature dimensions and presenting evidence that the current industry approach to predicting failure rate and time to failure may be inadequate. Lastly, analysis of failure mechanisms for advanced packaging solutions (stacked die, silicon vias, emerging solder joint geometries & materials, and others) will be discussed, based on an understanding of interconnect degradation behavior and of how the mechanical and material properties of high density packaging are limiting useful lifetime.

Speaker

John has 20 years of experience in reliability physics and materials characterization, covering a range of applications including biomedical devices, structural composites, microelectronics, and opto-electronics.  He has been consulting with DfR Solutions for over a year, and manages projects & business development for their clients on the West Coast and throughout the Pacific Rim.  He received his Ph.D. in Materials Engineering and performed post-doctoral work at UC Santa Barbara, and his B.S. in Materials Science & Engineering from UC Berkeley.

 


Date

May 28, 2008

Topic

Electronic Prognostics (EP)

Abstract

In today's world of eCommerce, down time for enterprise servers in business-critical 
datacenters costs millions of dollars per hour.  The System Dynamics, Characterization and 
Control group at Sun Microsystems has pioneered new proactive fault monitoring innovations 
for enhancing the reliability, availability, and serviceability of computer servers.  The key 
enabler for Electronic Prognostics is a patented continuous system telemetry harness (CSTH), 
implemented in software, which collects time series signals relating to the health of 
dynamically executing servers and their components, network interconnects, and peripherals.  
These time series provide quantitative metrics associated with physical variables (distributed 
temperatures, voltages, and currents throughout the system), performance variables (loads, 
throughputs, queue lengths, etc.), and various quality-of-service (QOS) metrics.  The CSTH 
signals are continuously archived to an offline circular file (i.e. the "Black Box Flight 
Recorder"), and are also processed in real time using advanced pattern recognition for proactive 
anomaly detection.  The pattern recognition provides sensitive early detection of a variety of 
mechanisms that are known to cause downtime in enterprise datacenters, including:
  Environmental issues (thermal anomalies, air-flow restrictions, degraded fan motors); 
  Software aging phenomena (memory leaks, resource contention); 
  Degraded/failed sensors; 
  Degradation of power supplies, capacitors, and interconnects; 
  and "inferential sensing" capability 
    (wherein a failed sensor is replaced with a highly accurate analytical estimate).  
 

Sun Microsystems' CSTH, coupled with advanced pattern recognition techniques adapted from the commercial nuclear and aerospace industries, is helping to increase component reliability margins, meet system availability goals, and optimize energy utilization for enterprise computing datacenters.

 

Speaker

Kenny Gross received his Ph.D. in nuclear engineering from the U. of Cincinnati in 1977. Kenny is a Distinguished Engineer for Sun Microsystems and is team leader for the System Dynamics Characterization and Control team in Sun's Physical Sciences Research Center in San Diego. Kenny specializes in advanced pattern recognition, continuous system telemetry, and dynamical system characterization for improving the reliability, availability, and serviceability of enterprise computing systems. Kenny has 194 US patents issued and pending, 168 scientific publications, and was awarded a 1998 R&D 100 Award for one of the top 100 technological innovations of that year, for an advanced statistical pattern recognition technique (MSET) that was originally developed for nuclear and aerospace applications and is now being used for a variety of applications to improve quality, availability, and energy efficiency for enterprise computer servers.

 


Date

April 30, 2008

Topic

Reliability Performance & Measurement of Repairable Systems

Abstract

Many multi-component systems, such as automobiles, computer servers, production and industrial equipment, appliances, engines, and power systems, are generally repairable in service. It is also known that the reliability performance metrics and analysis methodologies for repairable systems differ considerably from those for non-repairable systems. Surprisingly, the reliability theory for repairable systems is underutilized. Most reliability engineering practices for repairable systems were directly adopted from methodologies for non-repairable systems, which are “well” developed. Many assumptions are commonly made in engineering practice, consciously or unconsciously, such as an exponential distribution for the “time between failures” (TBF) and even for the “time to restore” (TTR).

Traditional reliability theory (which primarily applies to non-repairable systems) is built upon the fundamental random variable of the time to failure (TTF). The TTF distribution defines all reliability metrics mathematically and practically. Similarly, should the behavior of both time between failures (TBF) and time to restore (TTR) together describe the reliability performance of a repairable system?

Because of the complexities associated with modeling repairable system reliability, most research studies have focused only on evaluating limiting statistical values such as steady-state system availability and steady-state system MTBF. These limited evaluations, in turn, are typically based on one of the stationary stochastic point processes, such as the Homogeneous Poisson Process, Non-Homogeneous Poisson Process, Renewal Process, Markov Process, or Regeneration Process, which basically assume a certain level of “renewal” at the system level. In reality, maintenance activities such as repairing, refurbishing, and replacing take place at the module level. The reliability performance of a multi-component repairable system is actually an assembly of the realizations of all its components.

This seminar will start with some case studies on the behavior of both TBF and TTR for simple repairable systems and for multi-component repairable systems, from which some simple conclusions are drawn that can be applied directly in engineering practice. Reliability performance and measures for repairable systems are then presented and discussed.
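
As a rough illustration of the limiting metrics mentioned in the abstract, the sketch below simulates a single repairable unit as an alternating sequence of up times (TBF) and down times (TTR) and compares the observed availability with the steady-state value MTBF / (MTBF + MTTR). Everything in it is an assumption made for illustration, not material from the seminar: the function names, the exponential distributions for both TBF and TTR (the very assumption the abstract questions), and the 1000-hour and 10-hour means.

```python
import random


def steady_state_availability(mtbf, mttr):
    """Limiting availability of an alternating up/down process."""
    return mtbf / (mtbf + mttr)


def simulate_availability(mtbf, mttr, cycles, seed=0):
    """Estimate availability from simulated failure/restore cycles.

    Assumes exponentially distributed TBF and TTR (illustrative only).
    """
    rng = random.Random(seed)
    up_time = 0.0
    down_time = 0.0
    for _ in range(cycles):
        up_time += rng.expovariate(1.0 / mtbf)    # time between failures
        down_time += rng.expovariate(1.0 / mttr)  # time to restore
    return up_time / (up_time + down_time)


if __name__ == "__main__":
    mtbf_hours, mttr_hours = 1000.0, 10.0  # hypothetical values
    print("steady state:", steady_state_availability(mtbf_hours, mttr_hours))
    print("simulated:   ", simulate_availability(mtbf_hours, mttr_hours, 20000))
```

Replacing the two exponential draws with another distribution of the same mean leaves the long-run availability unchanged, which is one reason a steady-state figure alone says little about the TBF and TTR behavior the seminar focuses on.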

Speaker

Dr. Wendai Wang is a senior member of technical staff at Applied Materials, where he leads design-for-reliability for new product development. Prior to this position, he was the Reliability Technical Leader at General Electric, where he successfully led many innovative “design-for-reliability” projects as well as the GE Reliability Council. He received his B.S. and M.S. in electro-mechanical engineering from Shanghai Jiaotong University and his Ph.D. in reliability engineering from the University of Arizona. He has about 20 years of industry and research experience in reliability engineering. He is the author of over 30 publications and invention disclosures, and his work and research areas include DFR methodology and process, reliability modeling and analysis, reliability testing, mechanical and electronics reliability, physics of failure, and reliability training. Wendai is also an active member of the RAMS Management Committee and is currently the Society of Reliability Engineers (SRE) Silicon Valley Chapter Vice President.

 


Date

Mar 26, 2008

Abstract

Real-world problems motivate engineering leadership: teams must overcome real-world challenges and constraints in life test and operational maintenance. Ops A La Carte has therefore invented a practical new technology that uses a "Leading Indicator", and a patent is pending. Leading indicators can help with these common challenges and constraints:

-There are too few specimens and too little time available for life testing.

-For a life test to be meaningful, the acceleration must be mild.

-Life testing results are too late to improve product development.

-Maintenance ought to be based on the real-time status of each specific unit.

-By contrast, engineering methods typically describe the average status of a population of similar units in similar operation.

Leading indicators can provide advance warning and can improve Accelerated Life Tests, Manufacturing Screening, and Operational Maintenance.


Speaker

Arthur Zingher is a Senior Reliability Consultant with Ops A La Carte. Previously, he was a Distinguished Engineer at Sun Microsystems, focused on computer hardware. Earlier, he was a Research Staff Member at IBM, Yorktown NY. His education included a Ph.D. in Physics from U.C. Berkeley and a B.A. in Physics & Math from Columbia.

Arthur is a versatile physicist, engineer and inventor, with expertise and accomplishments in: Test & reliability, instrumentation, math; Electronic packaging, cooling, electronics, mechanics; Manufacturability and rapid technology troubleshooting for very acute commercial problems. This included successes in products and factories, plus more than 33 issued patents.

Also, Arthur is active in Highly Concentrated Solar Photovoltaic Power generation.

 


Date

February 27, 2008

Topic

Best of RAMS

Abstract

The 54th Annual Reliability and Maintainability Symposium (RAMS) was held in Las Vegas on January 28-31, 2008. For those of you that couldn't attend, we will bring the symposium to you (except for the slot machines). This evening offers highlights of the best papers presented at RAMS during this 4 day event. The theme of this year’s RAMS was “Dawn to Dusk – Life Cycle Prescriptions”. Information on RAMS is available on the web at http://www.rams.org/. The panel is being organized by Mike Silverman and Fred Schenkelberg. If you are interested in helping select papers, being on the panel, leading a discussion, or contributing in another way, please e-mail us at reliability@ieee.org.

Speaker

Panel, RAMS attendees

Vita

Mike Silverman from Ops A La Carte will lead the panel discussion.

Mike Silverman is an experienced leader in reliability improvement through analysis and testing. He has also led numerous quality system development programs. He has 22 years of reliability and quality experience, the majority in start-up companies. Mike is also an expert in accelerated reliability techniques, including HALT and HASS. He set up and ran an accelerated reliability test lab for 5 years, testing over 300 products for 100 companies in 40 different industries. Mike is founder and managing partner at Ops A La Carte, a Professional Business Operations Company that offers a broad array of expert services in support of new product development and production initiatives. Through Ops A La Carte, Mike has had extensive experience as a consultant to high-tech companies, and has consulted for over 100 companies including Cisco, Ciena, Siemens, Intuitive Surgical, Abbott Labs, and Applied Materials. He has consulted in a variety of different industries including telecommunications, networking, medical, semiconductor, semiconductor equipment, consumer electronics, and defense electronics. Mike has authored and published 7 papers on reliability techniques and has presented these around the world including China, Germany, and Canada. He has also developed and currently teaches 8 courses on reliability techniques. Mike has a BS degree in Electrical and Computer Engineering from the University of Colorado at Boulder, and is both a Certified Reliability Engineer and a course instructor through the American Society for Quality (ASQ), IEEE, Effective Training Associates, and Hobbs Engineering. Mike is a member of ASQ, IEEE, SME, ASME, PATCA, and IEEE Consulting Society.

 


Date

January 23, 2008

Topic

Best of ISTFA

Abstract

The International Symposium for Testing and Failure Analysis (ISTFA) provides a forum for the latest developments in wafer, chip, package, and board level test and failure analysis. The 30th ISTFA was held November 4-8, 2007, in San Jose. Information on ISTFA is available on the web at http://www.asminternational.org/istfa/. The January Santa Clara Valley IEEE Reliability Society meeting will feature a panel discussion of selected papers from ISTFA. The panel is being organized by Art Rawers. We are looking for additional panel members, especially ISTFA attendees. If you are interested in helping select papers, being on the panel, leading a discussion, or contributing in another way, please e-mail us at reliability@ieee.org.

Speaker

Panel

Vita

Mike Silverman from Ops A La Carte will lead the panel discussion.

Mike Silverman is an experienced leader in reliability improvement through analysis and testing. He has also led numerous quality system development programs. He has 22 years of reliability and quality experience, the majority in start-up companies. Mike is also an expert in accelerated reliability techniques, including HALT and HASS. He set up and ran an accelerated reliability test lab for 5 years, testing over 300 products for 100 companies in 40 different industries. Mike is founder and managing partner at Ops A La Carte, a Professional Business Operations Company that offers a broad array of expert services in support of new product development and production initiatives. Through Ops A La Carte, Mike has had extensive experience as a consultant to high-tech companies, and has consulted for over 100 companies including Cisco, Ciena, Siemens, Intuitive Surgical, Abbott Labs, and Applied Materials. He has consulted in a variety of different industries including telecommunications, networking, medical, semiconductor, semiconductor equipment, consumer electronics, and defense electronics. Mike has authored and published 7 papers on reliability techniques and has presented these around the world including China, Germany, and Canada. He has also developed and currently teaches 8 courses on reliability techniques. Mike has a BS degree in Electrical and Computer Engineering from the University of Colorado at Boulder, and is both a Certified Reliability Engineer and a course instructor through the American Society for Quality (ASQ), IEEE, Effective Training Associates, and Hobbs Engineering. Mike is a member of ASQ, IEEE, SME, ASME, PATCA, and IEEE Consulting Society.

 


Date

October 24, 2007

Topic

Formation of a Warranty Chain Management Institute and its Applicability for Reliability Engineers

Abstract

Warranty costs in the US alone run in the region of $28B per annum [ref - Warranty Week, 3rd March 2007 edition]. Reliability Engineers have a significant influence on the failure rates of equipment, which is a key driver of warranty events and hence cost. Glen will outline the path he has taken, beginning with an investigation into how to improve reliability engineering practices in Hewlett Packard, that led to the creation of a new Warranty Conference series and culminated in the formation of the Institute of Warranty Chain Management (iWCM), of which he is currently President. Along the way, he will discuss the iWCM’s applicability and usefulness to Reliability Engineers and introduce the Director of the Warranty Chain Management Conference series, Alison Griffiths.

Speaker

Glen Griffiths and Alison Griffiths, HP

Vita

Glen Griffiths is the director of Hewlett Packard’s Global Engineering Services, responsible for providing engineering and regulatory support across all HP hardware businesses. He manages over 180 people spread across 24 countries, with his teams supporting over $60B of product sales annually.

Glen’s professional experience has centered on electrical, avionic and reliability engineering as well as systems engineering. Glen retired from the UK Royal Air Force, after serving 22 years as an Engineering Officer. In his previous roles he was operations manager for a squadron of Jaguar strike attack aircraft, managed the software development and test teams for the Harrier aircraft (AV8B) fleet and managed a multi-national team of software reliability and engineering R&D advisors for the Typhoon aircraft. During his last 3 years in the military he was responsible for setting Reliability & Maintainability requirements for all United Kingdom Military Air systems procurement and he also acted as the UK reliability specialist advisor to the US Department of Defense Joint Strike Fighter Project.

Glen holds a Masters in Business Administration, a Masters in Reliability and Maintainability Engineering and an Honors degree in General Engineering. He is a Chartered Engineer in the IEEE and also holds the position of President of the Institute of Warranty Chain Management (iWCM).

Alison Griffiths is the President of the business management consultancy ALG Associates, LLC, which she originally founded in the UK in 2002, transferring to the US in 2004. Alison has 15 years of business, management and consultancy experience; having worked in the public and private sector, manufacturing, retail and customer service industries. She has led a number of key process improvement initiatives and has key experience in assessing organizational needs, developing strategies and improvement plans, problem and conflict resolution, as well as staff training and development.

In 2004 Alison launched the Warranty Chain Management (WCM) series of conferences to address the important need for a forum where professionals can meet to discuss warranty issues and begin to develop warranty management as a recognized business discipline. Following a call to action for the development of a recognized warranty institute at the WCM 2006, Alison was instrumental in forming and serving on a Charter Team to create the Institute of Warranty Chain Management (iWCM). She incorporated the iWCM in California in December 2006 and now serves as the Executive Director to the Board of Directors.

Alison studied at Manchester Metropolitan University, UK and has a BA(Hons) in Business Studies and an MBA.

 


Date

September 26, 2007

Topic

Early Reliability Testing

Abstract

Almost every industry is competing on Reliability and therefore, there is a need to develop more reliable products faster. However, reliability testing and improvement often are performed when development is nearly complete, when the product is nearly frozen, when time is short, and when improvements are more difficult and costly. Instead, Early Reliability Testing is a development tactic that can provide higher reliability and quality, with less cost and time for development, and less development risk.

The big drawbacks to early testing are: low test coverage, low number of samples, and samples with immature manufacturing processes. However, all of these issues can be addressed and overcome, and when this happens, the benefits of early life testing are substantial.

In this presentation, we shall illustrate many of the techniques we use for Early Reliability Testing, point out all of the benefits and cost-savings, and show how to work around the potential issues.

Speaker

Mike Silverman and Arthur Zingher, Ops A La Carte

Vita

·  Mike Silverman is an experienced leader in reliability improvement through analysis and testing. He has also led numerous quality system development programs. He has 22 years of reliability and quality experience, the majority in start-up companies. Mike is also an expert in accelerated reliability techniques, including HALT and HASS. He set up and ran an accelerated reliability test lab for 5 years, testing over 300 products for 100 companies in 40 different industries. Mike is founder and managing partner at Ops A La Carte, a Professional Business Operations Company that offers a broad array of expert services in support of new product development and production initiatives. Through Ops A La Carte, Mike has had extensive experience as a consultant to high-tech companies, and has consulted for over 100 companies including Cisco, Ciena, Siemens, Intuitive Surgical, Abbott Labs, and Applied Materials. He has consulted in a variety of different industries including telecommunications, networking, medical, semiconductor, semiconductor equipment, consumer electronics, and defense electronics. Mike has authored and published 7 papers on reliability techniques and has presented these around the world including China, Germany, and Canada. He has also developed and currently teaches 8 courses on reliability techniques. Mike has a BS degree in Electrical and Computer Engineering from the University of Colorado at Boulder, and is both a Certified Reliability Engineer and a course instructor through the American Society for Quality (ASQ), IEEE, Effective Training Associates, and Hobbs Engineering. Mike is a member of ASQ, IEEE, SME, ASME, PATCA, and IEEE Consulting Society.

·  Arthur Zingher is a Senior Reliability Consultant with Ops A La Carte. Previously, he was a Distinguished Engineer at Sun Microsystems, focused on computer hardware. Earlier, he was a Research Staff Member at IBM, Yorktown NY. His education included a Ph.D. in Physics from U.C. Berkeley and a B.A. in Physics & Math from Columbia.

Arthur is a versatile physicist, engineer and inventor, with expertise and accomplishments in: Test & reliability, instrumentation, math; Electronic packaging, cooling, electronics, mechanics; Manufacturability and rapid technology troubleshooting for very acute commercial problems. This included successes in products and factories, plus more than 33 issued patents.

Also, Arthur is active in Highly Concentrated Solar Photovoltaic Power generation.

 


Date

May 30, 2007

Topic

Designed Experiments and Reliability

Abstract

Generally, the goal of a reliability analysis of lifetime data is to develop an estimate of how failures occur over time for a family of stress conditions. In either software or hardware reliability, experiments can be performed to determine whether two or more alternatives (e.g., of process, procedure, or design) may lead to significantly different reliability in a product. The speaker first presents a type of designed experiment for a very unusual setting -- choosing a procedural behavior among all possible procedural behaviors to optimize a general binary outcome to a 1 state (success or win) in the presence of multiple unknown countering procedures attempting to force a 0 state (failure or loss).

Initially, it's not at all clear that the subject is suitable for designed experiments of any kind, due to extremely long runtimes, an environment that actively changes over time, and the fact that there are typically between 2^100 and 2^3000 likely micro-behaviors at each of approximately 400 stages, most likely all correlated to some extent both within an instance and across time, which may be considered the X's. To make matters even worse, the Y for which it is desired to optimize the behavior is not at all obvious. Should this setting be considered a reliability problem (failures do occur in time) or some other kind of experiment? The speaker shows how to cut through the muck and redefine the problem in a tractable manner.

The subject of the experiment is: How to win a "turns" based computer game -- at the highest level of the game -- in just 8 "runs" of the game. Given that a single game could take weeks to complete, 8 runs is very "expensive." Is it possible to reduce the "cost?" The concepts behind setting up a similar type of experiment with multiply censored data will be addressed and an analysis will be demonstrated. Finally -- a few comments will be made on better methods for software and hardware design using statistical procedures.

Speaker

Ed Russell, National Semiconductor

Vita

Ed Russell is a statistician at National Semiconductor in the SPICE Modeling group. He is currently working on analysis of electrical design rules and corner models, and is beginning implementation of statistical procedures for analog design. Prior to working at National, Ed was a statistician at Sun Microsystems, where he developed and taught statistical procedures for 65nm and 45nm CMOS design. Before joining Sun, he served as the Director of Reliability for Cypress Semiconductor and, in addition, provided company-wide statistical consulting with a general focus on test, product characterization, and process development for new technology. Ed has also held various management and individual contributor roles with AMD in both Austin and Sunnyvale, working principally in reliability and yield improvement. While at AMD, Ed set up several SPC programs in the product lines, test, and logistics and served as the statistical programs manager at Sematech on assignment.

In addition to his role as a statistician in the semiconductor industry, Ed was Chevron’s and Gulf Oil’s expert in 3D signal processing, worked on evaluating the performance of the US High Level Nuclear Waste Repository, and has been heavily involved in software development and systems analysis. Ed received an MS degree in Mathematical Statistics from Purdue University with a concentration in Tests of Hypothesis and Decision Theory. After graduation, he continued his education with a focus on Biostatistics at the University of Washington, Multivariate Statistics at Ohio State, and Geophysics at The University of Houston.

 


Date

April 25, 2007

Topic

Mean Time to Data Loss: A Poor Choice for Assessing RAID Reliability

Abstract

Oftentimes, the reliability of a Redundant Array of Inexpensive Disks (RAID) is estimated using an equation for the mean time to data loss (MTTDL), a closed-form expression for N+1 redundancy with repair. However, MTTDL has two inherent deficiencies that render it extremely inaccurate. First, this presentation will discuss the statistical concepts that are implicitly required for the MTTDL to be accurate and provide data to illustrate how these requirements are not met when modeling RAID systems. Second, the model implied by MTTDL allows only one failure rate and one repair rate for all hard disk drives in a RAID group. Data will be presented to show that latent defects and defect scrubbing must also be included in the model. An improved modeling method will be offered that measures reliability in terms of the number of double disk failures as a function of time. These are plotted using mean cumulative failure functions.
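
For reference, the kind of closed-form MTTDL expression the abstract refers to can be sketched as follows. This is the textbook three-state Markov model for a group of N data disks plus one parity disk, with a single constant failure rate and a single constant repair rate; it is an assumption that this matches the equation critiqued in the talk, and the function names and numeric values are hypothetical.

```python
def mttdl_exact(n_data, mttf_hours, mttr_hours):
    """Mean time to data loss for N data disks + 1 parity disk.

    Textbook Markov model: states are 0 failed, 1 failed (rebuilding),
    and data loss. Assumes one constant failure rate and one constant
    repair rate for every disk, as the abstract describes.
    """
    lam = 1.0 / mttf_hours   # per-disk failure rate
    mu = 1.0 / mttr_hours    # repair (rebuild) rate
    return ((2 * n_data + 1) * lam + mu) / (n_data * (n_data + 1) * lam ** 2)


def mttdl_approx(n_data, mttf_hours, mttr_hours):
    """Common simplification, valid when repair is much faster than failure."""
    return mttf_hours ** 2 / (n_data * (n_data + 1) * mttr_hours)


if __name__ == "__main__":
    # Hypothetical inputs: 7 data disks + 1 parity, 1e6 h MTTF, 12 h rebuild.
    n, mttf, mttr = 7, 1.0e6, 12.0
    print("exact  MTTDL (h):", mttdl_exact(n, mttf, mttr))
    print("approx MTTDL (h):", mttdl_approx(n, mttf, mttr))
```

With repair much faster than failure, the two results agree to within a fraction of a percent, so the simpler MTTF^2 / (N (N+1) MTTR) form is the one usually quoted. Neither version accounts for latent defects or scrubbing, which is the gap the abstract points out.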

Speaker

Jon Elerath, Network Appliance

Vita

Dr. Elerath received his B.S.M.E. and M.S. in Reliability Engineering from the University of Arizona, and his Ph.D. from the University of Maryland. He has over 30 years of experience in reliability engineering and engineering management at General Electric Co., Tandem Computers, Compaq, IBM and Network Appliance, Inc. Dr. Elerath has experience in all aspects of reliability in a commercial environment, including reliability goals, specifications, predictions, modeling, trade-off studies, testing, and data collection and analysis. Most of his last 17 years have focused on the reliability of hard disk drives. He chaired the Redwood Empire Section of ASQ and has been the Chairman of the Reliability Committee of the International Disk Drive Equipment and Materials Association (IDEMA) for over 10 years. He contributed extensively to the development of IEEE-1413.1, a guide for reliability predictions, and has published 23 papers on reliability, 14 of them relating to hard disk drives.

 


Date

March 28, 2007

Topic

Design Traits of Effective Reliability Programs

Abstract

Having the privilege of interviewing a cross-section of more than 80 product development teams to understand their reliability programs has led to a few observations. Only a rare few organizations have a mature, cost-effective, and efficient reliability program. A clear understanding of your organization's reliability program and a clear vision of what is possible form the crucial first step toward making systematic program improvements. This talk explores the key traits that separate good from great reliability programs. Marketing, product volume, complexity, and organizational structure do not tend to matter. A proactive approach, statistical thinking, fact-based decision making, and integrated reliability tools do tend to make a difference. This talk outlines how to assess your organization and highlights the key traits of merely good and simply great reliability programs.

Speaker

Fred Schenkelberg

Vita

Fred joined the ranks of independent consultants to focus on reliability engineering in 2004. He currently works with a wide variety of clients using reliability assessments as a starting point to develop detailed reliability plans and programs. Also, he uses his reliability engineering and statistical knowledge to design and conduct accelerated life tests. Fred previously worked at HP starting in Vancouver, WA. He later joined HP's ESTC division in Palo Alto, CA where he co-founded the HP Product Reliability Team. In that role, Fred was responsible for the consulting, training, and community building aspects of HP's Product Reliability Program. He was also responsible for research and development on selected product reliability management topics at HP. Prior to joining HP's ESTC division, Fred worked as a design for manufacturing engineer on HP's DeskJet printers. Before HP, Fred worked with Raychem Corporation in various positions, including research and development of accelerated life testing of polymer based heating cables. Fred has a Bachelor of Science in Physics from the United States Military Academy and a Master of Science in Statistics from Stanford University. Fred is an active member of the RAMS Management Committee and is currently the IEEE Reliability Society Santa Clara Valley Chapter Vice President and ASQ Reliability Division Treasurer.

 

Top

Date

March 7, 2007

Topic

Best of RAMS

Abstract

The 53rd Annual Reliability and Maintainability Symposium (RAMS) was held in Orlando on January 22-25, 2007. For those of you that couldn't attend, we will bring the symposium to you (except for the Disney World visit). This evening offers highlights of the best papers presented at RAMS during this 4-day event. The theme of this year’s RAMS was “Improving Products and Processes Through Information”. Information on RAMS is available on the web at http://www.rams.org/. The panel is being organized by Mike Silverman and Fred Schenkelberg. If you are interested in helping select papers, being on the panel, leading a discussion, or contributing in another way, please e-mail us at reliability@ieee.org.

Speaker

Mike Silverman and Fred Schenkelberg

Vita

 

 

Top

Date

January 31, 2007

Topic

Best of ISTFA

Abstract

The International Symposium for Testing and Failure Analysis (ISTFA) provides a forum for the latest developments in wafer, chip, package, and board level test and failure analysis. The 32nd ISTFA was held November 12-16, 2006, in Austin. Information on ISTFA is available on the web at http://www.asminternational.org/istfa/. The January Santa Clara Valley IEEE Reliability Society meeting will feature a panel discussion of selected papers from ISTFA. The panel is being organized by Art Rawers. We are looking for additional panel members, especially ISTFA attendees. If you are interested in helping select papers, being on the panel, leading a discussion, or contributing in another way, please e-mail us at reliability@ieee.org .

Speaker

Art Rawers and Panel

Vita

Arthur G. Rawers is the Manager of the WW Device Analysis (FA) Laboratories at Xilinx, Inc. With laboratory facilities in San Jose, California; Dublin, Ireland; and in the near future Singapore, the group does diagnostic, electrical, and physical analysis of the semiconductor products designed, developed and produced by Xilinx.

He received a BSEE from the University of Illinois (UIUC) with an emphasis on semiconductor physics. Over the years, he has worked for a variety of companies: large and small, with fab and fabless, public and start-up. In 1979, as a reliability engineer at Texas Instruments in Dallas, he was responsible for device qualifications of internally developed, application-specific, custom linear and digital VLSI devices for Military Systems. The lure of Silicon Valley brought him West in 1985 to a start-up semiconductor manufacturer to start their Reliability and FA departments. Now, four companies later, he continues to be involved with product reliability issues and device failure analysis. His curiosity to understand how strong a product is, how it can be improved, what it takes to make it fail, and how it failed has not waned over the years.

Art has been a member of the IEEE since college. He has been an active member of the IEEE Reliability Society serving in a variety of officer capacities for the Local RS Chapter, served as an elected member of the IEEE RS EXCOM Board, chaired the technical program committee (failure analysis) for IRPS, and held various management roles in the organization that runs the annual IEEE IRPS conference.

 

Top

Date

October 25, 2006

Topic

Trapped by MTBF?

Abstract

Mean Time Between Failure, MTBF, is the worst four-letter acronym. It leads to misunderstanding, misinterpretation and misinformed decisions. And, MTBF is widely used. It is embedded in countless industries as ‘the way’ of discussing reliability. As you already know, MTBF is the parameter for the exponential life distribution and it has common estimation techniques. MTBF is the key element of modeling, planning, test development and vendor selection among many other elements.

In this presentation, let’s review the common issues around MTBF and how these problems have led to significant errors. Then let’s explore what to do given the organization and industry will continue to use MTBF. What questions should you ask? How should you clearly explain the issues and proper use of MTBF and related probabilities? And, what basic calculation should you conduct every time you run across MTBF?

Within the world of reliability engineering and statistics, there has rarely been a metric so widely used yet so difficult to understand properly. Many engineers and managers do not take the time to really understand basic statistics. With very simple formulas, arguments, and definitions, we can help our respective industries advance product reliability. Used properly, there is nothing wrong with MTBF. Let’s talk about how to use MTBF properly and encourage others to do the same.
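The abstract does not say which basic calculation the speaker has in mind. One plausible candidate, sketched here as an assumption, is converting an MTBF into a probability of survival over a stated time under the exponential life distribution, R(t) = exp(-t / MTBF). The function name and the example figures are illustrative only.

```python
import math


def reliability(t_hours, mtbf_hours):
    """Probability of surviving t_hours without failure, assuming an
    exponential life distribution whose mean is mtbf_hours."""
    return math.exp(-t_hours / mtbf_hours)


# Hypothetical example: a product quoted at 50,000 hours MTBF.
mtbf = 50_000.0
one_year = 8_760.0  # hours of continuous operation in one year

print(f"R(one year) = {reliability(one_year, mtbf):.3f}")
print(f"R(MTBF)     = {reliability(mtbf, mtbf):.3f}")
```

Under this model only about 37% of units are expected to survive to the MTBF itself, which is the kind of result that corrects the common reading of MTBF as a failure-free period.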

Speaker

Fred Schenkelberg, OPS A La Carte and FMS Reliability

Vita

Fred Schenkelberg is a senior consultant/technical director at both FMS Reliability and OPS A La Carte. Fred joined the ranks of independent consultants to focus on reliability engineering in June 2004. He currently works with a wide variety of clients using reliability assessments as a starting point to develop detailed reliability plans and programs. Also, he uses his reliability engineering and statistical knowledge to design and conduct accelerated life tests.

Fred previously worked at HP starting in February 1996 in Vancouver, WA. In January 1998, he joined HP’s Electronic Systems Technology Center (ESTC) in Palo Alto, CA where he co-founded the HP Product Reliability Team. In that role, Fred was responsible for the consulting, training, and community building aspects of HP's Product Reliability Program. He was also responsible for research and development on selected product reliability management topics at HP.

Fred has a Bachelor of Science in Physics from the United States Military Academy and a Master of Science in Statistics from Stanford University. Fred is an active member of the RAMS Management Committee and is currently the IEEE Reliability Society Santa Clara Valley Chapter Vice President and ASQ Reliability Division Treasurer. Contact Fred at www.fmsreliability.com.

 

Top

Date

Sept 27, 2006

Topic

Lot Acceptance Test: A Viable Solution for Parts Incoming Inspection

Abstract

Modern electronic equipment involves assembly of complex parts, provided by many original suppliers. Ensuring the quality of these parts is essential for ensuring the quality of the end product. Yet, classic incoming inspection that would require physical inspection of the parts received is practically impossible due to the complexity of the parts, the excessive cost of testing equipment, and the scarcity of qualified testing personnel at the point of assembly. For these reasons, many manufacturers are forced to rely on the outgoing inspection provided by the supplier, which in many circumstances is proving to be inadequate for reasons such as lack of knowledge on the supplier site regarding critical parameters, errors in equipment settings, or even administrative errors in shipments and deliveries.

Considering that the defective parts are normally associated with periods in which the manufacturing process was less controlled, it is expected that the defective parts are not randomly distributed, but are concentrated in a small number of manufacturing lots. Identification and rejection of these lots at the incoming inspection point would significantly improve the supply quality.

Lot acceptance testing is a logical and relatively inexpensive solution to this problem. With lot acceptance testing, we take a sample of parts from each lot and verify the variations of certain parameters that are easy to monitor and are known to correlate to the critical parameters that are important for the application. In many cases the supplier already collects the data required for such an analysis, and no additional tests are required.
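The sampling idea in the abstract can be sketched as follows. This is a minimal illustration, not the speaker's procedure: it assumes a single monitored parameter per lot, and the acceptance limits, lot names, and measurements are all hypothetical.

```python
from statistics import mean, stdev


def screen_lots(lots, mean_limits, max_stdev):
    """Return the names of lots whose sample mean falls outside mean_limits
    or whose sample standard deviation exceeds max_stdev."""
    low, high = mean_limits
    rejected = []
    for name, sample in lots.items():
        m, s = mean(sample), stdev(sample)
        if not (low <= m <= high) or s > max_stdev:
            rejected.append(name)
    return rejected


# Hypothetical samples of an easy-to-monitor parameter, five parts per lot.
lots = {
    "LOT-A": [10.1, 9.9, 10.0, 10.2, 9.8],   # tight spread
    "LOT-B": [10.9, 9.1, 10.6, 9.3, 10.1],   # same mean, much wider spread
}
rejected = screen_lots(lots, mean_limits=(9.5, 10.5), max_stdev=0.3)
print(rejected)
```

Both lots have the same mean, so only the variation check separates them, which mirrors the abstract's emphasis on verifying the variation of the monitored parameters rather than their average.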

Speaker

Sorin Witzman, Ops A La Carte

Vita

Sorin Witzman has a Bachelor in Mechanical Engineering, Master of Engineering in Chemical engineering from the Polytechnic Institute of Bucharest and a Master of Science in Heat and Mass Transfer from Technion Haifa. He has numerous publications in the field of electronics packaging, thermal management and physics of failure of electronics. He is also coauthor of numerous reliability standards presently used by the electronic industry. Sorin is the Technical chairman for IMAPS, International Microelectronics And Packaging Society, NorCal chapter. Previous experience includes work with US military industry, and management of quality and reliability groups at Nortel Networks, Spectrian/Remec, EiC Corp.

 

Top

Date

May 31, 2006

Topic

Failure Analysis: Benefits, logistics, and limitations

Abstract

Failure analysis is an important process for product improvement. Understanding the root cause of a component failure is the most important step in product quality and reliability improvement, and properly performed failure analysis is an important part of this process.

This talk presents the results and conclusions of an extensive study of failure analyses performed on a large population of field failures in fielded telecommunication equipment. The results show the presence of trends that, if recognized in time, can be used to achieve significant product improvements. The presentation shows the role of failure analysis during the qualification program, in process failure, and ORT (Ongoing Reliability Test) failure.

Even though it is an important part of the process, failure analysis alone can rarely produce the desired quality improvement. The long cycle time (normally longer than 6 months between the failure and the time the results are in), high cost, and lack of resolution limit the use of the classic failure analysis process in solving quality problems. The authors investigate other methods that can be more effective and less costly than "end of the line" failure analysis in improving product reliability, improving yield, and reducing field failures.

Speaker

Fred Schenkelberg & Sorin Witzman, Ops A La Carte

Vita

Fred Schenkelberg has a Bachelor of Science in Physics from the United States Military Academy and a Master of Science in Statistics from Stanford University. Fred is an active member of the RAMS Management Committee and currently the local IEEE Reliability Society Chapter Vice Chair. Previously he worked with Raychem and at various groups within HP, including ESTC and DeskJet.

Sorin Witzman has a Bachelor in Mechanical Engineering, Master of Engineering in Chemical engineering from the Polytechnic Institute of Bucharest and a Master of Science in Heat and Mass Transfer from Technion Haifa. Sorin is the Technical chairman for IMAPS, NorCal chapter. Previous experience includes work with US military industry, and management of quality and reliability groups at Nortel Networks, Spectrian/Remec, EiC Corp.

 

Top

Date

April 26, 2006

Topic

COMPETITIVE TEARDOWN ANALYSIS

Abstract

When introducing a product into a new market, determining the current market players' reliability performance may lead to a competitive advantage. Or, if your competition is using reliability as a marketing lead, does your product match their performance or do they have the advantage? This presentation will cover 2 analysis techniques that can be used to provide you with information critical to obtaining a competitive advantage.

Speaker

Doug Farel

Vita

Doug is a Senior Reliability Consultant and Course Instructor with Ops A La Carte. Doug Farel has over 20 years of experience in reliability engineering and has worked on medical, communications, mainframe processor and storage technology products. He has extensive experience in reliability analysis, reliability testing and field reliability/customer feedback programs. Doug has a BS degree from the United States Military Academy at West Point, NY; an MSEE degree from Georgia Tech in Atlanta; and an MS degree in reliability from the University of Arizona in Tucson. He is a Certified Reliability Engineer (CRE) through the American Society for Quality (ASQ) and currently teaches the local CRE Preparation Course. He is also a member of IEEE and has been chosen to give this same talk at the IEEE TC-7 (a technical committee of the CPMT group) workshop on Accelerated Stress Testing in October.

 

Top

Date

March 29, 2006

Topic

Design for Warranty (DfW) Cost Reduction

Abstract

Warranty costs for the computer and other high tech industries were greater than $6.2 billion in 2004. Less than 35% of that was material costs; the majority was related to support process costs. Consequently, product design teams are being challenged to reduce the warranty costs by both improving product reliability (AFR) and designing in less expensive warranty support processes (i.e., customer self diagnostics and repair). This 1 day seminar will introduce a proven warranty event cost model and supporting methodologies that help teams identify warranty cost reduction solutions which integrate both component fail rate reduction strategies and strategies that shift the support process mix to less expensive processes. The warranty cost model constructs the cost of warranty events from both event frequencies and the specific support process costs used to resolve each event. Case studies will be presented describing how teams used this model to: analyze warranty costs by event type, prioritize what events needed cost reduction options, and develop cost saving estimates for each prioritized event. In the afternoon the class will explore and practice using methods to rank the feasibility of proposed design options by identifying all diagnostic tools, product features and support tools/capabilities needed to realize its potential cost savings. Finally, how these models and tools integrate with FMEA and other reliability planning tools will be explored.
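As a rough illustration of how a warranty event cost model of this kind can be assembled, the sketch below builds the cost of each event type from its frequency and the mix of support processes used to resolve it. It is an assumption-laden example, not the seminar's model: the event names, rates, process names, costs, and function names are all hypothetical.

```python
def event_cost(mix, process_costs):
    """Expected cost of resolving one event, given the share of events
    handled by each support process (shares should sum to 1)."""
    return sum(share * process_costs[proc] for proc, share in mix.items())


def warranty_cost_by_event(events, process_costs, units):
    """Annual warranty cost per event type: units x event rate x cost per event."""
    return {
        name: units * e["rate"] * event_cost(e["mix"], process_costs)
        for name, e in events.items()
    }


# Hypothetical cost per use of each support process, in dollars.
process_costs = {"self_repair": 20.0, "phone_support": 50.0, "onsite_visit": 300.0}

# Hypothetical annual event rates per unit and support process mix.
events = {
    "fan_failure": {"rate": 0.02, "mix": {"self_repair": 0.5, "onsite_visit": 0.5}},
    "power_supply": {"rate": 0.01, "mix": {"phone_support": 0.2, "onsite_visit": 0.8}},
}

costs = warranty_cost_by_event(events, process_costs, units=10_000)
ranked = sorted(costs.items(), key=lambda item: item[1], reverse=True)
for name, cost in ranked:
    print(f"{name}: ${cost:,.0f} per year")
```

Either lever named in the abstract shows up directly here: lowering an event's `rate` models a fail rate reduction, while moving share from `onsite_visit` to `self_repair` models a shift to a less expensive support process.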

Speaker

Robert Mueller

Vita

Bob Mueller is a senior consultant/program manager at both OPS A La Carte and the Marisan Group. He is a product development professional with 30+ years of technical and management experience in software intensive product development as well as in R/D process and quality systems development including extensive consulting experience with cross-functional product development teams and senior management. After receiving his M.S. in Physics in 1973, Bob joined Hewlett-Packard in Cupertino, CA in IC process development. In the next three decades before leaving HP, he held numerous positions in R/D, R/D management and technical consulting including management positions in the computer, analytical, and healthcare business units and in HP's internal engineering consulting organization. For the past half decade Bob has focused his consulting on both product development and product support processes and practices that drive warranty costs and customer satisfaction.

 

Top

Date

March 1, 2006

Topic

Best of RAMS

Abstract

The 52nd Annual Reliability and Maintainability Symposium (RAMS) was held in Newport Beach on January 23-26, 2006. For those of you that couldn't attend, we will bring the symposium to you. This evening offers highlights of the best papers presented at RAMS during this 4-day event. The theme of this year's RAMS was "The Role of Reliability and Maintainability in Managing Risk". Information on RAMS is available on the web at http://www.rams.org/. The panel is being organized by Mike Silverman and Fred Schenkelberg. If you are interested in helping select papers, being on the panel, leading a discussion, or contributing in another way, please e-mail us at reliability@ieee.org.

Speaker

Mike and Fred

Vita


 

Top

Date

January 18, 2006

Topic

Best of ISTFA

Abstract

The International Symposium for Testing and Failure Analysis (ISTFA) provides a forum for the latest developments in wafer, chip, package, and board level test and failure analysis. The 31st ISTFA was held November 6-10, 2005, in San Jose, California. Information on ISTFA is available on the web at http://www.istfa.org/

The January Santa Clara Valley IEEE Reliability Society meeting is a joint meeting with the Santa Clara Valley IEEE ESD Society. The joint meeting will feature a panel discussion of selected papers from ISTFA. The panel is being organized by Art Rawers. We are looking for additional panel members, especially ISTFA attendees. If you are interested in helping select papers, being on the panel, leading a discussion, or contributing in another way, please e-mail us at reliability@ieee.org .

Speaker

Art Rawers and Panel

Vita

Arthur G. Rawers is the Manager of the WW Device Analysis (FA) Laboratories at Xilinx, Inc. With laboratory facilities in San Jose, California; Dublin, Ireland; and in the near future Singapore, the group does diagnostic, electrical, and physical analysis of the semiconductor products designed, developed and produced by Xilinx.

He received a BSEE from the University of Illinois (UIUC) with an emphasis on semiconductor physics. Over the years, he has worked for a variety of companies: large and small, with fab and fabless, public and start-up. In 1979, as a reliability engineer at Texas Instruments in Dallas, he was responsible for device qualifications of internally developed, application-specific, custom linear and digital VLSI devices for Military Systems. The lure of Silicon Valley brought him West in 1985 to a start-up semiconductor manufacturer to start their Reliability and FA departments. Now, four companies later, he continues to be involved with product reliability issues and device failure analysis. His curiosity to understand how strong a product is, how it can be improved, what it takes to make it fail, and how it failed has not waned over the years.

Art has been a member of the IEEE since college. He has been an active member of the IEEE Reliability Society serving in a variety of officer capacities for the Local RS Chapter, served as an elected member of the IEEE RS EXCOM Board, chaired the technical program committee (failure analysis) for IRPS, and held various management roles in the organization that runs the annual IEEE IRPS conference.

 

Top

Date

October 26, 2005

Topic

ESD Qualification Testing Needs to Grow Up

Abstract

Reliability testing of modern ICs remains based on measurements of real ESD events collected over twenty years ago. ESD qualification standards and specifications claim to be accurate simply because the percentage of field returns hasn't increased. Our measurements of the real events and plans to improve ESD test parameters will become acceptable when the cost of ESD failures is identified as a dollar cost to the industry each year.

Speaker

Jon Barth, Barth Electronics

Vita

Jon Barth earned his BS Degree in Electronic Engineering from Valparaiso Technical Institute in 1959 and has concentrated on the design of test equipment since then. He founded Barth Electronics in 1964 and has manufactured accurate high voltage, sub-nanosecond pulse components sold to government laboratories throughout the world. In 1994, Barth Electronics expanded into the design and manufacture of ESD test equipment. He has been a member of the ESDA Standards Committees for device and system ESD for eleven years. The Barth commercial TLP system is used for ESD design throughout the world. He is a member of the APS, EOS/ESD Association, SEMI and life member of the IEEE.

 

Top

Date

September 28, 2005

Topic

Built-In Soft Error Resilience for Robust System Design

Abstract

Radiation-induced logic soft errors in flip-flops and combinational logic pose a major challenge in the design of robust systems for enterprise computing and networking applications. In the past, soft errors were of concern especially for space applications. Increasing system-level soft error rates in advanced technologies, and stringent system data integrity requirements demand special design techniques to protect systems from logic soft errors. I discuss the impact of technology scaling on soft error rates, evaluation of run-time behaviors of systems in the presence of soft errors, and design of robust architectures incorporating Built-in-Soft-Error-Resilience (BISER) techniques. Design-for-test and debug resources are reused for soft error protection during normal system operation, resulting in a 20-fold reduction in flip-flop soft error rate, with negligible area and speed impact, and 3-5% system-level power overhead. In comparison, classical redundancy techniques introduce 40-100% power, performance and area overheads.

Speaker

Subhasish Mitra, Intel Corporation

Vita

Subhasish Mitra is a Principal Engineer at Intel Corporation where he is responsible for developing enabling technologies for robust system design -- Design for Reliability, Testability and Debug -- in advanced technologies. He is also a Consulting Assistant Professor in the Electrical Engineering Department of Stanford University, and the Associate Director of the Stanford Center for Reliable Computing. Before joining Intel, he led the Stanford project on "Reliability Obtained by Adaptive Reconfiguration" sponsored by DARPA as part of the Adaptive Computing Systems program. He also consulted for several companies including the Agilent Technologies Laboratories. His research interests include robust system design, VLSI design and test, CAD, fault-tolerant computing and computer architecture. He received his Ph.D. in Electrical Engineering from Stanford University.

Dr. Mitra has published more than 70 technical papers in leading conferences and journals, and invented design and test techniques that have seen wide-spread proliferation in the industry. He has received several awards, including the 2005 IEEE Circuits and Systems Society Donald O. Pederson Award for the Best Paper published in the IEEE Transactions on CAD and the 2004 Intel Achievement Award, Intel's highest corporate award, "for the development and deployment of a breakthrough test compression technology."

 

Top

Date

September 14, 2005

Topic

Kirkendall Voids in Lead-Free Solder Joints: A Reliability Issue

Abstract

Previous studies demonstrate extensive Kirkendall voids at the interface of a solder joint to a copper substrate, and their significant effects on the impact and shock strength of the solder joints. This talk focuses on two issues: the condition for the void formation; and the effect of voids on solder joint reliability. Samples of electronic assemblies of different packages aged or thermal-cycled were cross-sectioned by either FIB or sputter etching. The results show that voids at the Cu/solder interface formed extensively in some cases, but not so much in others. So far, it is not clear exactly what factors control the void formation; it seems that the Cu plating process and the small concentration of Ni in either the solder or the substrate influence the void density and distribution. Shock strength at 400G of BGA packages aged for 20 days at 125°C did not degrade; the failure occurred by either delamination at the fiber/resin interface underneath the non-solder-mask-defined Cu pads, or inside the solder where they were close to the solder-mask-defined Cu pads. We also curve-fitted the result of void growth vs. time at different temperatures, to use it for prediction of the voided area at the product's service condition.

More Information: www.cpmt.org/scv/

Speaker

Zequn Mei, Cisco Systems

Vita

 

 

Top

Date

May 25, 2005

Topic

Best of ARS (Applied Reliability Symposium)

Abstract

The meeting will feature a review of papers and presentations given at the recently held ARS in San Diego. The meeting will be moderated by Mike Silverman and Fred Schenkelberg, who will also share the presentation they made at ARS. To kick off the meeting, we will be treated to a presentation given at ARS by Dr. David Trindade. This will be followed by discussion of other papers from the Symposium.

·  Simple Plots for Monitoring Field Reliability, by David

Computer servers are repairable systems consisting of processing hardware, operating systems and application software, storage and a variety of third party subsystems. Monitoring the field reliability of a complex system does not require complex reliability models. Reliability practitioners often attempt to model Non Homogeneous Poisson Processes with complex parameter estimation techniques. However, management as well as support engineers who maintain the system are easily intimidated by NHPP techniques. Methods based on the mean cumulative function (MCF) are simple and easily understood by many. Called time dependent reliability (TDR) at Sun Microsystems, such methods have been successfully used to estimate and monitor the reliability of repairable systems such as computer servers and storage arrays.

·  Reliability Integration Across the Product Life Cycle, by Mike & Fred

Good engineers naturally consider reliability aspects of product design and manufacturing. Management teams typically fully specify cost, performance and time-to-market criteria, building a fairly complete product development and manufacturing plan. However, the reliability aspects of this plan are usually very short, incomplete and not tailored to the specific product. Having well-defined goals with appropriate metrics is an important first step to achieve overall product and reliability objectives.

Once our goals are defined, we must then choose a set of reliability tools and techniques to achieve our goals. Choosing the appropriate reliability tools and techniques involves understanding the basic market and technology driven reliability constraints, along with an appreciation of the benefits of a wide range of reliability tools and techniques.

Because each product and company is different, reliability programs must be tailored to each situation. Generally speaking, there are two approaches to achieving product reliability objectives: 1) Accelerated Techniques such as HALT & HASS; and 2) Classical Techniques such as Predictions and Verifications. A good reliability program requires a balance between the two approaches and a subset of reliability tools and techniques from each.

Next we must seamlessly and cohesively integrate all of the reliability tools and techniques together that we have chosen to maximize reliability at the lowest possible cost.

This presentation provides an outline of how to quickly assess your product's situation, define your reliability goals, narrow down the appropriate reliability tools and techniques, and then integrate these together in your reliability program.

·  The meeting will also talk about other papers of interest given at ARS.
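For readers new to the mean cumulative function (MCF) mentioned in the first talk above, here is a minimal sketch of a nonparametric MCF estimate for a fleet of repairable systems. It is not the Sun Microsystems TDR implementation; it assumes every system is observed from age zero up to its own end-of-observation age, and the function name and fleet data are hypothetical.

```python
def mean_cumulative_function(systems):
    """Estimate the MCF for repairable systems.

    systems: list of (observation_end_age, [failure_ages]) tuples, with all
    systems assumed to be observed from age 0.
    Returns a list of (age, mcf) steps: at each failure age the MCF rises by
    1 / (number of systems still under observation at that age).
    """
    ends = [end for end, _ in systems]
    failures = sorted(t for end, ages in systems for t in ages if t <= end)
    mcf = 0.0
    steps = []
    for t in failures:
        at_risk = sum(1 for end in ends if end >= t)
        mcf += 1.0 / at_risk
        steps.append((t, mcf))
    return steps


# Hypothetical fleet of three servers; ages are in days.
fleet = [
    (100, [50]),
    (200, [150]),
    (300, [60, 250]),
]
steps = mean_cumulative_function(fleet)
for age, value in steps:
    print(f"age {age:>3} days: MCF = {value:.3f}")
```

Plotting these steps against age gives the average number of repairs per system over time, with no distribution fitting required.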

Speaker

David Trindade, Mike Silverman, Fred Schenkelberg

Vita

·  Dr. David Trindade, Distinguished Engineer, Sun Microsystems

Formerly he was a Senior Fellow at AMD. His fields of expertise include reliability, statistical analysis, and modeling of components, systems, and software, applied statistics, especially design of experiments (DOE), and statistical process control (SPC). He is co-author (with Dr. Paul Tobias) of the book Applied Reliability, 2nd ed., published in 1995. He has authored many papers and presented at many international conferences. He has a BS in physics, an MS in material sciences and semiconductor physics, an MS in statistics, and a Ph.D. in mechanical engineering and statistics. He has been an adjunct lecturer at the University of Vermont and Santa Clara University.

·  Mike Silverman

Mike is founder and managing partner at Ops A La Carte, a Professional Business Operations Company that offers a broad array of expert services in support of new product development and production initiatives. The primary set of services currently being offered are in the area of reliability. Through Ops A La Carte, Mike has had extensive experience as a consultant to high-tech companies, and has consulted for over 200 companies including Cisco, Ciena, Apple, Siemens, Intuitive Surgical, Abbott Labs, and Applied Materials. He has consulted in a variety of different industries including telecommunications, networking, medical, semiconductor, semiconductor equipment, consumer electronics, and defense electronics. Mike has 20 years of reliability, quality, and compliance experience, the majority in start-up companies. He is also an expert in accelerated reliability techniques, including HALT and HASS. He set up and ran an accelerated reliability test lab for 5 years, testing over 300 products for 100 companies in 40 different industries. Mike has authored and published 7 papers on reliability techniques and has presented these around the world including China, Germany, and Canada. He has also developed and currently teaches 8 courses on reliability techniques. Mike has a BS degree in Electrical and Computer Engineering from the University of Colorado at Boulder, and is both a Certified Reliability Engineer and a course instructor through the American Society for Quality (ASQ), IEEE, Effective Training Associates, and Hobbs Engineering. Mike is a member of ASQ, IEEE, SME, ASME, PATCA, and IEEE Consulting Society and currently the IEEE Reliability Society Santa Clara Valley Chapter President.

·  Fred Schenkelberg

Fred is a Senior Reliability Engineering Consultant at Ops A La Carte. He is currently working with clients using reliability assessments as a starting point to develop and execute detailed reliability plans and programs. Also, he exercises his reliability engineering and statistical knowledge to design and conduct accelerated life tests. Fred joined HP in February 1996 in Vancouver, WA. He joined ESTC, Palo Alto, CA, in January 1998 and co-founded the HP Product Reliability Team. He was responsible for the community building, consulting and training aspects of the Product Reliability Program. He was also responsible for research and development on selected product reliability management topics. Prior to joining ESTC, he worked as a design for manufacturing engineer on DeskJet printers. Before HP he worked with Raychem Corporation in various positions, including research and development of accelerated life testing of polymer based heating cables. He has a Bachelor of Science in Physics from the United States Military Academy and a Master of Science in Statistics from Stanford University. Fred is an active member of the RAMS Management Committee and currently the IEEE Reliability Society Santa Clara Valley Chapter Vice President.

 

Top

 

Date

April 27, 2005

Topic

When to use HALT and when to use ALT

Abstract

Highly Accelerated Life Testing (HALT) is a great reliability technique to use for finding predominant failure mechanisms in a product or system. However, in many cases, the predominant failure mechanism is wearout. When this is the situation, we must be able to predict or characterize this wearout mechanism to assure that it occurs outside customer expectations and outside the warranty period. The best technique to use for this is Accelerated Life Testing (ALT). In this presentation, we shall look at when to use HALT and when to use ALT. We will also look at some case studies and examples on how we can use the techniques of ALT to find and measure wearout mechanisms.

Speaker

Mike Silverman, Managing Partner, Ops A La Carte LLC

Vita

Mr. Mike Silverman is Managing Partner of Ops A La Carte LLC, a Professional Consulting Firm focused on Reliability Engineering Services, Reliability Management, and Reliability Education to assist companies in developing and executing any and all elements of Reliability throughout an Organization and their Product's Life Cycle. He has over 20 years experience in reliability engineering, reliability management and reliability training. He is an experienced leader in reliability improvement through analysis and testing. Mike is also an expert in accelerated reliability techniques, including HALT and HASS. He set up and ran an accelerated reliability test lab for 5 years, testing over 300 products for 100 companies in 40 different industries. Through Ops A La Carte, Mike has had extensive experience as a consultant to high-tech companies, and has consulted for over 200 companies including Cisco, Ciena, Siemens, Intuitive Surgical, AeroGen, Abbott Labs, and Applied Materials. Mike has authored and published 7 papers on reliability techniques and has presented these around the world including China, Germany, and Canada. He has also developed and currently teaches a number of courses on reliability techniques. Mike has a BS degree in Electrical and Computer Engineering from the University of Colorado at Boulder, and is a Certified Reliability Engineer (CRE) through American Society for Quality (ASQ). Mike is a member of ASQ, IEEE, SME, ASME, PATCA, ASPMFG, and IEEE Consulting Society and is currently the IEEE Reliability Society Santa Clara Valley Chapter President.

 

Top

 

Date

March 23, 2005

Topic

How cosmic rays cause computer downtime

Abstract

Radiation-initiated, single-bit, unrepeatable data faults, known as soft errors, first became important to IC manufacturers and users in the late 1970s. Soft errors in DRAM chips were traced to alpha particles emitted by the radioactive impurities in the packaging material. The result was a major change in packaging materials. Speculation that these faults could be caused by cosmic radiation proved to be premature. It was later verified that cosmic radiation also produced some soft errors. At that time, the rate was low enough that corrective action such as parity and ECC protection was needed only in very large memory arrays and systems which needed very high reliability. As the evolution of the IC business has resulted in an explosion of memory use, many more users and designers must now contend with soft error possibilities in their designs. High reliability systems must now add significantly more error prevention, detection, and correction schemes to prevent radiation-caused errors from causing data errors or system malfunctions. More circuitry is showing radiation sensitivity at each new IC process node. It is expected that significant sections of logic will also need soft error protection soon.

This talk will investigate the mechanism of radiation caused soft errors in integrated circuits, the changing conditions with semiconductor materials and process advances, and the likely changes coming in the next few years.

Please find a picture (3machineinbeam) and a figure (ser-pnpn) from Ray's talk.

Speaker

Ray Heald, Sun Microsystems

Vita

Ray Heald is a distinguished engineer and technical lead for the global SRAM design group at Sun Microsystems, Sunnyvale, CA. He has been involved with the UltraSPARC family of processors for over 10 years, defining the circuit design starting points for RAM blocks and advising on other phases of the physical design. Prior to joining Sun, Ray designed RAM blocks and other circuitry for the Clipper family of microprocessors at Fairchild and Intergraph. Ray received the B.S., M.S., and Ph.D. degrees in electrical engineering from the University of California, Berkeley.

 

Top

 

Date

Feb. 23, 2005

Topic

Best of RAMS

Abstract

The 2005 Annual Reliability and Maintainability Symposium (RAMS) was held January 24-27 in Alexandria, Virginia. The theme of this year's RAMS is "Improving Products and Processes Through Education". Information on RAMS is available on the web at http://www.rams.org/.

The February Santa Clara Valley IEEE Reliability Society meeting will feature a panel discussion of selected papers from RAMS.

The panel is being organized by Alan Wood. We are looking for additional panel members, especially paper authors or RAMS attendees. If you are interested in helping select papers, being on the panel, leading a discussion, or contributing in another way, please e-mail us at reliability@ieee.org.

Speaker

Alan Wood

Vita

.

 

Top

 

Date

Jan. 26, 2005

Topic

Best of ISTFA

Abstract

The International Symposium for Testing and Failure Analysis (ISTFA) provides a forum for the latest developments in wafer, chip, package, and board level test and failure analysis. The 30th ISTFA was held November 14-18, 2004, in Worcester (Boston). Information on ISTFA is available on the web at http://www.asminternational.org/ms/electronicdevicefailureanalysissociety/istfa/home.htm. The January Santa Clara Valley IEEE Reliability Society meeting will feature a panel discussion of selected papers from ISTFA. The panel is being organized by Art Rawers. We are looking for additional panel members, especially ISTFA attendees. If you are interested in helping select papers, being on the panel, leading a discussion, or contributing in another way, please e-mail us at reliability@ieee.org.

Speaker

Panel

Vita

 

 

Top

 

Date

Oct. 27, 2004

Topic

Reliability Horror Stories

Abstract

If you liked The Exorcist (1, 2 or 3), you will love this presentation! Jurek Zarzycki will describe some of the truly ghoulish reliability horror stories he has experienced during his career at Apple and elsewhere. Then, it's audience participation time! Each participant gets 5 minutes to present their own horror stories. We will have the audience vote on the most horrific stories and award prizes. Bonus points for coming in costume. If you have a scary story you would like to share (names can be changed to protect the guilty), please RSVP by sending e-mail to reliability@ieee.org to make sure we have sufficient time for everyone. Drop-in participants are welcome.

Speaker

Jurek Zarzycki

Vita

Jurek has over 25 years of experience as a reliability engineer and quality engineer with various high tech companies, including Tektronix, Varian, Diasonics, Radionics and the last 17 years with Apple Computer. He has extensive experience in high technology products and overseas quality/reliability assurance. He was trained as a design engineer in the Nuclear Technical School (Poland) and is a certified Reliability Engineer (CRE) and Quality Engineer (CQE) by the American Society for Quality and is an instructor in the areas of reliability and quality engineering.

 

Top

 

Date

Sept. 29, 2004

Topic

Are You Analyzing Reliability Data Correctly? Repairable Vs. Non-Repairable Systems: There Is a Difference

Abstract

We are all familiar with repairable systems: computers, servers, automobiles, TVs, circuit boards, production equipment, software programs, and so on. If we wanted to model the reliability behavior of such systems, we’d be surprised to discover that the reliability literature focuses mainly on models for the analysis of non-repairable systems. What are the major distinctions between repairable and non-repairable systems? When is MTBF justified for modeling? How could we get into trouble and produce misleading results by applying techniques for the analysis of non-repairable systems to data for repairable systems? In this talk we’ll illustrate some pitfalls involving MTBFs and the misapplication of techniques such as probability plotting to repairable system analysis. We’ll show some simple graphical and analytical techniques that can be very effective for the analysis and modeling of time dependent repairable system data.
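
As an illustration of the kind of simple analytical check that applies to repairable-system data, the sketch below computes the Laplace trend statistic. This is a standard textbook technique chosen here for illustration; the abstract does not say which techniques the talk covers. The function name and the sample failure times are hypothetical. A statistic near zero is consistent with a constant rate of occurrence of failures, the situation in which a single MTBF is a reasonable summary. A strongly negative value suggests failures are becoming less frequent, and a strongly positive value suggests they are becoming more frequent.

```python
import math


def laplace_trend_statistic(failure_times, observation_end):
    """Laplace trend statistic for one repairable system observed on (0, observation_end].

    failure_times are cumulative system ages at each failure (not times between
    failures). Under a homogeneous Poisson process the statistic is approximately
    standard normal.
    """
    n = len(failure_times)
    if n == 0:
        raise ValueError("at least one failure time is required")
    mean_failure_time = sum(failure_times) / n
    midpoint = observation_end / 2.0
    return (mean_failure_time - midpoint) / (observation_end * math.sqrt(1.0 / (12.0 * n)))


# Hypothetical data: system ages (hours) at failure, observed for 100 hours.
no_trend = laplace_trend_statistic([25.0, 50.0, 75.0], 100.0)
improving = laplace_trend_statistic([5.0, 10.0, 20.0], 100.0)
deteriorating = laplace_trend_statistic([80.0, 90.0, 95.0], 100.0)
```

Comparing the statistic against standard normal quantiles (for example, beyond about 1.96 in magnitude at the 5% level) gives a quick indication of whether the times between failures are trending before any MTBF is quoted.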

Speaker

David Trindade

Vita

Dr. David Trindade is a Distinguished Engineer in the Customer Advocates for Reliability (CSCARE) Office of Sun Microsystems, Inc. His previous positions include: Senior Director of Software Quality at Phoenix Technologies, Director of Reliability, Director of Applied Statistics, and the first Senior Fellow at AMD, Worldwide Director of Quality and Reliability at General Instrument, and Advisory Engineer at IBM. His fields of expertise include: statistical analysis and modeling of software, component, and system reliability, and applied statistics, especially design of experiments (DOE) and statistical process control (SPC). He is co-author (with Dr. Paul Tobias) of the book Applied Reliability, 2nd ed., published in 1995. He has been an adjunct faculty member in the Department of Applied Mathematics at Santa Clara University. He has a BS in Physics, an MS in Statistics, an MS in Material Sciences and Semiconductor Physics, and a Ph.D. in Mechanical Engineering and Statistics.

 

Top

 

Date

June 23, 2004

Topic

Design and Analysis of Accelerated Reliability Tests

Abstract

This presentation describes piecewise linear failure rate functions and gives their reliability, infant mortality, and MTBF. The piecewise linear model resembles the left hand end of the bathtub curve, which is all that is observed of reliable product demonstration tests, even accelerated. The piecewise linear failure rate represents infant mortality and provides enough information to estimate MTBF as well as age-specific reliability during useful lifetime even with limited test time and few failures. The presentation proposes acceleration alternatives, including one that accelerates tests greatly, continuously increasing acceleration. The presentation gives experimental designs and statistical analysis of test data, assuming the piecewise linear failure rate function and power law acceleration.

Send your test data to pstlarry@comcast.net, and he'll analyze it and present the resulting model parameters, reliability, and MTBF estimates, if he can, free of charge.

Speaker

Larry George

Vita

Larry George is a Certified Reliability Engineer and Fellow of the American Society for Quality. His education includes B.S. in Engineering, M.B.A., and M.S. and Ph.D. in industrial engineering and operations research with a minor in probability and statistics from the University of California at Berkeley. He taught for 11 years; worked for 11 years at Lawrence Livermore National Laboratory; and has worked in the real world for more than 20 years. His reliability experience comes from the communications, computers, electronics, medical, power, security, space, semiconductor, sensors, and transportation sectors. He would like to thank those who inspired this presentation by their MTBF demonstration plans with too few samples, too short test times, zero failures, and LCLs on MTBF. Larry likes the challenge of learning everything useful from available data, without unwarranted assumptions.

 

Top

 

Date

May 26, 2004

Topic

Cisco's High Level Failure Analysis Process

Abstract

This presentation will review Cisco's High Level Failure Analysis Process, which describes the logistics, resources, and techniques used to provide failure information back to customers and corrective action opportunities. The failure analysis process detailing the customer return process, failure analysis process, and the data storage and reporting process used by Cisco will be discussed.

Speaker

Dennis Pachucki, Cisco

Vita

Dennis Pachucki has many years of experience in various fields of the electronics industry. These include Satellite Communications, R&D on Super Computers, Mil Spec Computer Manufacturing, New Product Introduction for Semi-Conductor Equipment, Reliability and Quality for Sun Microsystems Servers, and presently Failure Analysis for Cisco Systems Routers and Switches. He has over 10 years of experience in HALT and HASS, has conducted experiments, managed a stress-screening lab, and written papers and given presentations on these accelerated testing techniques. He has applied for 2 patents and has a BS in Industrial Engineering and Electronics Technology. He is an active committee member for the IEEE Accelerated Stress Test Organization, which organizes annual workshops and promotes stress testing techniques and data sharing.

 

Top

 

Date

April 28, 2004

Topic

A Reliability Engineer's Use of Warranty Cost Information

Abstract

This presentation discusses HP's approach to using warranty costs as a mechanism to systematically evaluate reliability's influence on total product cost. It also outlines the challenges of establishing and maintaining a global, company-wide perspective and appropriate core competences in an increasingly outsourced and commoditized electronics industry.

Speaker

Fred Schenkelberg, HP

Vita

Mr. Schenkelberg joined HP in February 1996 in Vancouver, WA. He joined ESTC, Palo Alto, CA, in January 1998 and currently is a member of the Product Reliability Team. He is responsible for the community building, consulting and training aspects of the Product Reliability Program. He is also responsible for research and development on selected product reliability management topics. Prior to joining ESTC, he worked as a design for manufacturing engineer on DeskJet printers. Before HP he worked with Raychem Corporation in various positions, including research and development of accelerated life testing of polymer based heating cables. He has a Bachelor of Science in Physics from the United States Military Academy ('83) and a Master of Science in Statistics from Stanford University ('97). He is the current chair of IEEE Reliability SCV Chapter.

 

Top

 

Date

March 31, 2004

Topic

Moving from ORT to HASA

Abstract

On-Going Reliability Testing (ORT) is currently one of the tools of choice for reliability engineers in the manufacturing phase of a product's life cycle. But how was it chosen, and is it really an effective tool? These days, most companies are finding that tools such as ORT and burn-in, once effective, are becoming much less effective as component failure rates decrease and manufacturing variability increases.

Highly Accelerated Stress Auditing (HASA) is a better alternative for three reasons:  1) the stresses being applied have been tailored specifically for the product; 2) the stresses are accelerated enough so that information learned is timely and can affect current shipments; and 3) the audit program is being measured and adjusted to fit the current defect rate for the product.

OUTLINE

·  What is ORT?

·  Why is ORT currently being used?

·  Is ORT working for anyone?

·  What is HASA?

·  How to move from ORT to HASA and achieve much better results

·  Success stories

Speaker

Mike Silverman

Vita

Mike is an experienced leader in reliability improvement through analysis and testing.  He has also led numerous quality system development programs.  He has 20 years of reliability and quality experience, the majority in start-up companies.  Mike is also an expert in accelerated reliability techniques, including HALT and HASS.  He set up and ran an accelerated reliability test lab for 5 years, testing over 300 products for 100 companies in 40 different industries.  Mike is founder and managing partner at Ops A La Carte, a Professional Business Operations Company that offers a broad array of expert services in support of new product development and production initiatives.  Through Ops A La Carte, Mike has had extensive experience as a consultant to high-tech companies, and has consulted for over 100 companies including Cisco, Ciena, Siemens, Intuitive Surgical, Abbott Labs, and Applied Materials.  He has consulted in a variety of different industries including telecommunications, networking, medical, semiconductor, semiconductor equipment, consumer electronics, and defense electronics.  Mike has authored and published 7 papers on reliability techniques and has presented these around the world including China, Germany, and Canada.  He has also developed and currently teaches 8 courses on reliability techniques.  Mike has a BS degree in Electrical and Computer Engineering from the University of Colorado at Boulder, and is both a Certified Reliability Engineer and a course instructor through the American Society for Quality (ASQ), IEEE, Effective Training Associates, and Hobbs Engineering.  Mike is a member of ASQ, IEEE, SME, ASME, PATCA, and IEEE Consulting Society.

 

Top

 

Date

February 25, 2004

Topic

The Best of RAMS

Abstract

The 2004 Annual Reliability and Maintainability Symposium (RAMS) will be held January 26-29, in Los Angeles. The theme of this year's RAMS is the challenge of emerging technologies. Information on RAMS is available on the web at http://www.rams.org/. The February Santa Clara Valley IEEE Reliability Society meeting will feature a panel discussion of selected papers from RAMS. The panel is being organized by Fred Schenkelberg. We are looking for additional panel members, especially paper authors or RAMS attendees. If you are interested in helping select papers, being on the panel, leading a discussion, or contributing in another way, please e-mail us at reliability@ieee.org.

Speaker

Panel

Vita

 

 

Top

Date

January 28, 2004

Topic

The Best of ISTFA

Abstract

The International Symposium for Testing and Failure Analysis (ISTFA) provides a forum for the latest developments in wafer, chip, package, and board level test and failure analysis. The 29th ISTFA was held November 2-6, 2003, in Santa Clara. Information on ISTFA is available on the web at http://www.asm-intl.org/istfa/home.htm. The January Santa Clara Valley IEEE Reliability Society meeting will feature a panel discussion of selected papers from ISTFA. The panel is being organized by Art Rawers. We are looking for additional panel members, especially ISTFA attendees. If you are interested in helping select papers, being on the panel, leading a discussion, or contributing in another way, please e-mail us at reliability@ieee.org.

Speaker

Panel

Vita

 

 

Top

Date

October 29, 2003

Topic

Power Supply Reliability – an Oxymoron?

Abstract

There are many pitfalls in obtaining a power supply with good reliability. Dave Christiansen and Brooks Leman will discuss the important elements in power supply reliability and some ways to avoid the pitfalls. Topics covered include:

  • Is there such a thing as a million hour MTBF power supply?
  • How do you pick a good supplier and what do you do once you have one?
  • MTBF calculations and specification
  • Stress analysis
  • Evaluating transient conditions, e.g., power on-off cycles
  • Testing, including power supply dynamics and redundancy (if it exists)

Speaker

Dave Christiansen and Brooks Leman

Vita

David Christiansen is a member of the Reliability Engineering Department for Hewlett-Packard's NonStop Enterprise Division. He comes with 40 years experience in the computer industry, and has been with the HP reliability department since 1989. He has been instrumental in developing prediction methods, data retrieval and reporting techniques, and corrective action processes. He received his BSEE and MSEE from the University of Wisconsin, and MSQA from San Jose State University. He is a Life Member of IEEE.

Brooks Leman is a power supply engineer at Fyre Storm. He has worked in a variety of power management and engineering positions at  Rolm, Power Integrations, and Hewlett-Packard's NonStop Enterprise Division. He has over 20 years experience in power electronics in the military, server, and consumer electronics industries and has taught a graduate power electronics course. He has a BSEE and MSEE from Santa Clara. He is the Vice Chair of the SCV Chapter, IEEE Power Electronics Society.

 

Top

Date

September 24, 2003

Topic

To CRE or not CRE?

Abstract

The Certified Reliability Engineer (CRE) certification, like all other certifications offered through the American Society for Quality (ASQ), is not mandatory like its Professional Engineer (PE) counterpart, and is generally not regarded as highly as post-graduate work. Yet many engineers are finding that having "CRE" next to their name is the difference between getting the job and not. The fact is that an engineer who obtains the skills and knowledge required to pass the CRE exam truly is more valuable to a company. But is a CRE certificate difficult to obtain? Not any more!

This presentation will describe the value of the CRE certification and how to go about getting certified, including the best material to study, problems to work, and preparation methods to use. The subjects covered are:

  • Preparation for Exam
  • Material
  • Sample problems
  • Exam itself
  • Re-certification every 3 years

Speaker

Mike Silverman

Vita

Mike Silverman is an experienced leader in reliability improvement through analysis and testing. He has also led numerous quality system development programs and compliance programs, including Safety and EMC. He has 18 years of reliability, quality, and compliance experience, the majority in start-up companies. Mike is also an expert in accelerated reliability techniques, including HALT and HASS. He set up and ran an accelerated reliability test lab for 5 years, testing over 300 products for 100 companies in 40 different industries. Mike is co-founder and managing partner at Ops A La Carte, a Professional Business Operations Company that offers a broad array of expert services in support of new product development and production initiatives. Through Ops A La Carte, Mike has had extensive experience as a consultant to high-tech companies, and has consulted for over 50 companies including Cisco, Ciena, Intuitive Surgical, AeroGen, and Brooks-PRI Automation. He has consulted in a variety of different industries including telecommunications, networking, medical, semiconductor, semiconductor equipment, consumer electronics, and defense electronics. Mike has authored and published 7 papers on reliability techniques and has presented these around the world including China, Germany, and Canada. He has also developed and currently teaches 5 courses on reliability techniques. Mike has a BS degree from the University of Colorado at Boulder, and is both a Certified Reliability Engineer and a course instructor through the American Society for Quality (ASQ), IEEE, Effective Training Associates, and Hobbs Engineering.

 

Top

Date

August 27, 2003

Topic

How to make a CFO care about Reliability

Abstract

CFOs don’t care about MTBFs or failure rates or even directly about quality and customer satisfaction. Their job is to worry about revenue, profit, ROI, and so forth. If you want to convince a CFO or any other senior executive to spend money on reliability improvement activities, you need to prove that the activities will save money. That is what life-cycle cost models are all about. Many companies focus on development and production costs and forget about support costs. A life-cycle cost model takes into account all product costs, including service/support costs.

Alan Wood will discuss how to create a life-cycle cost model and demonstrate how reliability improvement activities can save money. The service portion of the model includes the cost of service personnel, inventory, repair, shipping and handling, and other miscellaneous charges, all discounted back to present value. Examples will be presented to show how to use the model and how to convert reliability into metrics such as payback period that the CFO can understand.
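
The arithmetic behind such a model can be sketched in a few lines. The example below is a minimal illustration, not the model presented in the talk: the function names, the end-of-year cash-flow convention, the 10% discount rate, and all dollar figures are hypothetical assumptions. It discounts a stream of annual service-cost savings back to present value and computes a simple (undiscounted) payback period for a reliability investment.

```python
def present_value(annual_amount, discount_rate, years):
    """Present value of an amount received at the end of each year for `years` years."""
    return sum(annual_amount / (1.0 + discount_rate) ** t for t in range(1, years + 1))


def payback_period_years(investment, annual_savings):
    """Simple payback period: years of savings needed to recover the investment."""
    if annual_savings <= 0:
        raise ValueError("annual savings must be positive for a payback period to exist")
    return investment / annual_savings


# Hypothetical figures, for illustration only.
baseline_service_cost = 500_000.0   # annual service/support cost before improvement
improved_service_cost = 350_000.0   # annual service/support cost after improvement
investment = 300_000.0              # one-time cost of the reliability activity
discount_rate = 0.10                # assumed cost of capital
product_life_years = 5

annual_savings = baseline_service_cost - improved_service_cost
savings_pv = present_value(annual_savings, discount_rate, product_life_years)
net_benefit = savings_pv - investment
payback = payback_period_years(investment, annual_savings)
```

With these assumed inputs, the discounted savings exceed the investment and the payback period is two years, which is the kind of figure a finance audience can act on more readily than an MTBF.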

Speaker

Alan Wood

Vita

Alan Wood is the director of the Reliability Engineering Department for Hewlett-Packard’s NonStop Enterprise Division. His primary research interests are software reliability and computer systems availability. He has published over 30 papers on reliability, including a paper on software reliability appearing in the August, 2003 issue of IEEE Computer. He received a PhD in operations research from Stanford University and is a member of the IEEE, Informs, and the IEEE 1413 Working Group. He is the current Chairman of the IEEE Reliability SCV Chapter.

 

Top

Date

June 25, 2003

Topic

SoC Defect Testing

Abstract

The technology feature size is decreasing so rapidly that a technology process becomes obsolete before maturing. Because of this rapid change, the focus in testing is more on defect coverage than on fault coverage. Furthermore, as ICs become larger by building a whole system on a chip (SoC), the volume of test data as well as the testing time have increased considerably. To relieve these problems, new directions in testing are used. We will review a few of the latest techniques published in ITC-2002. These include RTL-based ATPG, deterministic BIST, and power-sensitive test pattern generation. Both theoretical foundations and practical industrial considerations will be presented.

Speaker

Samiha Mourad, Yacoub Elziq

Vita

Prof. Samiha Mourad and Dr. Yacoub Elziq are with Santa Clara University.

 

Top

Date

May 28, 2003

Topic

Reliability of IC Packaging

Abstract

IC packaging technology has evolved rapidly in the last several years with packages getting both smaller and ever more complex. Chip scale packaging, multi chip packages, stacked packing and stacked packages along with mixtures of discrete and active devices have radically changed the packaging landscape and made the challenge of assuring reliability more demanding.
This presentation will review some of the many packaging options that have been introduced over the last few years and describe how they are meeting the reliability challenge. Included will be a look at how reliability is being redefined in terms of the product and an examination of the rationale. The presentation will also briefly examine some of the challenges of pending lead-free soldering requirements, which pose significant risk to both electronic product performance and reliability, while increasing cost and potentially degrading rather than improving the environment.

Speaker

Joseph Fjelstad

Vita

Joseph Fjelstad is a co-founder of Silicon Pipe, a Silicon Valley start-up company working to develop and provide unique electronic interconnection solutions to the electronics industry. He has been involved in the electronics interconnection industry for over 30 years in the development of manufacturing technologies for rigid and flexible PCBs and IC packaging, and holds more than 75 US patents in the field. He has published numerous technical articles and several books on electronics manufacturing and interconnection technologies, and is a monthly columnist for Circuitree and Global SMT and Packaging magazines.

 

Top

Date

April 30, 2003

Topic

Reliability Evolution through Product Lifecycle Phases

Abstract

As a product passes through its development, prototype, and pilot lifecycle phases, and enters its production lifecycle phase, it improves in its architecture, system, material, workmanship, and focus. In turn, its reliability keeps on improving. After the product has passed through the initial part of the production lifecycle phase, the company finds or feels that the product needs to be made cheaper and more attractive. Cheaper and less reliable components and processes are introduced, and specification margins are reduced. This leads to a gradual or sudden deterioration of reliability and end of life of the product. For a product to deliver more profits to the manufacturer over its life, the production lifecycle phase should stay longer and profitable. For the production lifecycle phase to stay longer and profitable, the product’s reliability should never be compromised. For maintaining the product’s reliability, inferior components or processes should not be introduced, and specification margins should not be reduced too much. Instead, the product can be made cheaper by strategic modifications in components, processes, and specifications of the product. The evolution of reliability of a product significantly contributes to the reliability of the entire industry of such products. Just as the product reliability is defined on the basis of statistics of life durations of various instances of that product, the product industry reliability is directly linked to the statistics of life durations of products within that industry. Proper understanding of reliability of a product is important for a company to market it and for a customer to use it. Similarly, proper understanding of evolution of reliability of a product through its lifecycle phases is important for an investor to invest and an entrepreneur to do the business.

Speaker

Lalit A Patel

Vita

Lalit A. Patel offers information engineering and management related services, and develops energy and vision related products, through Revela Systems. He has a broad experience of working with different organizations including Indian Institute of Technology at Delhi, Advanced Fibre Communications, Caspian Networks, and at different levels including lecturer, general manager, engineer, and advisor. He has worked in different sectors including nuclear fusion research, heavy industrial machinery, solar energy systems, and telecommunication equipment. He can visualize and resolve problems stereoscopically through technical and management perspectives. He earned his MS and PhD in physics from IIT-D and MBA from Ahmedabad.

 

Top

Date

March 26, 2003

Topic

Server Class Disk Drives: How Reliable are They?

Abstract

There is great debate over the true reliability of disk drives for server-class data storage. Manufacturers tout MTBFs of greater than 1.2 million hours. However, critical analyses of field data show some very interesting trends that often don't support the manufacturers' reliability claims. Network Appliance has a superb system of data collection and analysis that allows them to answer some significant statistical questions that other integrators must assume. For example, does the failure rate of a given population change over time? (Do failures follow an exponential, log-normal or Weibull distribution?) Are drives produced early in the production cycle likely to be more or less reliable than those produced later? (Are later vintage drives likely to be more reliable than early vintage drives?) These all lead to the question of how drive reliability should be specified. This presentation by Jon Elerath, Network Appliance Inc., will discuss the method of data collection and analysis, distribution fits, failure rate trends, vintage analyses and specifications. You will be able to see for yourself if the manufacturers are living up to their claims.
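
To put the quoted MTBF figure in more tangible terms, the sketch below converts an MTBF into an annualized failure rate. It assumes a constant failure rate (exponential distribution) and continuous 24x7 operation, which is exactly the assumption the abstract says field data may not support, so treat it as a reading of the manufacturer's claim rather than a prediction. The function name and the fleet size of 1,000 drives are hypothetical.

```python
import math

HOURS_PER_YEAR = 8760  # 24 hours x 365 days, assuming continuous operation


def annualized_failure_rate(mtbf_hours, hours=HOURS_PER_YEAR):
    """Probability that a unit fails within `hours`, assuming a constant failure rate."""
    if mtbf_hours <= 0:
        raise ValueError("MTBF must be positive")
    return 1.0 - math.exp(-hours / mtbf_hours)


claimed_mtbf_hours = 1.2e6
afr = annualized_failure_rate(claimed_mtbf_hours)

# Expected first-year failures in a hypothetical fleet of 1,000 drives.
fleet_size = 1000
expected_failures = fleet_size * afr
```

Under these assumptions a 1.2 million hour MTBF corresponds to a little over 0.7% of drives failing per year, so observed field rates well above that level would indicate that the claim or the constant-rate assumption does not hold.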

Speaker

Jon Elerath

Vita

Jon Elerath has a BSME and MS in reliability engineering from the University of Arizona. He has 27 years experience in reliability engineering and engineering management in areas of nuclear safety systems (General Electric Co.), plasma etching equipment for semiconductor manufacturing (Tegal, a subsidiary of Motorola), fault tolerant computers (Tandem and Compaq), hard disk drives (IBM) and storage systems (Network Appliance). He is currently the manager of reliability engineering for Network Appliance Inc., a world leader in network attached storage systems. He has participated in all aspects of reliability in a commercial environment, including developing overall reliability programs, specifications, predictions, trade-off analyses, testing and data collection and analysis. Jon has chaired the Redwood Empire Section of ASQ, the Reliability Committee of the International Disk Drive Equipment Materials Association (IDEMA) and has contributed greatly to the development of the IEEE 1413 standard for reliability predictions. He has published 18 papers in the area of reliability and reliability modeling.

 

Date

March 7, 2007

Topic

Best of RAMS

Abstract

The 2003 Annual Reliability and Maintainability Symposium (RAMS) was held January 27-30, in Tampa, Florida. The theme of this year's RAMS is transforming technologies for reliability and maintainability engineering. Information on RAMS is available on the web at http://www.rams.org/ . The February Santa Clara Valley IEEE Reliability Society meeting will feature a panel discussion of selected papers from RAMS. The panel is being organized by Fred Schenkelberg.

Speaker

Panel

Vita

n/a

 

Date

January 29, 2003

Topic

Best of ISTFA

Abstract

The International Symposium for Testing and Failure Analysis (ISTFA) provides a forum for the latest developments in wafer, chip, package, and board level test and failure analysis. The 28th ISTFA was held November 3-7, 2002, in Phoenix, Arizona. Information on ISTFA is available on the web at http://www.asm-intl.org/istfa/ . The January Santa Clara Valley IEEE Reliability Society meeting will feature a panel discussion of selected papers from ISTFA. The panel is being organized by Art Rawers and Don Staab. We are looking for additional panel members, especially ISTFA attendees.

Speaker

Panel

Vita

n/a

 

Date

January 28-29, 2003

Topic

Reliability Concepts and Practices

Abstract

 

Speaker

Mike Silverman

Vita

 

 

Date

October 30, 2002

Topic

Accelerated Testing as Part of a Traditional Reliability Program

Abstract

In an effort to develop effective reliability programs with Highly Accelerated Life Testing (HALT) and Highly Accelerated Stress Screening (HASS) as the cornerstone, many engineers are forgetting some of the basic building block analyses for HALT and HASS. This course offers an understanding of several key analytical tools used in conjunction with HALT: Reliability Predictions, Failure Modes, Effects, and Criticality Analysis (FMECA), Fault Tree Analysis (FTA), and other tools. This course will look at how each contributes to learning about the reliability of a product and how each can help during the planning process of a HALT.

Speaker

Mike Silverman

Vita

Mike Silverman, CRE, is a Managing Partner of Ops A La Carte Consulting. He is an experienced leader in quality development, reliability testing and agency qualification programs, including programs for domestic and international Safety, EMI, and Telecommunications. He has 18 years of quality, reliability, and agency experience and 10 years of experience in accelerated reliability testing techniques (HALT and HASS). Mike also has extensive experience as a consultant to high-tech companies, and has consulted for more than 35 companies in over 10 different industry sectors. His clients range from start-up companies to Fortune 500 companies, including Cisco, Applied Materials, Lifescan, 3Com, Brocade, Solectron, and Schlumberger. He has authored, published, and presented 7 papers on reliability techniques. Mike has a BS degree from the University of Colorado at Boulder, and is both a Certified Reliability Engineer and a course instructor through the American Society for Quality (ASQ) and the Institute of Electrical and Electronics Engineers (IEEE).

 

Date

September 25, 2002

Topic

Accelerated Life Testing in Micro- and Opto-Electronics: Its Objectives, Role, Attributes, Challenges, Pitfalls, Predictive Models, and Interaction with Qualification Tests

Abstract

Accelerated life tests (ALTs) are aimed at revealing and understanding the physics of expected or observed failures, as well as at accumulating representative failure statistics. Hence, ALTs deal with both the physics and the statistics of failure. Adequately designed, carefully conducted, and properly interpreted ALTs provide a consistent basis for obtaining the ultimate information on the reliability of a product: the probability of failure after a given time of operation under given conditions. ALTs can dramatically facilitate the solutions to the problems of cost effectiveness and time-to-market. These tests can help a manufacturer to make his device a product. They should, therefore, play an important role in the evaluation, prediction and assurance of the reliability of micro- and opto-electronics components, devices and systems. In the majority of cases, ALTs should be conducted in addition to the qualification tests required by the existing standards. There might also be situations in which ALTs can be used as an effective substitute for qualification tests. Whenever possible, ALTs should be used as a consistent basis for the improvement of the existing qualification specifications. We discuss the objectives of the ALTs; their role and attributes; challenges associated with the implementation and use of the ALTs; potential pitfalls due to possible "shifts" in the mechanisms of failure as a result of ALTs; and various predictive models used to design and interpret the ALT data. Particular attention is given to the interaction of the ALTs with the qualification tests.

Speaker

E. Suhir

Vita

Dr. E. Suhir was Distinguished Member of Technical (Research) Staff, Bell Laboratories, Physical Sciences and Engineering Research Division (1984-2001) and Visiting Professor with the Hong Kong University of Science and Technology (2001-2002). He is Adjunct Professor, University of Illinois at Chicago, and University of Maryland. Dr. Suhir is a Fellow of the IEEE, ASME and the SPE (Society of Plastics Engineers). He has authored about 250 technical publications (papers, book chapters, books, patents), including the monographs "Structural Analysis of Microelectronic and Fiber Optic Systems" (Van-Nostrand, 1991) and "Applied Probability for Engineers and Scientists" (McGraw-Hill, 1997). Dr. Suhir has received many professional awards, including the 2001 IMAPS John A. Wagnon Technical Achievement Award, the 2000 IEEE-CPMT Outstanding Sustained Technical Contribution Award, the 2000 International SPE Fred O. Conley Award, and the 1999 ASME and Pi-Tau-Sigma Charles Russ Richards Memorial Award. Dr. Suhir has presented numerous invited and keynote talks at professional conferences and taught many professional development courses on various topics of materials, reliability and mechanical problems in microelectronics and photonics.

 

Date

August 28, 2002

Topic

Real-World Software Reliability Overview

Abstract

What is software reliability? Why is software reliability worth paying attention to? What are some common techniques and typical metrics tracked to ensure software reliability at Hewlett-Packard (HP)? What are some of the quality processes and models used at HP to build in software reliability along the way? These and other questions will be answered in this down-to-earth look at some of the software industry's basic techniques of dealing with software reliability. If you like technical jargon & ivory tower viewpoints, this presentation is not for you.

Speaker

Alan Padula

Vita

Alan Padula is a Senior Technology & Business Process Consultant at Hewlett-Packard (HP). He has spent his entire career of 25 years at HP. His past 10 years have been spent consulting with R&D Labs on software (SW) technology and process improvement in a variety of areas. Some of the areas he has directly consulted with clients on include: SW Quality & Testing, Engineering Management, General SW Engineering. The first 15 years of his career were spent in the R&D Labs of various Product Generation Divisions in various roles as a SW Project Manager, Program Manager, Project Lead, and Engineer. Products developed include HP SRC (revision control system), HP TOOLSET (Development Environment), HP BROWSE, HP SEARCH, & RAPID/3000 (a fourth-generation programming language development system). Alan has written technical papers & presented at various technical conferences in Montreal, Brussels, Boston, Nice, Capri, La Jolla, Lawrence Livermore Labs, and San Francisco. Alan graduated from Louisiana State University with an MS in Computer Science, a BS in Physics, & a BS in Astronomy.

 

Date

July 31, 2002

Topic

Monitoring IC Degradation Internally

Abstract

During reliability testing, internal investigations such as FIB cross-sections are generally carried out after the IC device has failed. Localization may involve emission microscopy and/or internal probing. Each failure necessitates fault localization and root-cause determination, and this will continue to be included in the reliability process. Several techniques, however, now enable pre-failure monitoring and, potentially, improved failure prediction. One has been successfully applied to ESD phenomena with the goal of monitoring a voltage/current pulse as it propagates into the device, thus allowing "weak links" in the protection circuitry to be identified. Another technique (PICA) utilizes the same hot carrier phenomenon that is responsible for many IC reliability issues. Besides damaging the gate oxide, hot carriers also produce luminescence. Although the probability of luminescence in silicon is low, this luminescence has been used for the analysis of transistor timing within the IC using PICA. This luminescence can also potentially identify design/process issues that affect reliability. Two examples will be given. The presentation will discuss these monitoring techniques and their possible applications to reliability testing.

Speaker

Ted Lundquist

Vita

Ted Lundquist has degrees in physics: BS from MIT, thesis on X-ray scattering and detection with proportional counters; MS from U of MA, paper on laser scattering from vacancies in solids; PhD from U of MD, thesis on the ionization and neutralization of sputtered atoms. Ted has more than 25 years of experience in ion optics and instrumentation: 6 years at NASA developing beam techniques to calibrate mass spectrometers; 15 years with Gatan as Development Manager for Ion Optics, developing ion beam systems including FIB systems. In 1994 Ted moved to Schlumberger Probe Systems, where he is currently Market Manager. Ted is passionate about providing productive solutions to customer problems.

 

Last Modified: 11/11/2009