
IEEE - SCV - Reliability

Reliability Chapter Events

Date: May 28, 2008
Day: Wednesday
Type: Seminar
Subject: Electronic Prognostics (EP) (Abstract)
Speaker: Dr. Kenny Gross
Cost: Free
Time: 6:30 PM - refreshments; 7:00 PM - presentation; 8:30 PM - Q&A
Place: HP-Cupertino


Other Upcoming Events

Date | Subject | Cost | Place
June 4-6, 2008 | QPRC (Quality & Productivity Research Conference) 2008 | $75 to $275 | Madison, WI
June 17-20, 2008 | ARS (Applied Reliability Symposium) | $1000 to $1200 | Reno, NV
June 24-27, 2008 | DSN (Dependable Systems and Networks) 2008 | $195 to $815 | Anchorage, AK (Alaska)
July 7-9, 2008 | IOLTS (International On-Line Testing Symposium) | Call | Greece
Aug 3-7, 2008 | JSM (Joint Statistical Meetings) 2008 | $75 to $415 | Denver, CO
Oct 1-3, 2008 | ASTR (Accelerated Stress Testing and Reliability) | Not yet available | Portland, OR
Oct 12-16, 2008 | IRW (International Reliability Workshop, also known as the Wafer Level Reliability Workshop) | Not yet available | Fallen Leaf Lake, CA


Date: May 28, 2008

Topic: Electronic Prognostics (EP)

Abstract

In today's world of eCommerce, downtime for enterprise servers in business-critical
datacenters can cost millions of dollars per hour. The System Dynamics, Characterization and
Control group at Sun Microsystems has pioneered new proactive fault monitoring innovations
for enhancing the reliability, availability, and serviceability of computer servers. The key
enabler for Electronic Prognostics is a patented continuous system telemetry harness (CSTH),
implemented in software, which collects time series signals relating to the health of
dynamically executing servers and their components, network interconnects, and peripherals.
These time series provide quantitative metrics associated with physical variables (distributed
temperatures, voltages, and currents throughout the system), performance variables (loads,
throughputs, queue lengths, etc.), and various quality-of-service (QoS) metrics. The CSTH
signals are continuously archived to an offline circular file (i.e., the "Black Box Flight
Recorder") and are also processed in real time using advanced pattern recognition for proactive
anomaly detection. The pattern recognition provides sensitive early detection of a variety of
mechanisms that are known to cause downtime in enterprise datacenters, including:
  Environmental issues (thermal anomalies, air-flow restrictions, degraded fan motors);
  Software aging phenomena (memory leaks, resource contention);
  Degraded or failed sensors;
  Degradation of power supplies, capacitors, and interconnects.
The same techniques also provide an "inferential sensing" capability, wherein a failed sensor
is replaced with a highly accurate analytical estimate.

Sun Microsystems' CSTH, coupled with advanced pattern recognition techniques adapted from
the commercial nuclear and aerospace industries, is helping to increase component reliability
margins, improve system availability, and optimize energy utilization for enterprise computing
datacenters.
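
As a rough illustration of the two ideas in the abstract (a circular "flight recorder" archive and real-time anomaly flagging on a telemetry signal), here is a minimal Python sketch. It is not the CSTH or MSET: the class name `TelemetryRecorder`, its parameters, and the sample temperatures are all hypothetical, and a simple rolling z-score test stands in for the advanced pattern recognition the talk describes.

```python
from collections import deque
from statistics import mean, stdev


class TelemetryRecorder:
    """Hypothetical sketch: circular archive plus a rolling z-score check."""

    def __init__(self, capacity=1000, window=20, threshold=4.0):
        # Circular archive: once full, the oldest sample is overwritten.
        self.archive = deque(maxlen=capacity)
        # Recent "normal" history used as the baseline for the z-score.
        self.window = deque(maxlen=window)
        self.threshold = threshold

    def record(self, timestamp, value):
        """Archive one sample and return True if it looks anomalous."""
        self.archive.append((timestamp, value))
        anomalous = False
        if len(self.window) == self.window.maxlen:
            mu = mean(self.window)
            sigma = stdev(self.window)
            if sigma > 0 and abs(value - mu) > self.threshold * sigma:
                anomalous = True
        if not anomalous:
            # Keep flagged samples out of the baseline so one spike
            # does not mask the next.
            self.window.append(value)
        return anomalous


def demo():
    rec = TelemetryRecorder(capacity=50, window=10, threshold=4.0)
    # Assumed inlet temperatures in deg C: a steady pattern, then a jump.
    readings = [40.0 + 0.1 * (i % 5) for i in range(60)] + [48.0]
    flags = [rec.record(t, v) for t, v in enumerate(readings)]
    return rec, flags


if __name__ == "__main__":
    rec, flags = demo()
    print("samples kept in archive:", len(rec.archive))
    print("anomalous timestamps:", [t for t, f in enumerate(flags) if f])
```

In this sketch `deque(maxlen=...)` gives the overwrite-oldest behavior of a circular file, and only the final 48.0 reading in `demo()` is flagged. A production system would persist the archive to disk and correlate many signals rather than test each one in isolation.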

Speaker

Kenny Gross received his Ph.D. in nuclear engineering from the University of Cincinnati in
1977. Kenny is a Distinguished Engineer for Sun Microsystems and is team leader for the System
Dynamics, Characterization and Control team in Sun's Physical Sciences Research Center in San
Diego. Kenny specializes in advanced pattern recognition, continuous system telemetry, and
dynamical system characterization for improving the reliability, availability, and
serviceability of enterprise computing systems. Kenny has 194 US patents issued and pending
and 168 scientific publications. He was awarded a 1998 R&D 100 Award, recognizing one of the
top 100 technological innovations of that year, for an advanced statistical pattern recognition
technique (MSET). MSET was originally developed for nuclear and aerospace applications and is
now being used in a variety of applications to improve quality, availability, and energy
efficiency for enterprise computer servers.

 


IEEE | Santa Clara Valley Section | Reliability Society


Last Modified: 04/26/2008