
Reliability Chapter Events
| Date | Day | Type | Subject | Speaker | Cost | Time | Place |
|---|---|---|---|---|---|---|---|
| May 28, 2008 | Wednesday | Seminar | Electronic Prognostics (EP) (Abstract) | Dr. Kenny Gross | Free | 6:30 PM refreshments; 7:00 PM presentation; 8:30 PM Q&A | |
| Date | Subject | Cost | Place |
|---|---|---|---|
| June 4-6, 2008 | | $75 to $275 | |
| June 17-20, 2008 | | $1000 to $1200 | |
| June 24-27, 2008 | | $195 to $815 | Anchorage, AK (Alaska) |
| July 7-9, 2008 | | Call | |
| Aug 3-7, 2008 | | $75 to $415 | |
| Oct 1-3, 2008 | | Not yet available | |
| Oct 12-16, 2008 | IRW (International Reliability Workshop, also known as the Wafer Level Reliability Workshop) | Not yet available | Fallen |
May 28, 2008

Topic
Electronic Prognostics (EP)

Abstract
In today's world of eCommerce, downtime for enterprise servers in business-critical datacenters costs millions of dollars per hour. The System Dynamics, Characterization and Control group at Sun Microsystems has pioneered new proactive fault-monitoring innovations for enhancing the reliability, availability, and serviceability of computer servers. The key enabler for Electronic Prognostics is a patented continuous system telemetry harness (CSTH), implemented in software, which collects time series signals relating to the health of dynamically executing servers and their components, network interconnects, and peripherals. These time series provide quantitative metrics associated with physical variables (distributed temperatures, voltages, and currents throughout the system), performance variables (loads, throughputs, queue lengths, etc.), and various quality-of-service (QoS) metrics. The CSTH signals are continuously archived to an offline circular file (i.e., the "Black Box Flight Recorder") and are also processed in real time using advanced pattern recognition for proactive anomaly detection. The pattern recognition provides sensitive early detection of a variety of mechanisms that are known to cause downtime in enterprise datacenters, including environmental issues (thermal anomalies, air-flow restrictions, degraded fan motors); software aging phenomena (memory leaks, resource contention); degraded or failed sensors; and degradation of power supplies, capacitors, and interconnects. It also provides an "inferential sensing" capability, wherein a failed sensor is replaced with a highly accurate analytical estimate.
Sun Microsystems' CSTH, coupled with advanced pattern recognition techniques adapted from the commercial nuclear and aerospace industries, is helping to increase component reliability margins, meet system availability goals, and optimize energy utilization for enterprise computing datacenters.
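
As a rough illustration of the two ideas in the abstract, archiving telemetry to a circular "flight recorder" and checking each new sample in real time, here is a minimal Python sketch. It is not Sun's CSTH or the MSET technique: the class name `TelemetryRecorder`, its parameters, and the rolling z-score check are all assumptions chosen for brevity, standing in for the much more advanced pattern recognition the talk describes.

```python
from collections import deque
from statistics import mean, stdev


class TelemetryRecorder:
    """Hypothetical circular telemetry archive with a simple rolling z-score check."""

    def __init__(self, capacity=1000, window=20, threshold=4.0):
        # deque(maxlen=...) discards the oldest sample once capacity is reached,
        # which mimics a circular "flight recorder" file.
        self.archive = deque(maxlen=capacity)
        self.window = window        # number of recent samples used as the baseline
        self.threshold = threshold  # z-score above which a sample is flagged

    def record(self, value):
        """Archive one sample; return True if it deviates sharply from recent history."""
        recent = list(self.archive)[-self.window:]
        anomalous = False
        if len(recent) >= self.window:
            mu = mean(recent)
            sigma = stdev(recent)
            if sigma > 0:
                anomalous = abs(value - mu) / sigma > self.threshold
            else:
                # Perfectly flat baseline: any change at all counts as a deviation.
                anomalous = value != mu
        self.archive.append(value)
        return anomalous


if __name__ == "__main__":
    recorder = TelemetryRecorder(capacity=50, window=10, threshold=4.0)
    # Simulated temperature signal that alternates between 40.0 and 40.5.
    for i in range(30):
        recorder.record(40.0 + 0.5 * (i % 2))
    # A sudden jump to 55.0 is far outside the recent baseline.
    print(recorder.record(55.0))
```

A single-signal z-score like this only catches gross excursions. The appeal of multivariate techniques such as the MSET method mentioned in the speaker biography is that they model correlations among many signals, which this sketch does not attempt.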

Speaker
Kenny Gross received his Ph.D. in nuclear engineering. He is a Distinguished Engineer for Sun Microsystems and team leader for the System Dynamics Characterization and Control team in Sun's Physical Sciences Research Center. Kenny specializes in advanced pattern recognition, continuous system telemetry, and dynamical system characterization for improving the reliability, availability, and serviceability of enterprise computing systems. Kenny has 194 publications and received a 1998 R&D 100 Award, given for one of the top 100 technological innovations of that year, for an advanced statistical pattern recognition technique (MSET) that was originally developed for nuclear and aerospace applications and is now used in a variety of applications to improve the quality, availability, and energy efficiency of enterprise computer servers.

IEEE | Santa Clara Valley Section | Reliability Society
Last Modified: 04/26/2008