Thursday, March 15, 2012: 4:30 PM-5:30 PM
International Ballroom F (Omni Hotel CNN Center)
Speakers:
Robert Abrams (IBM Corporation) and Sam Knutson (GEICO)
The presenters will discuss the capabilities available on z/OS to detect and diagnose soft failures.
- Describe soft failure detection
  - Built into z/OS components, such as XCF stalled member detection
  - Provided by health checks
  - Provided by z/OS Predictive Failure Analysis (PFA)
  - Provided by other vendor products
- Highlight the kinds of problems each type of soft failure detection handles well and handles poorly
  - Machine time scale vs. human time scale
  - Location in the stack
  - Detectable by performance metrics vs. non-performance metrics
- Insights from building PFA to help reduce the impact of soft failures
  - Automation of alerts is key
  - z/OS can survive or recover from most soft failures
  - Most metrics are very time sensitive
Tracks: z/OS Systems Programming and zNextGen