One of the most interesting statistical problems I’ve encountered is reliability analysis with censored data: situations where you know a unit has survived up to some time, but not when it will eventually fail.
The Censoring Problem
Imagine testing light bulbs. You run them for 1000 hours. Some fail during the test. Others are still working when you stop.
For the survivors, you know:
- They lasted at least 1000 hours
- You don’t know their actual lifetime
This is right censoring. The true lifetime lies somewhere to the right of the observed time, so the observation is only a lower bound.
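One common way to record such data is a pair per unit: the observed time, plus a flag saying whether that time is an actual failure or just the point where observation stopped. The sketch below is illustrative only; `Observation` and `observe` are hypothetical names chosen for this example, not part of any library.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Observation:
    time: float   # hours on test when the unit failed or the test stopped
    failed: bool  # True = failure observed, False = right-censored


def observe(true_lifetime: float, test_duration: float) -> Observation:
    """Return what the experimenter actually sees for one unit."""
    if true_lifetime <= test_duration:
        # The unit failed during the test, so its lifetime is known exactly.
        return Observation(true_lifetime, True)
    # The unit outlived the test: we only learn a lower bound on its lifetime.
    return Observation(test_duration, False)


early = observe(400.0, 1000.0)      # fails during the test
survivor = observe(2500.0, 1000.0)  # still working when the test stops
```

Here `survivor` carries the time 1000.0 with `failed=False`; the true lifetime of 2500 hours is never seen by the analyst.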
Why This Matters
Censored data is everywhere:
- Medical studies (patients still alive at study end)
- Engineering tests (components that haven’t failed)
- Customer retention (users still active)
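Dropping censored observations throws away information and biases lifetime estimates downward, because the survivors are exactly the longest-lived units. Treating them as failures at the censoring time is also biased downward, since it records a lifetime shorter than the true one.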
Maximum Likelihood to the Rescue
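The elegant solution is maximum likelihood estimation, where each observation contributes to the likelihood according to what was actually observed:
- Failure observations: Contribute the probability density at the observed failure time
- Censored observations: Contribute the survival probability at the censoring time, i.e. the probability of lasting longer than that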
This lets you extract information from both failed and surviving units.
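As a minimal sketch of how this works, assume lifetimes are exponential with failure rate lambda (an assumption made only for this example; real bulbs may be better described by something like a Weibull model). Each failure contributes log(lambda) - lambda * t to the log-likelihood and each censored unit contributes -lambda * t, so the maximizer is the number of failures divided by the total time on test. The data below are made up for illustration, and `exponential_mle_rate` is a hypothetical helper name.

```python
def exponential_mle_rate(times, failed):
    """MLE of the exponential failure rate under right censoring.

    Log-likelihood: d * log(rate) - rate * sum(times), where d is the
    number of observed failures, so the maximum is at d / sum(times).
    """
    n_failures = sum(failed)
    total_time_on_test = sum(times)
    if n_failures == 0:
        raise ValueError("no failures observed; the rate estimate would be 0")
    return n_failures / total_time_on_test


# Illustrative (made-up) test: 3 bulbs fail, 3 survive a 1000-hour test.
times = [150.0, 420.0, 800.0, 1000.0, 1000.0, 1000.0]
failed = [True, True, True, False, False, False]

# Censoring-aware estimate of the mean lifetime: 4370 / 3, about 1457 hours.
mean_mle = 1.0 / exponential_mle_rate(times, failed)

# Naive alternative 1: drop the survivors. 1370 / 3, about 457 hours.
failure_times = [t for t, f in zip(times, failed) if f]
mean_failures_only = sum(failure_times) / len(failure_times)

# Naive alternative 2: pretend survivors failed at 1000 h. 4370 / 6, about 728 hours.
mean_all_as_failures = sum(times) / len(times)
```

Both naive estimates come out well below the censoring-aware one, because each of them treats 1000 hours as if it were the most a surviving bulb could last.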
Series Systems Complexity
It gets more interesting with series systems: systems that fail as soon as any one component fails, so the system lifetime is the minimum of the component lifetimes. If you observe system failure but don’t know which component caused it, you have masked failure data.
This is the problem I’m most interested in: extracting component-level reliability from system-level failures when the cause is ambiguous.
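To make the masked-data idea concrete, here is a rough sketch under strong simplifying assumptions: components are independent with exponential lifetimes, each failed system comes with a candidate set that is known to contain the true cause, and the masking mechanism carries no extra information about the rates. Under those assumptions, a system failure at time t with candidate set C contributes log(sum of rates in C) - (sum of all rates) * t, and a system still running at time t contributes -(sum of all rates) * t. The function name and data are hypothetical.

```python
import math


def masked_series_loglik(rates, observations):
    """Log-likelihood for a series system of exponential components.

    rates: list of component failure rates.
    observations: list of (time, candidates) pairs, where candidates is a
        set of component indices that may have caused the failure; an empty
        set means the system was still working (right-censored).
    """
    total_rate = sum(rates)
    loglik = 0.0
    for time, candidates in observations:
        # Every component survived up to `time` in all cases.
        loglik -= total_rate * time
        if candidates:
            # One of the candidate components failed at `time`.
            loglik += math.log(sum(rates[j] for j in candidates))
    return loglik


rates = [0.001, 0.002]  # illustrative values, failures per hour
observations = [
    (100.0, {0}),     # cause known: component 0
    (200.0, {0, 1}),  # cause masked: either component
    (300.0, set()),   # system censored at 300 hours
]
ll = masked_series_loglik(rates, observations)
```

Note that if every candidate set contains all components, this likelihood depends on the rates only through their sum, so the individual component rates cannot be separated without at least some informative candidate sets.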
This work is laying groundwork for what will become a major focus of my mathematical statistics degree.