
Masked Failure Data: Looking Back, Looking Forward

I started this work in 2020, during my math master’s. Now it is 2026. I am a PhD student in CS, still working on the same problem, and the shape of the project has changed enough that it is worth stepping back and looking at where things stand.

The problem, briefly: a series system fails when any component fails, but you often cannot tell which component caused the failure. The failure cause is “masked.” You may also have censored observations (systems still running when testing ends). Given this incomplete data, estimate the reliability of individual components.
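
To make the shape of the data concrete, here is a toy example of what a masked sample might look like. This is a hypothetical layout for illustration only, not the format any of the packages discussed below actually use:

    # Toy masked series-system data (hypothetical column layout).
    # Each row is one system. 'candidates' holds the set of components that
    # could have caused the failure; status 0 marks a right-censored unit,
    # whose candidate set is empty because no failure was observed.
    masked_data <- data.frame(
      time   = c(1.8, 3.1, 5.0, 2.4),   # observed lifetime or censoring time
      status = c(1L, 1L, 0L, 1L)        # 1 = failed, 0 = still running at test end
    )
    masked_data$candidates <- list(c(1, 2), 2, integer(0), c(1, 2, 3))
    masked_data

The estimation problem is to recover each component's lifetime distribution from rows like these, where some rows never failed and the others only narrow the cause down to a set.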

This post is not a tutorial. It is a map of what I have built, what I have learned, and what I am working on next.


The Arc

2020-2023: The Master’s Thesis

I picked the topic because it combined everything I wanted to learn: survival analysis, mixture models, the EM algorithm, bootstrap inference, simulation studies. My thesis focused on Weibull series systems with maximum likelihood estimation. The implementation was monolithic: one R package (wei.series.md.c1.c2.c3) that did everything.

The thesis was defensible. The code was not reusable. Everything was tangled together: the distribution algebra, the MLE infrastructure, the likelihood models, the series system logic, the masking conditions. If you wanted to change the distribution family, you rewrote half the package.

2023-2024: Algebraic Decomposition

After finishing the master’s, I started pulling the monolith apart. The question that drove the refactoring was: what are the actual algebraic structures here?

The answer turned out to be several layers:

  1. Distributions form an algebra. You can add, scale, and compose them. That became algebraic.dist, now on CRAN.

  2. MLEs form an algebra. Delta method for transformations, bootstrap for inference, reparameterization maps. That became algebraic.mle, also on CRAN.

  3. Likelihood models are composable. A likelihood for heterogeneous data is a sum of likelihood contributions, each from a different observation type. That became likelihood.model, now on CRAN.

  4. MLE solvers are composable. You can chain solvers sequentially, race them in parallel, add random restarts. The SICP closure property applies: the output of a solver composition is itself a solver. That became compositional.mle.

  5. Hazard functions define distributions. If you can write h(t; θ), you get S(t), F(t), f(t), quantiles, sampling, and MLE automatically (a short base-R sketch of this identity follows the list). That became flexhaz. A companion package, serieshaz, composes component hazards into series system distributions.
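
Here is that sketch: a minimal base-R illustration of the identity behind layer 5, not the flexhaz API. Given a hazard h(t; θ), the survival function is the exponential of minus the cumulative hazard, the density is hazard times survival, and quantiles, sampling, and the likelihood pieces for MLE follow from S and f.

    # "A hazard defines a distribution", in base R (not the flexhaz API):
    # S(t) = exp(-H(t)) with H(t) the cumulative hazard, f(t) = h(t) * S(t).
    surv_from_hazard <- function(h) {
      function(t, ...) {
        H <- integrate(function(u) h(u, ...), lower = 0, upper = t)$value
        exp(-H)
      }
    }
    dens_from_hazard <- function(h) {
      function(t, ...) h(t, ...) * surv_from_hazard(h)(t, ...)
    }

    # Example: Weibull hazard with shape k and scale lambda.
    h_weibull <- function(t, k, lambda) (k / lambda) * (t / lambda)^(k - 1)

    surv_from_hazard(h_weibull)(2, k = 1.5, lambda = 3)  # = pweibull(2, 1.5, 3, lower.tail = FALSE)
    dens_from_hazard(h_weibull)(2, k = 1.5, lambda = 3)  # = dweibull(2, 1.5, 3)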

Each layer depends only on the layers below it. The dependency graph is clean:

                algebraic.dist
                 |          \
             algebraic.mle    \
              |        \       \
compositional.mle   likelihood.model
                      |         \
                   flexhaz    maskedcauses
                     |
                  serieshaz
                     |
                  maskedhaz

This decomposition was not planned from the start. It emerged from trying to answer “why is this code so hard to change?” repeatedly until the natural joints appeared.

2024: Automatic Differentiation and Computational Thinking

The master’s defense included BCa bootstrap confidence intervals for reliability estimation. Preparing that forced me to explain the MLE theory clearly and exposed the places where my understanding was mechanical rather than structural.

After the defense, I experimented with PyTorch’s autograd as a computational backend for MLE: define the log-likelihood as a differentiable function, let AD compute scores and Hessians. I was already familiar with the relationship between gradients and score functions, but the experiment solidified it. The score is the gradient of the log-likelihood. The observed information is the negative Hessian. These are not separate concepts; they are the same computation viewed from different angles.
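
A minimal numerical check of that identity, using a toy exponential sample and numDeriv (illustrative values only; this is not tied to any of the packages above):

    # Score = gradient of the log-likelihood; observed information = negative Hessian.
    library(numDeriv)

    x <- c(1.2, 0.4, 2.3, 0.9)                   # toy exact lifetimes
    loglik <- function(rate) sum(dexp(x, rate, log = TRUE))

    rate_hat <- 1 / mean(x)                      # exponential MLE
    grad(loglik, rate_hat)                       # score at the MLE, numerically ~ 0
    -hessian(loglik, rate_hat)                   # observed Fisher information
    length(x) / rate_hat^2                       # analytical check: n / rate^2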

That solidification led me to build nabla, an R package for exact automatic differentiation via nested dual numbers. nabla computes derivatives of arbitrary order at machine precision: D(f) gives gradients, D(D(f)) gives Hessians, and it works through loops and branches. It is not fast enough to use inside an optimizer (that is what numDeriv or analytical derivatives are for), but it is the right tool for final analysis: observed Fisher information, skewness of an MLE, higher-order diagnostics. You find your MLE however you want, then use nabla to characterize the solution precisely.
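
For intuition about how dual numbers carry derivatives through ordinary code, here is a first-order sketch in base R. It illustrates the mechanism only; it is not nabla's implementation, which nests duals to reach higher orders and covers far more of the language:

    # Forward-mode dual numbers: a value paired with a derivative, with arithmetic
    # overloaded so the derivative propagates by the chain rule.
    dual <- function(value, deriv = 0) structure(list(value = value, deriv = deriv), class = "dual")

    Ops.dual <- function(e1, e2) {
      if (!inherits(e1, "dual")) e1 <- dual(e1)   # promote constants (derivative 0)
      if (!inherits(e2, "dual")) e2 <- dual(e2)
      switch(.Generic,
        "+" = dual(e1$value + e2$value, e1$deriv + e2$deriv),
        "-" = dual(e1$value - e2$value, e1$deriv - e2$deriv),
        "*" = dual(e1$value * e2$value, e1$deriv * e2$value + e1$value * e2$deriv),
        "/" = dual(e1$value / e2$value,
                   (e1$deriv * e2$value - e1$value * e2$deriv) / e2$value^2),
        stop("operator not implemented in this sketch: ", .Generic))
    }
    log.dual <- function(x, base = exp(1)) dual(log(x$value), x$deriv / x$value)

    x <- dual(2, deriv = 1)   # seed dx/dx = 1
    fx <- x * log(x)          # f(x) = x log x
    fx$value                  # 2 * log(2)
    fx$deriv                  # f'(2) = log(2) + 1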

2025: The Foundation Paper and FIM Theory

In 2025, I wrote the paper that makes the general framework explicit.

The master’s thesis was already distribution-agnostic in its likelihood derivation. I just instantiated it with Weibull components and right censoring. But the general structure was buried inside a thesis that was mostly about one specific case. The foundation paper pulls that general framework out, gives it a full formal treatment, and extends it to handle all four observation types (exact, right-censored, left-censored, interval-censored). It defines three conditions (C1, C2, C3) on the masking mechanism:

  • C1: The true failed component is always in the candidate set.
  • C2: Masking probabilities are symmetric across components.
  • C3: Masking probabilities do not depend on the system parameters.

Under these conditions, the likelihood factors cleanly. The paper derives the general likelihood, score equations, and Fisher information, then shows how to instantiate them for exponential, Weibull, Pareto, log-normal, and gamma families.
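
To give a flavor of that factorization: for an exact failure observed at time t with candidate set C, the contribution takes the familiar masked-data form, up to a masking-probability factor that C1-C3 let you drop (the paper's notation may differ):

    L(θ) ∝ Σ_{j ∈ C} f_j(t) · Π_{l ≠ j} S_l(t) = S_T(t) · Σ_{j ∈ C} h_j(t)

where S_T = Π_l S_l is the system survival function and h_j is the hazard of component j. Only the hazards of the candidate components enter the sum, which is what makes the factorization usable across distribution families.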

The exponential companion paper takes the simplest case (exponential lifetimes, uniform masking) and pushes it as far as the math allows. Closed-form MLE. Analytical Fisher information matrix. A proof that information loss from masking is monotone: as candidate sets grow, you lose information about the failure cause, and the loss is strictly increasing. A characterization of uniform masking as the maximum-entropy masking model under C2.

That last result surprised me. Uniform masking (every non-failed component is equally likely to appear in the candidate set) is the worst case for identifiability among all C2-compliant masking models. It maximizes the entropy of the candidate set given the failure cause. So the closed-form results in the exponential paper are not just tractable special cases; they are pessimistic bounds.


Where Things Stand Now

Software

The R package ecosystem has 11 packages. Six are on CRAN (algebraic.dist, algebraic.mle, likelihood.model, compositional.mle, hypothesize, nabla). Most of the rest are targeting CRAN submission over the next few months:

Package            What It Does                               Status
algebraic.dist     Distribution algebra                       On CRAN
algebraic.mle      MLE algebra                                On CRAN
likelihood.model   Fisherian likelihood framework             On CRAN
compositional.mle  Composable MLE solvers                     On CRAN
flexhaz            Distributions from hazard functions        Targeting CRAN + JOSS
serieshaz          Series system distributions                Targeting CRAN + JOSS
maskedcauses       Analytical MLE for masked series           Targeting CRAN + JOSS
maskedhaz          Numerical MLE for masked series            Targeting CRAN + JOSS
mdrelax            Relaxed masking conditions                 Paper first; package if it pans out
nabla              Automatic differentiation (dual numbers)   On CRAN
hypothesize        Hypothesis testing framework               On CRAN

The maskedcauses and maskedhaz packages solve the same problem at different levels of generality. maskedcauses has closed-form solutions for exponential and Weibull components. maskedhaz works with arbitrary hazard functions via numerical integration. When both are installed, the test suites cross-validate against each other.

mdrelax explores what happens when you relax the C1, C2, and C3 conditions: informative masking, parameter-dependent masking, masking probabilities less than one. Right now it is research code for a paper on relaxed masking conditions. If the results hold up, a proper R package may come out of it.

Papers

Paper                                          Status
Foundation (C1-C2-C3 framework)                Draft complete
Exponential companion (closed-form FIM)        Draft complete
Model selection (LRT nesting chain)            Draft complete, software in maskedcauses vignette
Relaxed C1/C2/C3 conditions                    Draft in progress
Master’s thesis (original Weibull treatment)   Published

What Comes Next

I have four companion paper directions planned. They vary in maturity from “active research” to “one-page idea.”

Identifiability and Diagnostic Design (Active)

The foundation paper has a theorem that gives necessary and sufficient conditions for parameter identifiability from masked data. But it is a binary result: identifiable or not. It says nothing about how much diagnostic separation you need for practical (finite-sample) identifiability, which candidate set structures are most informative, or how identifiability degrades as masking increases.

This paper will give a graph-theoretic characterization (components are identifiable iff they are not always diagnostically confounded), a linear algebra condition for the exponential case (rank of the candidate-set matrix), and simulation ablation studies varying masking probability, number of components, candidate set structure, and sample size.
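
As a purely hypothetical illustration of the kind of rank condition involved (not the paper's exact statement), one can stack the observed candidate sets into a binary incidence matrix and ask whether it has full column rank:

    # Rows are observations, columns are components; entry 1 if component j
    # appears in observation i's candidate set. (Illustrative check only.)
    cand_sets <- list(c(1, 2), c(2, 3), c(1, 3), c(1, 2, 3))
    m <- 3
    X <- t(vapply(cand_sets, function(s) as.integer(seq_len(m) %in% s), integer(m)))
    qr(X)$rank == m    # full column rank here; heavier masking tends to collapse it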

The connection to the exponential companion is direct: uniform masking at window size w = m - 1 is the most pessimistic identifiable scenario (maximum entropy). So the question becomes: what masking design minimizes information loss? This is a D-optimality problem for the masking mechanism.

Observation Scheme Composition (Idea Stage)

The maskedcauses package implements composable observation functors: observe_right_censor(), observe_left_censor(), observe_periodic(), observe_mixture(). The mixture functor randomly assigns each unit to one of several monitoring schemes, modeling heterogeneous testing environments.

The mathematical content is that C1-C2-C3 are preserved under these compositions. If your masking mechanism satisfies the conditions, and you compose it with an independent censoring scheme, the composition still satisfies the conditions. This means practitioners can mix observation protocols without re-deriving the likelihood.
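
A rough sketch of the composition idea, with made-up names and signatures rather than the maskedcauses functors: treat an observation scheme as a function from fully observed data to recorded data, and compose schemes by ordinary function composition.

    # Hypothetical observation schemes as data-frame transformers (not the
    # maskedcauses API); composing two schemes yields another scheme.
    compose_schemes <- function(f, g) function(data) g(f(data))

    right_censor_at <- function(tau) function(data) within(data, {
      status <- ifelse(time > tau, 0L, status)   # 0 = censored, 1 = observed failure
      time   <- pmin(time, tau)
    })

    inspect_every <- function(delta) function(data) within(data, {
      time <- ifelse(status == 1L, delta * ceiling(time / delta), time)
    })

    observe <- compose_schemes(right_censor_at(5), inspect_every(0.5))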

The proofs are straightforward. The interesting question is whether there is a category-theoretic formulation: observation schemes as morphisms in a category of data-reduction maps, with composition as functor composition. If so, the closure theorems become instances of a general principle.

Statistical Parsimony vs Physical Structure (Idea Stage)

A series system has components. You know this from engineering. But if the data are heavily masked (most candidate sets are {1, …, m}) and the sample is small, a single Weibull distribution may fit the system lifetime data just as well as an m-component series model.

Standard model selection says: use the simpler model. Engineering knowledge says: the components exist.

This is a genuine tension. It is not about which test to run. It is about what the model is for. If you want to predict system lifetimes, the single Weibull may suffice. If you want to estimate component reliability for maintenance planning, the series decomposition is essential even if over-parameterized relative to the data.

I think this is a short paper, or maybe just a well-argued essay with supporting simulations.

Weibull Companion (Planned)

The exponential companion gives closed-form results because exponential lifetimes are memoryless and the hazard is constant. The Weibull case has shape parameters, which means time-varying hazards, time-dependent cause probabilities (in the heterogeneous case), and numerical integration for some censoring types.

The homogeneous Weibull model (shared shape, m + 1 parameters) is cleaner: the system is itself Weibull, cause probabilities are time-invariant, and the censored likelihood contributions have closed-form weights. The heterogeneous case (2m parameters) requires more numerical machinery.
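
The closure behind the first claim is elementary (in one common parameterization; the thesis may write it differently). If every component is Weibull with shared shape k and scale λ_j, the system survival is

    S_T(t) = exp(-Σ_j (t/λ_j)^k) = exp(-(t/λ_T)^k),   with λ_T = (Σ_j λ_j^(-k))^(-1/k),

so the system is again Weibull with shape k, and the cause probability for component j, h_j(t) / Σ_l h_l(t) = λ_j^(-k) / Σ_l λ_l^(-k), does not depend on t.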

The master’s thesis already covers Weibull estimation, but the companion paper would provide the full treatment in the foundation paper’s notation: score equations, Fisher information, simulation studies, and practical guidance for choosing between homogeneous and heterogeneous models.

This is the highest-effort companion. The theory exists in pieces across the thesis and the maskedcauses package, but writing it up properly as a standalone paper is substantial work.

CRAN Submissions

The more immediate work is getting the remaining R packages onto CRAN. The pipeline:

  1. flexhaz (reliability chain; serieshaz follows once it matures)
  2. maskedcauses, maskedhaz (depend on everything above)
  3. mdrelax (paper first; an R package may follow if the relaxed conditions prove useful in practice)

Each submission involves the usual CRAN gauntlet: R CMD check with zero warnings, documentation, vignettes, examples, and the review process. I have been through it six times now, so I know the drill. It is tedious but not mysterious.
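
For reference, the local half of that gauntlet mostly comes down to a handful of devtools calls (a sketch of a typical workflow; rhub and win-builder runs omitted):

    devtools::document()          # regenerate man pages and NAMESPACE from roxygen
    devtools::check()             # R CMD check with CRAN settings; aim for zero errors and warnings
    devtools::build_vignettes()   # confirm the vignettes actually build
    devtools::release()           # walks through the remaining CRAN submission steps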


What I Have Learned

Three things, mostly.

Decompose first, optimize later. The monolithic thesis code worked but could not grow. Pulling it into algebraic layers took months, but every piece became independently testable, reusable, and publishable. The decomposition is the research contribution, not just an engineering convenience.

Publish the general theory as its own paper. The thesis had the distribution-agnostic likelihood derivation, but it was embedded in a document mostly about Weibull. The general framework deserved its own treatment. Writing the foundation paper forced me to separate what is structural (the C1-C2-C3 factorization, the observation type taxonomy) from what is specific to a distribution family. That separation made the companion papers possible.

The Fisher information matrix is the right lens. For two years, I treated FIM as “the thing you invert to get standard errors.” The exponential companion paper forced me to understand it as a measure of the information content of the observation scheme. Once I saw that, the identifiability results, the information loss monotonicity, and the optimal design questions all fell out naturally. The FIM connects the statistical theory to the practical question of how to design better diagnostics.


The work is not close to done. But the foundation is solid: a clean theoretical framework, a modular software stack, two completed papers, and a clear map of what comes next. Most days, that is enough to keep going.
