Algebraic.mle


Masked Failure Data: Looking Back, Looking Forward

February 18, 2026

I have been working on the same statistical problem since 2020. I am now a PhD student in CS. The problem has not changed, but my understanding of it has, and the tools I have built around it look nothing like what I started with.

The problem: a series system fails when any component fails. You observe system-level failure times. But you often cannot tell which component caused the failure (masking). Some systems are still running when testing ends (censoring). Given this incomplete data, estimate component reliability.

This is not a tutorial. It is a map of where things stand and where they are going.

Observation Functors: Composable Censoring for Series System Simulation

February 13, 2026

Last week I announced maskedcauses, the R package for estimating component reliability from masked series system failures. That post covered the three likelihood models and the path to CRAN.

This post is about what happened next: the package now supports four observation types (exact, right-censored, left-censored, and interval-censored) via composable observation functors. Along the way, I wrote four vignettes, removed the md.tools dependency, and developed a verification methodology for keeping prose honest about simulation results.
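To make the four observation types concrete, here is a minimal base-R sketch, not the package's implementation. The names `right_censor_at`, `inspect_every`, `mix_schemes`, and `loglik_obs` are hypothetical stand-ins, and an exponential lifetime is assumed for the likelihood contributions. Each "functor" is a closure that maps a true failure time to what the analyst would record.

```r
# Hypothetical observation functors (illustrative names, not the maskedcauses API).
# Each returns a function mapping a true failure time t to a recorded observation.
right_censor_at <- function(tau) function(t) {
  if (t <= tau) list(type = "exact", lower = t, upper = t)
  else list(type = "right", lower = tau, upper = Inf)
}

# Inspections every `delta` time units: failure is only known to lie between
# two inspections, or before the first one (left-censored).
inspect_every <- function(delta) function(t) {
  k <- floor(t / delta)
  if (k == 0) list(type = "left", lower = 0, upper = delta)
  else list(type = "interval", lower = k * delta, upper = (k + 1) * delta)
}

# Composition: each unit is observed under one scheme chosen at random.
mix_schemes <- function(..., weights) {
  schemes <- list(...)
  function(t) schemes[[sample(length(schemes), 1, prob = weights)]](t)
}

# Log-likelihood contribution of one observation, assuming an exponential lifetime.
loglik_obs <- function(obs, rate) switch(obs$type,
  exact    = dexp(obs$lower, rate, log = TRUE),
  right    = pexp(obs$lower, rate, lower.tail = FALSE, log.p = TRUE),
  left     = pexp(obs$upper, rate, log.p = TRUE),
  interval = log(pexp(obs$upper, rate) - pexp(obs$lower, rate)))

set.seed(2)
observe <- mix_schemes(right_censor_at(1), inspect_every(0.5), weights = c(0.5, 0.5))
observations <- lapply(rexp(100, rate = 2), observe)
total_loglik <- sum(vapply(observations, loglik_obs, numeric(1), rate = 2))
```

Because a mixture of schemes is itself a function from failure time to observation, it can be passed anywhere a single scheme is expected.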

maskedcauses: Maximum Likelihood Estimation for Masked Series System Failures

February 5, 2026

Note (February 2026): This package has been renamed from likelihood.model.series.md to maskedcauses.

Two days ago, I submitted likelihood.model to CRAN, the foundation package for composable statistical inference. Next in line: maskedcauses, which implements maximum likelihood estimation for series systems where component failure causes are masked.

This package is the practical result of my master’s thesis work. Three years of theoretical development, now packaged for anyone analyzing masked failure data.

The Problem: Masked Component Failures

A series system fails when any of its $m$ components fails. In reliability testing, you observe the system fail at time $t$, but two layers of uncertainty obscure the full picture:

Right-censoring: Some systems are still running when testing ends. You know they survived at least until time $\tau$, but not how much longer they would have lasted.
Masked cause of failure: When a system fails, you often can’t identify which component caused it. Diagnostic tests might narrow it down to a candidate set of possible causes, but the true failure component remains ambiguous.

This happens constantly in practice. Electronic systems fail with only board-level diagnostics. Industrial machinery fails without root-cause teardown. Medical devices fail with symptoms pointing to multiple possible subsystems.

The question: given this incomplete information, can you still estimate the lifetime distribution of each component?
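To see what such data looks like, here is a minimal base-R simulation sketch. It assumes exponential component lifetimes and a simple masking scheme in which the true cause is always in the candidate set and every other component is added independently with probability `p_mask`. The function name and its arguments are illustrative, not part of maskedcauses.

```r
# Simulate right-censored, masked failure data for a series system.
# simulate_masked_series and its arguments are illustrative names only.
simulate_masked_series <- function(n, rates, tau, p_mask) {
  m <- length(rates)
  # Latent component lifetimes: n rows, one column per component
  lifetimes <- matrix(rexp(n * m, rate = rep(rates, each = n)), nrow = n)
  system_time <- apply(lifetimes, 1, min)   # series system: first failure wins
  cause <- apply(lifetimes, 1, which.min)   # latent; never shown to the analyst
  observed <- system_time <= tau            # FALSE means right-censored at tau
  time <- pmin(system_time, tau)

  # Candidate sets: true cause always included, others with probability p_mask
  candidates <- matrix(runif(n * m) < p_mask, nrow = n)
  candidates[cbind(seq_len(n), cause)] <- TRUE
  candidates[!observed, ] <- FALSE          # censored rows carry no candidate set
  colnames(candidates) <- paste0("x", seq_len(m))

  data.frame(time = time, observed = observed, candidates)
}

set.seed(1)
d <- simulate_masked_series(200, rates = c(1, 2, 3), tau = 0.5, p_mask = 0.3)
head(d)
```

Each row holds a system time, a censoring flag, and one logical column per component marking membership in the candidate set.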

The Package: Three Likelihood Models

maskedcauses provides three models with different complexity-accuracy tradeoffs:

Model	Parameters	Use Case
`exp_series_md_c1_c2_c3`	$m$ rates $(\lambda_1, \ldots, \lambda_m)$	Memoryless components (constant failure rate)
`wei_series_md_c1_c2_c3`	$2m$ params $(k_1, \beta_1, \ldots, k_m, \beta_m)$	Weibull with per-component shapes
`wei_series_homogeneous_md_c1_c2_c3`	$m+1$ params $(k, \beta_1, \ldots, \beta_m)$	Weibull with shared shape parameter

Each model implements the full inference stack: loglik(), score(), hess_loglik(), rdata(), and assumptions().

The C1-C2-C3 Conditions

The models assume three conditions that simplify the likelihood:

C1: The failed component is in the candidate set with probability 1
C2: For a given candidate set, the probability of observing that set is the same whichever of its components actually failed
C3: Masking probabilities are independent of system parameters $\theta$

Under these conditions, the masking mechanism factors out of the likelihood. You can estimate component parameters without modeling the diagnostic process itself. That’s why each model name ends in “c1_c2_c3”.
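For exponential components the simplification is easy to write down. A failure at time $t_i$ with candidate set $C_i$ contributes $\log \sum_{j \in C_i} \lambda_j - t_i \sum_{l} \lambda_l$ to the log-likelihood, and a unit censored at $\tau$ contributes $-\tau \sum_{l} \lambda_l$. The sketch below is a hand-rolled base-R version of that log-likelihood, fitted with `optim()` on simulated data. It is not the package's `loglik()` implementation, and the function and variable names are assumptions made for illustration.

```r
# Log-likelihood of an exponential series system under C1-C2-C3 (illustrative).
# time: observed times; observed: FALSE if right-censored;
# cand: logical n x m matrix of candidate sets (ignored for censored rows).
loglik_exp_series <- function(rates, time, observed, cand) {
  if (any(rates <= 0)) return(-Inf)
  storage.mode(cand) <- "double"
  # Every unit contributes the system survival term -sum(rates) * t
  survival_term <- -sum(rates) * sum(time)
  # Failed units add the log of the summed hazards over their candidate set
  hazard_term <- sum(log(cand[observed, , drop = FALSE] %*% rates))
  survival_term + hazard_term
}

# Simulated data: true cause always in the candidate set, other components
# added independently with probability 0.3 (an assumed masking scheme).
set.seed(42)
n <- 2000; rates <- c(1, 2, 3); m <- length(rates); tau <- 0.4
life <- matrix(rexp(n * m, rate = rep(rates, each = n)), nrow = n)
t_sys <- apply(life, 1, min)
cause <- apply(life, 1, which.min)
observed <- t_sys <= tau
time <- pmin(t_sys, tau)
cand <- matrix(runif(n * m) < 0.3, nrow = n)
cand[cbind(seq_len(n), cause)] <- TRUE

fit <- optim(rep(1, m),
             function(r) -loglik_exp_series(r, time, observed, cand),
             method = "L-BFGS-B", lower = 1e-6)
fit$par  # estimated component rates
```

Note that no masking probability appears anywhere in `loglik_exp_series`; only the candidate sets do. A useful sanity check is that the fitted rates sum to the number of observed failures divided by the total time on test, which follows from setting the score to zero.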

compositional.mle: SICP-Inspired Optimization

December 17, 2025

I recently updated compositional.mle, an R package for maximum likelihood estimation built on a simple premise: optimization strategies should compose.

The Problem

Most optimization libraries treat solvers as monolithic procedures. You call optim(), pass some options, hope for the best. Want to try multiple methods? Write a loop. Want coarse-to-fine optimization? Manually wire one solver’s output into the next.

compositional.mle treats solvers the way SICP treats procedures: as first-class citizens.

Primitive solvers: gradient_ascent(), newton_raphson(), bfgs(), nelder_mead()
Composition operators: %>>% (sequential chaining), %|% (parallel racing), with_restarts()
Closure: Combining solvers yields a solver

That last point is the whole thing. When you chain two solvers together, the result is itself a solver with the same interface. So compositions can be further composed, stored in variables, passed to functions, used anywhere a solver is expected.

What This Looks Like

Define your problem once:

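# Illustrative data; the closures below read x from the enclosing environment
x <- rnorm(100, mean = 5, sd = 2)

problem <- mle_problem(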
  loglike = function(theta) {
    if (theta[2] <= 0) return(-Inf)
    sum(dnorm(x, theta[1], theta[2], log = TRUE))
  },
  score = function(theta) {
    mu <- theta[1]; sigma <- theta[2]; n <- length(x)
    c(sum(x - mu) / sigma^2,
      -n / sigma + sum((x - mu)^2) / sigma^3)
  }
)

Then compose strategies declaratively:

# Global search -> local refinement -> final polish
strategy <- grid_search(lower = c(-10, 0.5), upper = c(10, 5), n = 5) %>>%
  gradient_ascent(max_iter = 50) %>>%
  newton_raphson(max_iter = 20)

result <- strategy(problem, theta0 = c(0, 1))

Or race multiple approaches:

# Try all methods, keep the best
strategy <- gradient_ascent() %|% bfgs() %|% nelder_mead()

Or handle multimodal landscapes:

# Random restarts to escape local optima
strategy <- with_restarts(gradient_ascent(), n = 10,
                          sampler = uniform_sampler(lower, upper))

The SICP Connection

This design applies SICP’s framework directly:

Primitives. The base solvers are building blocks with clear contracts. gradient_ascent() returns a solver using steepest ascent. nelder_mead() returns a derivative-free simplex solver.

Means of Combination. The operators %>>%, %|%, and with_restarts() combine solvers into new solvers. Chaining feeds one solver’s output as input to the next. Racing runs solvers in parallel and picks the winner.

Abstraction. Solver factories hide implementation details behind a consistent interface. You work with the solver abstraction, not specific algorithms.

Closure. Because composition produces objects of the same type as the inputs, the language of solvers is closed under composition. You build arbitrarily complex strategies from simple parts.
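The closure property can be illustrated with a toy re-implementation in base R. This is a sketch of the idea, not the package's code: here a "solver" is assumed to be any function of `(problem, theta0)` returning a list with `theta` and `value`, and the operators `%then%` and `%race%` are hypothetical stand-ins for `%>>%` and `%|%`.

```r
# Toy solver factory: wraps optim() behind a common (problem, theta0) interface.
make_optim_solver <- function(method, maxit = 100) {
  function(problem, theta0) {
    fit <- optim(theta0, function(th) -problem$loglike(th),
                 method = method, control = list(maxit = maxit))
    list(theta = fit$par, value = -fit$value)
  }
}

# Sequential chaining: the first solver's estimate seeds the second.
`%then%` <- function(s1, s2) {
  function(problem, theta0) s2(problem, s1(problem, theta0)$theta)
}

# Racing: run both from the same start and keep the higher log-likelihood.
# (Run one after the other here; a real implementation could parallelize.)
`%race%` <- function(s1, s2) {
  function(problem, theta0) {
    r1 <- s1(problem, theta0)
    r2 <- s2(problem, theta0)
    if (r1$value >= r2$value) r1 else r2
  }
}

set.seed(1)
x <- rnorm(200, mean = 5, sd = 2)
problem <- list(loglike = function(theta) {
  if (theta[2] <= 0) return(-Inf)
  sum(dnorm(x, theta[1], theta[2], log = TRUE))
})

coarse <- make_optim_solver("Nelder-Mead", maxit = 50)
fine <- make_optim_solver("BFGS")
# A composition is itself a solver, so it can be composed again.
strategy <- (coarse %then% fine) %race% make_optim_solver("Nelder-Mead")
result <- strategy(problem, theta0 = c(0, 1))
result$theta
```

Both operators return a function with the same signature as their inputs, which is all that closure under composition requires.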

Relationship to algebraic.mle

This package complements algebraic.mle, which provides algebraic operations on MLE results. Where algebraic.mle lets you compose likelihood functions and manipulate fitted models, compositional.mle focuses on the process of finding those estimates.

They work together:

# compositional.mle: find the estimate
result <- strategy(problem, theta0)

# algebraic.mle: work with the fitted model
confint(result)
coef(result)

Try It

Install from GitHub:

likelihood.model: Composable Likelihood Models in R

June 30, 2022

Most R packages hardcode specific likelihood models. likelihood.model takes a different approach. Likelihoods are first-class objects that compose, and the framework is generic enough to work with any distribution.

The Interface

A likelihood model is anything implementing these generic methods:

loglik(model, data, params) – log-likelihood
score(model, data, params) – score function (gradient)
hessian(model, data, params) – Hessian of the log-likelihood (its negative is the observed information matrix)

That is the interface. If your model implements these three methods, it plugs into the entire MLE stack: optimization, confidence intervals, hypothesis testing, model selection. You do not couple to specific distributions.
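As a rough illustration of why three methods are enough, the sketch below defines the generics as plain S3 functions, implements them for a toy exponential model, and fits it with a Newton-Raphson loop that touches nothing but the interface. The class name `exp_model` and the fitter `fit_newton` are assumptions for illustration; the real package's generics and fitting code may differ.

```r
# S3 generics mirroring the three-method interface (illustrative definitions).
loglik <- function(model, data, params) UseMethod("loglik")
score <- function(model, data, params) UseMethod("score")
hessian <- function(model, data, params) UseMethod("hessian")

# A toy model: i.i.d. exponential lifetimes with a single rate parameter.
exp_model <- structure(list(), class = "exp_model")
loglik.exp_model <- function(model, data, params) {
  sum(dexp(data, params, log = TRUE))
}
score.exp_model <- function(model, data, params) {
  length(data) / params - sum(data)
}
hessian.exp_model <- function(model, data, params) {
  matrix(-length(data) / params^2)
}

# A generic Newton-Raphson fitter that only uses score() and hessian().
fit_newton <- function(model, data, params, iters = 25) {
  for (i in seq_len(iters)) {
    step <- solve(hessian(model, data, params), score(model, data, params))
    params <- params - drop(step)
  }
  params
}

set.seed(9)
x <- rexp(200, rate = 2)
rate_hat <- fit_newton(exp_model, x, params = 1)
```

`fit_newton` never mentions the exponential distribution, so any other class with the same three methods could be passed in unchanged.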

Likelihood Contributions

The key class is likelihood_contr_model, a likelihood built from independent contributions:

# Different observation types get different likelihood contributions
model <- likelihood_contr_model(
  exact = normal_contrib(),
  right_censored = censored_contrib()
)

This handles heterogeneous data in a unified framework. You can mix exact observations, right-censored observations, truncated observations, and different distribution families within one model. Each observation type gets its own likelihood contribution, and they combine additively in log-space.
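A minimal base-R sketch of the same idea, assuming an exponential lifetime and a data frame with a `type` column. The list of contribution functions and `loglik_contr` are illustrative names, not the likelihood.model API. Exact observations contribute the log density, right-censored ones the log survival function, and the total is their sum.

```r
# Each contribution maps (observations, rate) to a log-likelihood term.
contributions <- list(
  exact = function(t, rate) sum(dexp(t, rate, log = TRUE)),
  right_censored = function(t, rate) {
    sum(pexp(t, rate, lower.tail = FALSE, log.p = TRUE))
  }
)

# Total log-likelihood: dispatch each row on its observation type and add.
loglik_contr <- function(rate, data) {
  total <- 0
  for (type in names(contributions)) {
    rows <- data$type == type
    total <- total + contributions[[type]](data$t[rows], rate)
  }
  total
}

# Simulated lifetimes, right-censored at 0.6 (illustrative data).
set.seed(7)
life <- rexp(500, rate = 2)
data <- data.frame(t = pmin(life, 0.6),
                   type = ifelse(life <= 0.6, "exact", "right_censored"))

fit <- optimize(loglik_contr, interval = c(1e-6, 50), data = data, maximum = TRUE)
fit$maximum  # estimated rate
```

For this model the estimate can be checked against the closed form, the number of exact observations divided by the total observed time. Adding a new observation type means adding one entry to `contributions`.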

Why This Design

Independence decomposes a joint likelihood into additive log-likelihood contributions, one per observation; the observations do not have to be identically distributed. That is how MLE actually works. likelihood.model makes this decomposition explicit and compositional.

Likelihood models are objects you manipulate, not function calls buried inside a fitting routine. You can build complex models from simple, independent pieces. You can swap in different contribution types without rewriting the rest of your code. And because the interface is generic, it works with algebraic.mle for fitting, hypothesize for testing, and any optimization backend that speaks the same protocol.

This is the same compositional philosophy as my thesis work on masked failure data. Series systems with masked causes have multiple observation types (masked vs. unmasked, different candidate sets) that each contribute differently to the likelihood. likelihood.model handles that naturally.

R package – MIT licensed – Documentation – GitHub

algebraic.mle: MLEs as Algebraic Objects

May 15, 2021

Maximum likelihood estimators have rich mathematical structure. Under standard regularity conditions they are consistent, asymptotically normal, and asymptotically efficient. algebraic.mle exposes this structure through an algebra where MLEs are objects you compose, transform, and query.

The Abstraction

An MLE is not just a vector of parameter estimates. It is a statistical object that carries point estimates $\hat{\theta}$, the Fisher information matrix $I(\hat{\theta})$, the variance-covariance matrix $I^{-1}(\hat{\theta})$, Wald-type confidence intervals from asymptotic normality, the log-likelihood value, and convergence diagnostics.

The package wraps all of this in a consistent interface:

library(algebraic.mle)

fit <- mle(likelihood_model, data)
coef(fit)           # Parameter estimates
vcov(fit)           # Variance-covariance matrix
confint(fit)        # Confidence intervals
logLik(fit)         # Log-likelihood
aic(fit)            # Model selection
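For readers who want to see where these quantities come from, the following base-R sketch assembles them by hand for a normal model using only `optim()`. It does not use algebraic.mle, and the variable names are illustrative.

```r
# Assemble the pieces of an MLE object by hand (no packages required).
set.seed(3)
x <- rnorm(400, mean = 5, sd = 2)
negloglik <- function(theta) {
  if (theta[2] <= 0) return(Inf)
  -sum(dnorm(x, theta[1], theta[2], log = TRUE))
}
fit <- optim(c(mean(x), sd(x)), negloglik, method = "BFGS", hessian = TRUE)

estimate <- fit$par            # point estimates (mu, sigma)
obs_info <- fit$hessian        # observed information: Hessian of -loglik
vcov_hat <- solve(obs_info)    # asymptotic variance-covariance matrix
se <- sqrt(diag(vcov_hat))

# 95% Wald intervals from asymptotic normality
wald_ci <- cbind(lower = estimate - qnorm(0.975) * se,
                 upper = estimate + qnorm(0.975) * se)

log_lik <- -fit$value
aic_value <- 2 * length(estimate) - 2 * log_lik
```

The standard error for the mean should come out close to the estimated sigma divided by the square root of the sample size, which is a quick check that the Hessian was computed sensibly.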

Composition

The real point is that MLEs compose. Independent models combine:

fit1 <- mle(model1, data1)
fit2 <- mle(model2, data2)
combined <- fit1 + fit2  # Joint likelihood

The package handles the algebra. Joint log-likelihood, block-diagonal Fisher information, everything propagates correctly. This works because likelihoods from independent data sources multiply, and multiplication of likelihoods is addition of log-likelihoods. That is a monoid. The package enforces it.
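The algebra can be sketched in a few lines of base R. This assumes the two fits come from independent data sets with disjoint parameter vectors, and it reduces a fit to a plain list. `make_fit`, `combine_fits`, and `empty_fit` are illustrative names, not the package's `+` method.

```r
# A fitted model reduced to what composition needs (illustrative structure).
make_fit <- function(par, vcov, loglik) {
  list(par = par, vcov = vcov, loglik = loglik)
}

# Combine independent fits with disjoint parameters: estimates concatenate,
# log-likelihoods add, and the covariance matrix is block-diagonal.
combine_fits <- function(a, b) {
  p <- length(a$par); q <- length(b$par)
  v <- matrix(0, p + q, p + q)
  v[seq_len(p), seq_len(p)] <- a$vcov
  v[p + seq_len(q), p + seq_len(q)] <- b$vcov
  make_fit(c(a$par, b$par), v, a$loglik + b$loglik)
}

# Identity element: combining with an empty fit changes nothing.
empty_fit <- make_fit(numeric(0), matrix(0, 0, 0), 0)

# Closed-form exponential fit: rate = 1/mean, asymptotic variance = rate^2/n.
fit_exp <- function(x) {
  rate <- 1 / mean(x)
  make_fit(rate, matrix(rate^2 / length(x)), sum(dexp(x, rate, log = TRUE)))
}

set.seed(5)
fit1 <- fit_exp(rexp(100, rate = 2))
fit2 <- fit_exp(rexp(150, rate = 0.5))
combined <- combine_fits(fit1, fit2)
combined$vcov  # off-diagonal entries are zero
```

`combine_fits` is associative and `empty_fit` acts as its identity, which is the monoid structure in miniature.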

The Ecosystem

algebraic.mle is the foundation for a family of packages:

Package	Purpose
likelihood.model	Compositional likelihood specification
maskedcauses	Masked failure data in series systems
mdrelax	Relaxed masking conditions
algebraic.dist	Distributions as algebraic objects
flexhaz	Dynamic failure rate distributions
hypothesize	Likelihood ratio tests on MLEs
numerical.mle	Numerical optimization backends

The typical workflow:

Define distributions with algebraic.dist
Specify likelihood contributions with likelihood.model
Fit the model and get an mle object from algebraic.mle
Query statistical properties: confidence intervals, hypothesis tests, model selection

For series systems with masked data:

library(maskedcauses)
library(algebraic.mle)

# Specify masking model (C1-C2-C3 conditions)
model <- md_likelihood_model(components = 3, masking = "bernoulli")

# Fit -> returns algebraic.mle object
fit <- md_mle_exp_series_C1_C2_C3(masked_data)

# All the standard MLE methods work
confint(fit)
vcov(fit)
aic(fit)

Theory

The asymptotic properties that algebraic.mle exploits come from classical MLE theory:

$$\sqrt{n}(\hat{\theta}_n - \theta^{\ast}) \xrightarrow{d} \mathcal{N}(0, I^{-1}(\theta^{\ast}))$$

The expo-masked-fim paper derives closed-form Fisher information for exponential series systems. That is exactly what algebraic.mle uses internally for variance estimation in that case.

For more complex models (Weibull, relaxed masking conditions), we approximate the Fisher information numerically with the observed information, the negative Hessian of the log-likelihood evaluated at the MLE:

$$\hat{I}(\hat{\theta}) = -\frac{\partial^2 \ell}{\partial \theta \partial \theta^T}\bigg|_{\theta=\hat{\theta}}$$
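A one-parameter example shows the numerical and closed-form routes agreeing. For an exponential sample of size $n$, the log-likelihood is $n \log \lambda - \lambda \sum x_i$, so the observed information at the MLE is $n / \hat{\lambda}^2$. The base-R sketch below compares that with a central second difference; the step size is an arbitrary choice made for illustration.

```r
# Observed information for an exponential rate: numeric vs closed form.
set.seed(11)
x <- rexp(300, rate = 2)
loglik <- function(rate) sum(dexp(x, rate, log = TRUE))
rate_hat <- 1 / mean(x)   # closed-form MLE of the rate

# Central second difference approximates the second derivative of loglik
h <- 1e-3 * rate_hat
d2 <- (loglik(rate_hat + h) - 2 * loglik(rate_hat) + loglik(rate_hat - h)) / h^2

obs_info_numeric <- -d2
obs_info_closed <- length(x) / rate_hat^2

# Standard error of the rate estimate from either route
se_numeric <- sqrt(1 / obs_info_numeric)
se_closed <- rate_hat / sqrt(length(x))
```

In higher dimensions the same idea applies entry by entry to the Hessian matrix, which is then inverted to obtain the variance-covariance estimate.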

Design Principles

Separation of concerns. The likelihood specification (likelihood.model) is independent of the fitting algorithm (numerical.mle) and the result type (algebraic.mle). You can swap optimizers without changing downstream code.
