
algebraic.mle: MLEs as Algebraic Objects

Maximum likelihood estimators have rich mathematical structure: they are consistent, asymptotically normal, and asymptotically efficient. algebraic.mle exposes this structure through an algebra in which MLEs are objects you compose, transform, and query.

The Abstraction

An MLE is not just a vector of parameter estimates. It is a statistical object that carries the point estimate $\hat{\theta}$, the Fisher information matrix $I(\hat{\theta})$, the variance-covariance matrix $I^{-1}(\hat{\theta})$, Wald-type confidence intervals from asymptotic normality, the log-likelihood value, and convergence diagnostics.

The package wraps all of this in a consistent interface:

library(algebraic.mle)

fit <- mle(likelihood_model, data)
coef(fit)           # Parameter estimates
vcov(fit)           # Variance-covariance matrix
confint(fit)        # Confidence intervals
logLik(fit)         # Log-likelihood
aic(fit)            # Model selection
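To make those quantities concrete, here is a hand-rolled base-R sketch (not the package's internals) that computes the same ingredients for an exponential rate: point estimate, observed Fisher information, variance-covariance matrix, Wald interval, and log-likelihood.

```r
# Hand-rolled sketch of what an MLE object carries, using base R only.
# This is NOT algebraic.mle's implementation, just the same quantities.
set.seed(1)
x <- rexp(200, rate = 2)

negloglik <- function(rate) -sum(dexp(x, rate = rate, log = TRUE))
opt <- optim(par = 1, fn = negloglik, method = "Brent",
             lower = 1e-6, upper = 100, hessian = TRUE)

theta_hat <- opt$par            # point estimate (true rate is 2)
fim       <- opt$hessian        # observed Fisher information (1x1 here)
vcov_hat  <- solve(fim)         # variance-covariance matrix
se        <- sqrt(diag(vcov_hat))
ci        <- theta_hat + c(-1, 1) * qnorm(0.975) * se  # Wald 95% CI
loglik    <- -opt$value         # log-likelihood at the MLE
```

For the exponential, the closed-form MLE is 1/mean(x), so the numerical fit can be checked against it directly.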

Composition

The real point is that MLEs compose. Independent models combine:

fit1 <- mle(model1, data1)
fit2 <- mle(model2, data2)
combined <- fit1 + fit2  # Joint likelihood

The package handles the algebra. Joint log-likelihood, block-diagonal Fisher information, everything propagates correctly. This works because likelihoods from independent data sources multiply, and multiplication of likelihoods is addition of log-likelihoods. That is a monoid. The package enforces it.
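A minimal sketch of what that combination must do, with hypothetical helpers rather than the package's own functions: add the log-likelihoods and stack the Fisher information matrices block-diagonally, since independent data sources contribute no cross-information.

```r
# Hypothetical sketch of combining two independent fits (not the
# package's internals): joint log-likelihood is the sum, joint
# Fisher information is block-diagonal.
block_diag <- function(A, B) {
  out <- matrix(0, nrow(A) + nrow(B), ncol(A) + ncol(B))
  out[seq_len(nrow(A)), seq_len(ncol(A))] <- A
  out[nrow(A) + seq_len(nrow(B)), ncol(A) + seq_len(ncol(B))] <- B
  out
}

combine_mles <- function(fit1, fit2) {
  list(
    par    = c(fit1$par, fit2$par),
    loglik = fit1$loglik + fit2$loglik,       # log L1 + log L2
    fim    = block_diag(fit1$fim, fit2$fim)   # zero cross-information
  )
}
```

The monoid structure is visible here: combination is associative, and a "fit" with zero parameters and log-likelihood 0 acts as the identity.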

The Ecosystem

algebraic.mle is the foundation for a family of packages:

Package             Purpose
likelihood.model    Compositional likelihood specification
maskedcauses        Masked failure data in series systems
mdrelax             Relaxed masking conditions
algebraic.dist      Distributions as algebraic objects
flexhaz             Dynamic failure rate distributions
hypothesize         Likelihood ratio tests on MLEs
numerical.mle       Numerical optimization backends

The typical workflow:

  1. Define distributions with algebraic.dist
  2. Specify likelihood contributions with likelihood.model
  3. Fit the model and get an mle object from algebraic.mle
  4. Query statistical properties: confidence intervals, hypothesis tests, model selection

For series systems with masked data:

library(maskedcauses)
library(algebraic.mle)

# Specify masking model (C1-C2-C3 conditions)
model <- md_likelihood_model(components = 3, masking = "bernoulli")

# Fit -> returns algebraic.mle object
fit <- md_mle_exp_series_C1_C2_C3(masked_data)

# All the standard MLE methods work
confint(fit)
vcov(fit)
aic(fit)

Theory

The asymptotic properties that algebraic.mle exploits come from classical MLE theory:

$$\sqrt{n}\,(\hat{\theta}_n - \theta^{\ast}) \xrightarrow{d} \mathcal{N}\!\left(0,\; I^{-1}(\theta^{\ast})\right)$$

The expo-masked-fim paper derives closed-form Fisher information for exponential series systems. That is exactly what algebraic.mle uses internally for variance estimation in that case.
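For intuition in the simplest case, a single i.i.d. exponential sample with no masking (not the paper's series-system result), the closed form is immediate:

```latex
\ell(\lambda) = n \log \lambda - \lambda \sum_{i=1}^{n} x_i,
\qquad
\ell''(\lambda) = -\frac{n}{\lambda^2},
\qquad
I(\lambda) = \frac{n}{\lambda^2},
```

so the asymptotic variance of $\hat{\lambda} = 1/\bar{x}$ is $\lambda^2/n$. The series-system case generalizes this with masking probabilities in the mix.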

For more complex models (Weibull, relaxed masking conditions), we compute Fisher information numerically via observed information:

$$\hat{I}(\hat{\theta}) = -\left.\frac{\partial^2 \ell}{\partial \theta\,\partial \theta^{T}}\right|_{\theta=\hat{\theta}}$$
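A minimal finite-difference sketch of that computation (real code would typically delegate to numDeriv::hessian or the optimizer's Hessian):

```r
# Observed information via central finite differences of the
# log-likelihood. A sketch, not production numerics.
observed_info <- function(loglik, theta, h = 1e-4) {
  p <- length(theta)
  H <- matrix(0, p, p)
  for (i in seq_len(p)) {
    for (j in seq_len(p)) {
      ei <- ej <- numeric(p)
      ei[i] <- h
      ej[j] <- h
      H[i, j] <- (loglik(theta + ei + ej) - loglik(theta + ei - ej) -
                  loglik(theta - ei + ej) + loglik(theta - ei - ej)) /
                 (4 * h^2)
    }
  }
  -H  # I_hat is MINUS the Hessian of the log-likelihood
}
```

On a quadratic log-likelihood the central differences are exact up to rounding, which makes the helper easy to sanity-check.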

Design Principles

Separation of concerns. The likelihood specification (likelihood.model) is independent of the fitting algorithm (numerical.mle) and the result type (algebraic.mle). You can swap optimizers without changing downstream code.

Correctness by construction. Standard errors, confidence intervals, and hypothesis tests are computed from the Fisher information, not ad-hoc formulas. If your likelihood is correct, statistical inference follows automatically.

Composability. Build complex models from simpler ones. The algebra ensures properties propagate correctly.

This package directly supports the work in my master’s thesis on reliability estimation in series systems. The bootstrap confidence intervals, likelihood ratio tests, and model selection all use algebraic.mle objects.
