algebraic.mle: MLEs as Algebraic Objects

May 15, 2021

Maximum likelihood estimators have rich mathematical structure. They are consistent, asymptotically normal, efficient. algebraic.mle exposes this structure through an algebra where MLEs are objects you compose, transform, and query.

The Abstraction

An MLE is not just a vector of parameter estimates. It is a statistical object that carries point estimates \(\hat{\theta}\), the Fisher information matrix \(I(\hat{\theta})\), the variance-covariance matrix \(I^{-1}(\hat{\theta})\), Wald-type confidence intervals from asymptotic normality, the log-likelihood value, and convergence diagnostics.

The package wraps all of this in a consistent interface:

library(algebraic.mle)

fit <- mle(likelihood_model, data)
coef(fit)           # Parameter estimates
vcov(fit)           # Variance-covariance matrix
confint(fit)        # Confidence intervals
logLik(fit)         # Log-likelihood
aic(fit)            # Model selection

Composition

The real point is that MLEs compose. Independent models combine:

fit1 <- mle(model1, data1)
fit2 <- mle(model2, data2)
combined <- fit1 + fit2  # Joint likelihood

The package handles the algebra. Joint log-likelihood, block-diagonal Fisher information, everything propagates correctly. This works because likelihoods from independent data sources multiply, and multiplication of likelihoods is addition of log-likelihoods. That is a monoid. The package enforces it.

The Ecosystem

algebraic.mle is the foundation for a family of packages:

PackagePurpose
likelihood.modelCompositional likelihood specification
maskedcausesMasked failure data in series systems
mdrelaxRelaxed masking conditions
algebraic.distDistributions as algebraic objects
flexhazDynamic failure rate distributions
hypothesizeLikelihood ratio tests on MLEs
numerical.mleNumerical optimization backends

The typical workflow:

  1. Define distributions with algebraic.dist
  2. Specify likelihood contributions with likelihood.model
  3. Fit the model and get an mle object from algebraic.mle
  4. Query statistical properties: confidence intervals, hypothesis tests, model selection

For series systems with masked data:

library(maskedcauses)
library(algebraic.mle)

# Specify masking model (C1-C2-C3 conditions)
model <- md_likelihood_model(components = 3, masking = "bernoulli")

# Fit -> returns algebraic.mle object
fit <- md_mle_exp_series_C1_C2_C3(masked_data)

# All the standard MLE methods work
confint(fit)
vcov(fit)
aic(fit)

Theory

The asymptotic properties that algebraic.mle exploits come from classical MLE theory:

$$\sqrt{n}(\hat{\theta}_n - \theta^{\ast}) \xrightarrow{d} \mathcal{N}(0, I^{-1}(\theta^{\ast}))$$

The expo-masked-fim paper derives closed-form Fisher information for exponential series systems. That is exactly what algebraic.mle uses internally for variance estimation in that case.

For more complex models (Weibull, relaxed masking conditions), we compute Fisher information numerically via observed information:

$$\hat{I}(\hat{\theta}) = -\frac{\partial^2 \ell}{\partial \theta \partial \theta^T}\bigg|_{\theta=\hat{\theta}}$$

Design Principles

Separation of concerns. The likelihood specification (likelihood.model) is independent of the fitting algorithm (numerical.mle) and the result type (algebraic.mle). You can swap optimizers without changing downstream code.

Read More

algebraic.dist: Distributions as Algebraic Objects in R

February 1, 2021

Most statistical software treats probability distributions as parameter sets you pass to sampling or density functions. algebraic.dist takes a different approach. Distributions are algebraic objects that compose, transform, and combine through standard mathematical operations.

The Idea

Instead of this:

x <- rnorm(1000, mean=5, sd=2)
y <- rnorm(1000, mean=3, sd=1)
z <- x + y  # Just numeric vectors

You write:

X <- Normal(mean=5, sd=2)
Y <- Normal(mean=3, sd=1)
Z <- X + Y  # A new distribution object!
sample(Z, 1000)

The sum Z knows it is Normal(mean=8, sd=sqrt(5)) because the algebra works it out. You never lost the distributional structure.

Why It Matters

When you add two normal distributions numerically, you get samples from the sum. But you lose the distribution. With algebraic.dist, the result is still a distribution object with proper parameters and you can keep composing.

You can build complex distributional expressions and simplify them algebraically before ever drawing a sample:

portfolio <- 0.6*StockA + 0.4*StockB
risk <- sd(portfolio)  # Computed symbolically

For distributions with known closed-form algebra (normal, exponential, certain mixtures), you do not need simulation. You just compute the exact answer. Monte Carlo without the Monte Carlo.

Composition

This is functional programming applied to probability theory. Distributions become composable building blocks:

  • Mixture models: 0.3*Normal(0,1) + 0.7*Normal(5,2)
  • Transformed distributions: exp(Normal(0,1)) is lognormal
  • Conditional distributions: X | (X > 0) for truncation

The idea is that computation should mirror mathematical structure. If the math says you can add two normals and get a normal, the code should do the same thing and give you a normal back, not a vector of samples.

This connects to a broader theme in my work. Just as my oblivious computing research uses type theory to enforce privacy invariants, algebraic.dist uses algebraic types to enforce distributional invariants. The algebra tells you what operations are valid and what the results mean.

Implementation

  • Language: R
  • Type system: S3 classes with method dispatch for operations
  • Closed-form operations: Normal, exponential, gamma families
  • Fallback: Monte Carlo for compositions without closed forms
  • Repository: github.com/queelius/algebraic.dist
  • algebraic.mle: Maximum likelihood estimation with algebraic specification
  • numerical.mle: Numerical optimization for MLE when closed forms do not exist
  • likelihood.model: Likelihood-based inference with compositional model building

Most statistical software is imperative. You tell it what to do step by step. algebraic.dist is declarative. You describe the distributional relationships and the computer figures out what to compute. Small composable pieces that do one thing well: preserve distributional structure through transformations.