Getting Started with algebraic.mle
This tutorial walks you through fitting models using maximum likelihood estimation and performing inference with the algebraic.mle package.
Installation
Install the development version from GitHub:
if (!require(devtools)) {
install.packages("devtools")
}
devtools::install_github("queelius/algebraic.mle")
You may also want the companion packages:
devtools::install_github("queelius/algebraic.dist")
devtools::install_github("queelius/likelihood.model")
Tutorial 1: Fitting a Conditional Exponential Model
Let’s fit a conditional exponential model where the rate depends on a covariate.
The Model
We’ll model:
where the rate function is:
Step 1: Generate Sample Data
First, let’s create a data generating process (DGP):
library(algebraic.mle)
set.seed(1231)
# True parameters
b0 <- -0.1
b1 <- 0.5
# DGP: Exponential + small Gaussian noise
dgp <- function(n, x) {
rate <- exp(b0 + b1 * x)
X <- rexp(n, rate)
W <- rnorm(n, 0, 1e-3) # Measurement noise
return(X + W)
}
# Generate observations
n <- 75
df <- data.frame(x = runif(n, -10, 10))
df$y <- sapply(df$x, function(xi) dgp(1, xi))
Step 2: Define the Log-Likelihood
# Response extractor
resp <- function(df) df$y
# Rate function (parameterized)
rate <- function(df, beta) exp(beta[1] + beta[2] * df$x)
# Log-likelihood function
loglik <- function(df, resp, rate) {
function(beta) {
sum(dexp(x = resp(df), rate = rate(df, beta), log = TRUE))
}
}
Step 3: Fit the Model
# Initial parameter guess
par0 <- c(b0 = 0, b1 = 0)
# Fit using optim, wrap in mle object
fit <- mle_numerical(optim(
par = par0,
fn = loglik(df, resp, rate),
control = list(fnscale = -1), # Maximize
hessian = TRUE
))
# View results
summary(fit)
Step 4: Inference
# Parameter estimates
params(fit)
# Confidence intervals
confint(fit)
# Standard errors
se(fit)
Step 5: Visualize the Fit
# Plot data
plot(df$x, df$y, xlab = "x", ylab = "y", main = "Conditional Exponential Fit")
# Add fitted conditional mean: E[Y|x] = 1/rate(x)
x_seq <- seq(-10, 10, 0.1)
b0_hat <- params(fit)[1]
b1_hat <- params(fit)[2]
y_hat <- 1 / exp(b0_hat + b1_hat * x_seq)
lines(x_seq, y_hat, col = "blue", lwd = 2)
# Add true conditional mean for comparison
y_true <- 1 / exp(b0 + b1 * x_seq)
lines(x_seq, y_true, col = "green", lwd = 2, lty = 2)
legend("topright", legend = c("Fitted", "True"),
col = c("blue", "green"), lwd = 2, lty = c(1, 2))
Tutorial 2: Model Comparison with Likelihood Ratio Tests
A powerful feature of MLE is the ability to compare nested models.
Testing a Restricted Model
Let’s test the hypothesis that β₀ = 0 (no intercept):
# Null model: rate(x) = exp(b1 * x)
rate_null <- function(df, b1) exp(b1 * df$x)
fit_null <- mle_numerical(optim(
par = 0,
fn = loglik(df, resp, rate_null),
control = list(fnscale = -1),
hessian = TRUE,
method = "BFGS"
))
# Likelihood ratio test
lrt_stat <- -2 * (loglik_val(fit_null) - loglik_val(fit))
p_value <- pchisq(lrt_stat, df = 1, lower.tail = FALSE)
cat("LRT statistic:", lrt_stat, "\n")
cat("p-value:", p_value, "\n")
Model Selection with AIC
cat("Full model AIC:", aic(fit), "\n")
cat("Null model AIC:", aic(fit_null), "\n")
# Lower AIC is better
if (aic(fit) < aic(fit_null)) {
cat("Full model is preferred\n")
} else {
cat("Null model is preferred\n")
}
Tutorial 3: Bootstrap Confidence Intervals
When the asymptotic normal approximation may not hold (small samples, boundary parameters), use bootstrap:
library(boot)
# Bootstrap fitting function
boot_fn <- function(data, indices) {
d <- data[indices, ]
fit <- mle_numerical(optim(
par = c(0, 0),
fn = loglik(d, resp, rate),
control = list(fnscale = -1),
hessian = TRUE
))
params(fit)
}
# Run bootstrap
boot_results <- boot(df, boot_fn, R = 1000)
# Bootstrap confidence intervals
boot.ci(boot_results, type = c("perc", "bca"), index = 1) # For b0
boot.ci(boot_results, type = c("perc", "bca"), index = 2) # For b1
Next Steps
- Explore the Examples for more use cases
- Read the API Documentation for complete function reference
- See likelihood.model for building complex likelihood specifications