The maximum likelihood estimator (MLE) is a technology. Under regularity conditions, any MLE is asymptotically normal: \(\hat\theta \sim N(\theta, I(\theta)^{-1})\), where \(I(\theta)\) is the Fisher information matrix. This fact is remarkable because it holds regardless of the underlying model – exponential, Weibull, normal, or anything else. The MLE reduces the problem of inference to a mean vector and a covariance matrix.
algebraic.mle is the algebra that follows from this
fact. It does not find MLEs – it takes them as input and provides
operations for composition, transformation, and inference. Given two
independent experiments, joint() composes their MLEs into a
single joint estimator with block-diagonal covariance. Given three labs
estimating the same quantity, combine() produces the
optimal inverse-variance weighted estimate. Given a transformation \(g(\theta)\), rmap() propagates
uncertainty through the delta method. And as_dist() bridges
to the distribution algebra in algebraic.dist, where normal
distributions can be further composed.
This vignette walks through each operation with concrete examples, culminating in a full pipeline from independent experiments to system reliability inference.
The package provides three constructors for wrapping estimation
results into the MLE algebra. Each produces an object that supports the
full interface: params(), vcov(),
confint(), se(), AIC(), and the
algebraic operations.
mle()

When you know the point estimate and its variance-covariance matrix (e.g., from analytical results or published tables), construct an MLE directly.
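A minimal sketch, with illustrative numbers (an estimate of 0.02 with standard error 0.001, mirroring the lab examples later in this vignette):

fit <- mle(
  theta.hat = c(lambda = 0.02),  # point estimate (illustrative)
  sigma = matrix(0.001^2),       # its variance-covariance matrix
  nobs = 150L                    # sample size behind the estimate
)
params(fit)  # lambda = 0.02
se(fit)      # 0.001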
mle_numerical()

The most common path: maximize a log-likelihood with optim(), then wrap the result. The variance-covariance matrix comes from inverting the observed information – the negative Hessian of the log-likelihood at the optimum.
set.seed(42)
x <- rexp(80, rate = 2)
loglik <- function(rate) {
  if (rate <= 0) return(-Inf)
  sum(dexp(x, rate = rate, log = TRUE))
}
result <- optim(
  par = c(lambda = 1),
  fn = loglik,
  method = "Brent", lower = 0.01, upper = 20,
  hessian = TRUE,
  control = list(fnscale = -1)
)
fit_num <- mle_numerical(result, options = list(nobs = length(x)))
params(fit_num)
#> [1] 1.687
se(fit_num)
#> [1] 0.1886

mle_boot()

When the sample is small or regularity conditions are questionable, bootstrap the MLE and wrap the boot object:
set.seed(42)
y <- rexp(30, rate = 2)
boot_result <- boot::boot(
  data = y,
  statistic = function(d, i) c(lambda = 1 / mean(d[i])),
  R = 999
)
fit_boot <- mle_boot(boot_result)
params(fit_boot)
#> lambda
#> 2.019
se(fit_boot)
#> [1] 0.4629
confint(fit_boot, type = "perc")
#> 2.5% 97.5%
#> lambda 1.367 3.175

When independent experiments measure different parameters of a system, joint() composes their MLEs into a single joint estimator. The result has a block-diagonal variance-covariance matrix because the experiments are independent – there is no cross-covariance between their parameter estimates.
Motivating example: A series system has two components. Lab A tests component 1 and estimates its failure rate \(\lambda_1\). Lab B tests component 2 and estimates \(\lambda_2\). Neither lab knows about the other component, but we need the joint parameter vector \((\lambda_1, \lambda_2)\) for system-level inference.
# Lab A: component 1 failure rate
fit_A <- mle(
  theta.hat = c(lambda1 = 0.02),
  sigma = matrix(0.001^2),
  nobs = 150L
)
# Lab B: component 2 failure rate
fit_B <- mle(
  theta.hat = c(lambda2 = 0.05),
  sigma = matrix(0.003^2),
  nobs = 80L
)
# Joint MLE: block-diagonal covariance
fit_joint <- joint(fit_A, fit_B)
params(fit_joint)
#> lambda1 lambda2
#> 0.02 0.05
vcov(fit_joint)
#> [,1] [,2]
#> [1,] 1e-06 0e+00
#> [2,] 0e+00 9e-06

The off-diagonal entries are zero – the hallmark of independence. The joint MLE supports the full interface:
confint(fit_joint)
#> 2.5% 97.5%
#> lambda1 0.01804 0.02196
#> lambda2 0.04412 0.05588
se(fit_joint)
#> [1] 0.001 0.003

Use marginal() to recover individual component estimates from the joint:
# Recover component 2 parameters
fit_B_recovered <- marginal(fit_joint, 2)
params(fit_B_recovered)
#> lambda2
#> 0.05
se(fit_B_recovered)
#> [1] 0.003

joint() extends to any number of independent MLEs.
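For example, with a hypothetical third lab fit_C (a sketch; the numbers are illustrative):

fit_C <- mle(theta.hat = c(lambda3 = 0.01), sigma = matrix(0.0005^2), nobs = 60L)
fit_all <- joint(fit_A, fit_B, fit_C)
params(fit_all)  # lambda1 = 0.02, lambda2 = 0.05, lambda3 = 0.01
vcov(fit_all)    # 3x3 block-diagonal matrix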
When multiple independent experiments estimate the
same parameter, combine() produces the
optimal estimator via inverse-variance weighting. The combined estimate
weights more precise estimates more heavily, and the combined variance
is always less than any individual variance.
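In the scalar case, with estimates \(\hat\theta_i\) and variances \(\sigma_i^2\), the combined estimator is \[ \hat\theta = \frac{\sum_i \hat\theta_i/\sigma_i^2}{\sum_i 1/\sigma_i^2}, \qquad \text{Var}(\hat\theta) = \Big(\sum_i 1/\sigma_i^2\Big)^{-1}. \]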
Motivating example: Three labs independently estimate the failure rate \(\lambda\) of the same component type.
lab1 <- mle(theta.hat = c(lambda = 0.050), sigma = matrix(0.004^2), nobs = 50L)
lab2 <- mle(theta.hat = c(lambda = 0.048), sigma = matrix(0.002^2), nobs = 200L)
lab3 <- mle(theta.hat = c(lambda = 0.053), sigma = matrix(0.003^2), nobs = 100L)
combined <- combine(lab1, lab2, lab3)
params(combined)
#> lambda
#> 0.04961
se(combined)
#> [1] 0.001536

The combined standard error is smaller than any individual standard error:
cat("Lab SEs: ", se(lab1), se(lab2), se(lab3), "\n")
#> Lab SEs: 0.004 0.002 0.003
cat("Combined SE: ", se(combined), "\n")
#> Combined SE: 0.001536

The combined estimate is pulled toward the more precise estimates. Lab 2 has the smallest variance, so it receives the most weight.
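The weighting is easy to verify by hand (a sketch, using the three lab variances above):

w <- 1 / c(0.004, 0.002, 0.003)^2          # inverse-variance weights
sum(w * c(0.050, 0.048, 0.053)) / sum(w)   # 0.04961, matches params(combined)
sqrt(1 / sum(w))                           # 0.001536, matches se(combined)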
When all estimates have equal precision, combine() reduces to the simple average.
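A sketch with two equally precise, hypothetical estimates:

eq1 <- mle(theta.hat = c(lambda = 0.050), sigma = matrix(0.003^2), nobs = 100L)
eq2 <- mle(theta.hat = c(lambda = 0.056), sigma = matrix(0.003^2), nobs = 100L)
params(combine(eq1, eq2))  # (0.050 + 0.056) / 2 = 0.053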
By the invariance property of the MLE, if \(\hat\theta\) is the MLE of \(\theta\), then \(g(\hat\theta)\) is the MLE of \(g(\theta)\) for any function \(g\). The rmap() function
computes this transformation and propagates uncertainty using the delta
method: \[
\text{Var}(g(\hat\theta)) \approx J \, \text{Var}(\hat\theta) \, J^\top,
\] where \(J\) is the Jacobian
of \(g\) evaluated at \(\hat\theta\).
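For a scalar transformation such as \(g(\lambda) = 1/\lambda\), the Jacobian is \(J = -1/\lambda^2\), so the delta-method standard error is \(\text{se}(\hat\lambda)/\hat\lambda^2\). In the example below this gives \(0.001/0.02^2 = 2.5\), exactly what rmap() reports.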
An exponential component has rate \(\lambda\). The mean lifetime is \(\mu = 1/\lambda\). Transform the MLE:
fit_rate <- mle(
  theta.hat = c(lambda = 0.02),
  sigma = matrix(0.001^2),
  nobs = 150L
)
# g: rate -> mean lifetime
fit_lifetime <- rmap(fit_rate,
                     function(theta) c(mean_life = 1 / theta[1]),
                     method = "delta")
params(fit_lifetime)
#> mean_life
#> 50
se(fit_lifetime)
#> [1] 2.5
confint(fit_lifetime)
#> 2.5% 97.5%
#> mean_life 45.1 54.9

For a series system with independent exponential components, the system reliability at time \(t\) is \[ R(t) = \exp(-\lambda_1 t) \cdot \exp(-\lambda_2 t) = \exp(-(\lambda_1 + \lambda_2) t). \]
Starting from the joint MLE of \((\lambda_1, \lambda_2)\):
fit_rates <- joint(
  mle(theta.hat = c(lambda1 = 0.02), sigma = matrix(0.001^2), nobs = 150L),
  mle(theta.hat = c(lambda2 = 0.05), sigma = matrix(0.003^2), nobs = 80L)
)
# System reliability at t = 10 hours
t0 <- 10
R_mle <- rmap(fit_rates,
              function(theta) c(R_t = exp(-(theta[1] + theta[2]) * t0)),
              method = "delta"
)
params(R_mle)
#> R_t
#> 0.4966
se(R_mle)
#> [1] 0.0157
confint(R_mle)
#> 2.5% 97.5%
#> R_t 0.4658 0.5274

The delta method automatically computes the Jacobian via numerical differentiation (using numDeriv::jacobian) and propagates the covariance.
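The calculation is easy to reproduce by hand: each partial derivative of \(R(t)\) is \(-t\,R(t)\). A sketch, reusing fit_rates and t0 from above:

theta_hat <- params(fit_rates)
V <- vcov(fit_rates)
R_hat <- exp(-sum(theta_hat) * t0)
J <- matrix(-t0 * R_hat, nrow = 1, ncol = 2)  # dR/dlambda_i = -t * R(t)
sqrt(J %*% V %*% t(J))  # ~0.0157, matching se(R_mle)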
as_dist() converts an MLE to its asymptotic normal (or
multivariate normal) distribution, bridging into the
algebraic.dist package where distributions can be further
composed.
fit1 <- mle(theta.hat = c(lambda = 0.02), sigma = matrix(0.001^2))
d1 <- as_dist(fit1)
d1
#> Normal distribution (mu = 0.02, var = 1e-06)

This is useful when you want to reason about the sampling distribution of the MLE as a distribution object. For instance, you can compute the probability that the true rate exceeds a regulatory threshold.
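A sketch, using the cdf() accessor from algebraic.dist (the same accessor used in the pipeline below) and a hypothetical threshold of 0.022:

1 - cdf(d1)(0.022)  # 1 - pnorm(2) = 0.0228 under N(0.02, 0.001^2)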
For bootstrap MLEs, as_dist() returns an empirical distribution built from the bootstrap replicates.
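A sketch, reusing fit_boot from above:

d_boot <- as_dist(fit_boot)  # empirical distribution over the 999 replicates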
Here is an end-to-end example tying together all the algebraic operations.
Scenario: A series system has two independent components with exponential lifetimes. Separate accelerated life tests estimate each component’s failure rate. We want to infer the system reliability at a mission time of \(t = 50\) hours, including a confidence interval.
set.seed(123)
# --- Step 1: Independent MLEs from separate experiments ---
# Component 1: 200 observed lifetimes
x1 <- rexp(200, rate = 0.003)
loglik1 <- function(rate) {
  if (rate <= 0) return(-Inf)
  sum(dexp(x1, rate = rate, log = TRUE))
}
fit1 <- mle_numerical(
  optim(par = c(lambda1 = 0.001), fn = loglik1,
        method = "Brent", lower = 1e-6, upper = 1,
        hessian = TRUE, control = list(fnscale = -1)),
  options = list(nobs = length(x1))
)
# Component 2: 120 observed lifetimes
x2 <- rexp(120, rate = 0.008)
loglik2 <- function(rate) {
  if (rate <= 0) return(-Inf)
  sum(dexp(x2, rate = rate, log = TRUE))
}
fit2 <- mle_numerical(
  optim(par = c(lambda2 = 0.001), fn = loglik2,
        method = "Brent", lower = 1e-6, upper = 1,
        hessian = TRUE, control = list(fnscale = -1)),
  options = list(nobs = length(x2))
)
cat("Component 1 rate:", params(fit1), " SE:", se(fit1), "\n")
#> Component 1 rate: 0.002978 SE: 0.0001827
cat("Component 2 rate:", params(fit2), " SE:", se(fit2), "\n")
#> Component 2 rate: 0.007983 SE: 0.0007171

# --- Step 2: Compose into joint parameter space ---
fit_system <- joint(fit1, fit2)
cat("Joint parameters:", params(fit_system), "\n")
#> Joint parameters: 0.002978 0.007983
cat("Joint vcov:\n")
#> Joint vcov:
vcov(fit_system)
#> [,1] [,2]
#> [1,] 3.336e-08 0.000e+00
#> [2,] 0.000e+00 5.142e-07

# --- Step 3: Transform to system reliability at t = 50 ---
mission_time <- 50
R_system <- rmap(fit_system,
                 function(theta) c(R_sys = exp(-(theta[1] + theta[2]) * mission_time)),
                 method = "delta"
)
cat("System reliability R(50):", params(R_system), "\n")
#> System reliability R(50): 0.5781
cat("SE of R(50): ", se(R_system), "\n")
#> SE of R(50): 0.02139

# --- Step 4: Inference ---
cat("95% CI for R(50):\n")
#> 95% CI for R(50):
confint(R_system)
#> 2.5% 97.5%
#> R_sys 0.5361 0.62

# --- Step 5: Bridge to distribution algebra ---
R_dist <- as_dist(R_system)
cat("Asymptotic distribution of R(50):", "\n")
#> Asymptotic distribution of R(50):
R_dist
#> Normal distribution (mu = 0.578067, var = 0.000457439)
# Probability that system reliability exceeds 55%
cat("P(R(50) > 0.55):", 1 - cdf(R_dist)(0.55), "\n")
#> P(R(50) > 0.55): 0.9053

The pipeline reads as a chain of algebraic operations: two independent MLEs are composed via joint(), transformed to the quantity of interest via rmap(), and then reasoned about as a distribution via as_dist(). Each step preserves the uncertainty structure inherited from the original experiments.