Introduction
One of the most important quantities in reliability theory and survival analysis is the failure rate, often denoted by $\lambda$. For a system with a random lifetime $T$, its failure rate over an interval $(t, t + \Delta t)$ is the probability of failure in that interval, given that the system survived up to time $t$, divided by the length of the interval $\Delta t$. That is,
$$\lambda = \frac{\Pr(T \leq t + \Delta t \mid T > t)}{\Delta t}.$$
For instance, if a system has a failure rate of $\lambda$ over the interval $(t, t + \Delta t)$, then the probability that it fails over this interval, given that it has survived up to time $t$, is $\lambda \Delta t$. Of course, for most systems the failure rate changes over time. For example, a system may be more likely to fail when it is new or when it is old.
In the limit as $\Delta t \to 0$, we obtain what is known as the instantaneous failure rate, or the hazard function,
$$h(t) = \lim_{\Delta t \to 0} \frac{\Pr\{T \leq t + \Delta t \mid T > t\}}{\Delta t}.$$
We can rewrite the above expression as
$$h(t) = \lim_{\Delta t \to 0} \frac{\Pr\{t < T \leq t + \Delta t\}}{\Pr\{T > t\}\,\Delta t} = \lim_{\Delta t \to 0} \left(\frac{F(t + \Delta t) - F(t)}{\Delta t}\right) \frac{1}{1 - F(t)} = \frac{f(t)}{S(t)},$$
where $F$ is the cdf of $T$, $f$ is its pdf, and $S(t) = 1 - F(t)$ is the survival function.
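As a quick numerical sanity check of $h(t) = f(t)/S(t)$, consider the exponential distribution, whose hazard works out to a constant. This sketch uses base R's `dexp` and `pexp`:

```r
# Hazard of an exponential distribution computed as h(t) = f(t) / S(t).
lambda <- 2
t <- c(0.1, 0.5, 1, 3)
f <- dexp(t, rate = lambda)      # pdf f(t)
S <- 1 - pexp(t, rate = lambda)  # survival S(t) = 1 - F(t)
h <- f / S
print(h)  # constant hazard, equal to lambda at every t
```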
The concept of the failure rate or hazard function is the basis of many important lifetime models in survival analysis and reliability theory. In what follows, we consider the dynamic failure rate (DFR) family of distributions, whose hazard can be a function of both time and other predictors or covariates, i.e., $h(t, x)$ may change with respect to both $t$ and the covariate vector $x$.
Dynamic failure rate (DFR) distribution
Now, we consider the case where the failure rate is not constant. In fact, we generalize it to be a function of both time and any other predictors, say $t$ and $x$, where $x$ is a vector of covariates $(x_1, \ldots, x_p)$. We denote this hazard function by $h(t, x)$.
First, we define the cumulative hazard function,
$$H(t, x) = \int_0^t h(u, x)\, du,$$
from which the survival function can be recovered as $S(t \mid x) = \exp(-H(t, x))$.
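To illustrate the cumulative hazard, here is a sketch using the (purely illustrative) hazard $h(t) = 2t$, for which $H(t) = t^2$ in closed form; the numerical integral and the implied survival function are compared against the exact values:

```r
# Cumulative hazard H(t) = integral of h(u) du from 0 to t,
# illustrated with the hazard h(t) = 2t, so H(t) = t^2.
h <- function(u) 2 * u
t <- 1.5
H.num <- integrate(h, lower = 0, upper = t)$value
H.exact <- t^2
S <- exp(-H.num)  # survival function S(t) = exp(-H(t))
print(c(H.num, H.exact, S))
```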
There is a well-known relationship between the survival function, the hazard function, and the pdf,
$$h(t, x) = \frac{f(t \mid x)}{S(t \mid x)}.$$
Common failure rate distributions
In what follows, we show how two well-known distributions, the exponential and the Weibull, are defined in this framework.
Exponential distribution: $h(t, x) = h(t) = \lambda$.
Weibull distribution: $h(t, x) = h(t) = \frac{k}{\lambda} \left(\frac{t}{\lambda}\right)^{k-1}$.
Of course, we can parameterize any family of distributions in terms of the hazard function. Sometimes, the parameterization yields an analytical form for the cumulative hazard function, which is useful in computing the other related distribution functions, like the survival and pdf.
Even simple hazard functions, when combined with other predictors, can yield complex distribution functions. But first, let's consider a simpler case where the hazard function depends only on some other predictor $x$,
$$h(t, x) = h(x) = \exp(\beta_0 + \beta_1 x).$$
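Since this hazard is constant in $t$ for any fixed covariate value, each value of $x$ yields an exponential lifetime with rate $\exp(\beta_0 + \beta_1 x)$. A minimal sketch, with illustrative coefficients:

```r
# When h(t, x) = exp(b0 + b1 * x) does not depend on t, the lifetime
# given x is exponential with rate exp(b0 + b1 * x).
b0 <- -1; b1 <- 0.5  # illustrative coefficients
x <- 2
rate <- exp(b0 + b1 * x)
t <- 1.2
S <- exp(-rate * t)                # S(t | x) = exp(-H(t, x)), H = rate * t
S.exp <- 1 - pexp(t, rate = rate)  # exponential survival, for comparison
print(c(S, S.exp))
```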
Things become more complicated if we have a hazard function in which $t$ and $x$ interact, e.g.,
$$h(t, x) = \exp(\beta_0 + \beta_1 t + \beta_2 x + \beta_{12} x t).$$
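Even here the cumulative hazard has a closed form, since for fixed $x$ the exponent is linear in $t$. A sketch comparing the closed form with numerical integration (all coefficients are illustrative):

```r
# Cumulative hazard for h(t, x) = exp(b0 + b1*t + b2*x + b12*x*t):
# for fixed x the integrand is exp(a + growth * u) with a = b0 + b2*x
# and growth = b1 + b12*x, so H(t, x) = exp(a) * (exp(growth*t) - 1) / growth.
b0 <- -1; b1 <- 0.3; b2 <- 0.5; b12 <- -0.2  # illustrative coefficients
x <- 1; t <- 2
a <- b0 + b2 * x
growth <- b1 + b12 * x
H.exact <- exp(a) * (exp(growth * t) - 1) / growth
H.num <- integrate(function(u) exp(b0 + b1 * u + b2 * x + b12 * x * u),
                   lower = 0, upper = t)$value
print(c(H.exact, H.num))
```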
Likelihood function and maximum likelihood estimation
The likelihood function is a sufficient statistic for the parameters of the distribution. The likelihood contribution for an observation of an exact failure time is given by
$$\ell_i(\theta \mid t_i, x_i) = f(t_i \mid x_i, \theta).$$
If we have a sample of $n$ independent exact failure times, the total likelihood function over these observations is given by
$$\ell(\theta \mid t, X) = \prod_{i=1}^n \ell_i(\theta \mid t_i, X_{i:}),$$
where $X_{i:}$ denotes the $i$-th row of the covariate matrix $X$.
It is usually more convenient to work with the log-likelihood function, so we define the log-likelihood contribution as $L_i = \log \ell_i$. We assume the standard regularity conditions for maximum likelihood estimation hold:
- The support of the distribution does not depend on the parameter θ.
- The parameter space is open.
- The gradient of the log-likelihood function with respect to θ exists and is continuous.
- The expectation of the gradient of the log-likelihood function with respect to θ is zero.
For our DFR (dynamic failure rate) model, the $i$-th likelihood contribution for an exact failure time is given by
$$\ell_i = h(t_i, x_i) S(t_i \mid x_i) = h(t_i, x_i) \exp(-H(t_i, x_i)),$$
so the log-likelihood contribution is $L_i = \log h(t_i, x_i) - H(t_i, x_i)$.
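For the constant hazard $h(t) = \lambda$, the contribution reduces to $L_i = \log \lambda - \lambda t_i$, and maximizing the total log-likelihood recovers the familiar exponential MLE $\hat{\lambda} = n / \sum_i t_i$. A quick sketch on simulated data:

```r
# Log-likelihood L_i = log h(t_i) - H(t_i) for the exponential hazard
# h(t) = lambda, so L_i = log(lambda) - lambda * t_i.
set.seed(42)
t <- rexp(100, rate = 2)  # simulated exact failure times
loglik <- function(lambda) sum(log(lambda) - lambda * t)
fit <- optimize(loglik, interval = c(1e-6, 10), maximum = TRUE)
mle.closed <- length(t) / sum(t)  # closed-form MLE for the exponential
print(c(fit$maximum, mle.closed))
```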
Score of the likelihood function
The score of the $i$-th observation for an exact failure time is just the gradient of the $i$-th log-likelihood contribution,
$$s_i = \nabla_\theta \log f(t_i \mid x_i).$$
Let's rewrite this to get it into a nicer form. Substituting in the definition of the log-likelihood contribution,
$$s_i = \nabla_\theta \log h(t_i, x_i) - \nabla_\theta H(t_i, x_i).$$
The Hessian of the log-likelihood function, or the Jacobian of the score, with respect to $\theta$ is given by
$$\nabla_\theta s_i = \nabla_\theta^2 \log h(t_i, x_i) - \nabla_\theta^2 H(t_i, x_i) = \frac{\nabla_\theta^2 h(t_i, x_i)}{h(t_i, x_i)} - \frac{\nabla_\theta h(t_i, x_i)\, \nabla_\theta h(t_i, x_i)^\top}{h(t_i, x_i)^2} - \nabla_\theta^2 H(t_i, x_i).$$
This is an important result for maximum likelihood estimation, since it allows us to evaluate the variance-covariance matrix of the sampling distribution of the MLE, in addition to other important quantities such as the MLE's sensitivity to sampling error. In practice, however, we usually compute the Hessian numerically, and many well-known algorithms do this for us, particularly quasi-Newton methods that efficiently approximate it while solving for the root of the score function.
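As an illustration, for an exponential hazard $h(t) = \lambda$ the log-likelihood is $\sum_i (\log \lambda - \lambda t_i)$, whose Hessian at the MLE is $-n/\hat{\lambda}^2$. A sketch using `stats::optimHess` to recover it numerically on simulated data:

```r
# Numerical Hessian of the exponential log-likelihood at the MLE.
# Analytically, d^2/d lambda^2 of sum(log(lambda) - lambda * t) = -n / lambda^2.
set.seed(42)
t <- rexp(100, rate = 2)
loglik <- function(lambda) sum(log(lambda) - lambda * t)
lambda.hat <- length(t) / sum(t)          # closed-form MLE
H.num <- optimHess(lambda.hat, fn = loglik)
H.exact <- -length(t) / lambda.hat^2
print(c(H.num, H.exact))
```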
Simulation
We construct a data frame showing the mean and variance of the distribution computed three ways: using the `mean` and `vcov` methods (which agree with the analytical formulae for the exponential), using `expectation` (defined for all `dist` objects using Monte Carlo integration), and using the sample mean and sample variance of simulated data.
Installation
We load the R package dfr.dist, which contains all of the relevant functionality in this section, with:
```r
if (!require(dfr.dist)) {
  devtools::install_github("queelius/dfr_dist")
}
#> Loading required package: dfr.dist
library(dfr.dist)
```
```r
library(algebraic.dist)
#>
#> Attaching package: 'algebraic.dist'
#> The following object is masked from 'package:grDevices':
#>
#>     pdf
```

Next, we construct an exponential distribution in the DFR framework by specifying a constant hazard:

```r
expdist <- dfr_dist(rate = function(t, par, ...) par, par = 1)
```
```r
is_dfr_dist(expdist)
#> [1] TRUE
params(expdist, 100)
#> [1] 1
params(expdist)
#> [1] 1
```
```r
h <- hazard(expdist)
h
#> function (t, par = NULL, ...)
#> {
#>     par <- params(x, par)
#>     x$rate(t, par, ...)
#> }
#> <bytecode: 0x5625ae18d690>
#> <environment: 0x5625ae18ce78>
```

```r
n <- 1000
sup(expdist)
```
```r
mu.hat <- algebraic.dist::expectation(
  x = expdist,
  g = function(x) x, n = n)$value
(var.hat <- expectation(expdist,
  function(x) (x - mu.hat)^2, n = n)$value)

# simulate a sample from the exponential (rate 1) that expdist represents
data <- rexp(n, rate = 1)

df <- data.frame(
  analytical = c(mean(expdist), vcov(expdist)),
  monte.carlo = c(mu.hat, var.hat),
  sample = c(mean(data), var(data))
)
row.names(df) <- c("mean", "variance")
print(df)
```