Before diving into this vignette, we recommend reading the vignette Introduction to LaMa.
The regular HMM formulation relies on a key assumption to be applicable, namely that the data are observed at regular, equidistant time points, such that the transition probabilities can be interpreted meaningfully w.r.t. a specific time unit. If this is not the case, the model should account for this by building on a mathematical formulation in continuous time. The obvious choice here is to retain most of the HMM model formulation, but replace the unobserved discrete-time Markov chain with a continuous-time Markov chain. However, it is important to note that the so-called snapshot property needs to be fulfilled, i.e. the observed process at time $t$ may only depend on the state at that time instant and not on the interval since the previous observation. For more details see Glennie et al. (2023).
A continuous-time Markov chain is characterised by a so-called (infinitesimal) generator matrix
$$Q = \begin{pmatrix} q_{11} & q_{12} & \cdots & q_{1N} \\ q_{21} & q_{22} & \cdots & q_{2N} \\ \vdots & \vdots & \ddots & \vdots \\ q_{N1} & q_{N2} & \cdots & q_{NN} \end{pmatrix},$$
where the diagonal entries are $q_{ii} = -\sum_{j \neq i} q_{ij}$ and $q_{ij} \geq 0$ for $i \neq j$. This matrix can be interpreted as the derivative of the transition probability matrix and completely describes the dynamics of the state process. The time spent in state $i$ is exponentially distributed with rate $-q_{ii}$ and, conditional on leaving the state, the probability to transition to state $j \neq i$ is $\omega_{ij} = q_{ij} / (-q_{ii})$. For a more detailed introduction see Dobrow (2016) (pp. 265 f.). For observation times $t_1$ and $t_2$, we can then obtain the transition probability matrix between these points via the identity
$$\Gamma(t_1, t_2) = \exp\bigl(Q (t_2 - t_1)\bigr),$$
where $\exp()$ is the matrix exponential. This follows from the so-called Kolmogorov forward equations, but for more details see Dobrow (2016).
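To make these identities concrete, here is a small numerical sketch. The generator values are arbitrary assumptions, and the matrix exponential is computed with the expm package here, which is an assumption of this sketch only (LaMa provides its own routine, used later).

library(expm) # for the matrix exponential (assumption of this sketch)

# illustrative 2-state generator matrix (values are arbitrary)
Q = matrix(c(-0.5, 0.5,
              1.0, -1.0), nrow = 2, byrow = TRUE)
Gamma = expm(Q * 2) # transition probability matrix over an interval of length 2
rowSums(Gamma) # each row sums to one, as required for a t.p.m.
Q[1, 2] / -Q[1, 1] # omega_12: probability of moving to state 2 when leaving state 1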
We start by setting parameters to simulate data. In this example, state 1 has the smaller rate: its dwell times follow an Exp(0.5) distribution, i.e. state 1 exhibits longer dwell times than state 2, whose rate is 1.
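The following chunk defines the quantities used in the simulation below. The generator matrix matches the rates just described, while the state-dependent means and standard deviations are illustrative assumptions, as their exact values are not specified in the text.

# generator matrix with the rates described above
Q = matrix(c(-0.5, 0.5,
              1.0, -1.0), nrow = 2, byrow = TRUE)
# state-dependent normal means and sds (illustrative assumptions)
mu = c(5, 15)
sigma = c(2, 3)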
We simulate the continuous-time Markov chain by drawing the exponentially distributed state dwell times. Within each stay, we can assume whatever structure we like for the observation times, as these are not explicitly modelled. Here we choose to generate them from a Poisson process with rate $\lambda = 1$, but this choice is arbitrary. For more details on Poisson point processes, see the MM(M)PP vignette.
set.seed(123)

k = 200 # number of state switches
trans_times = s = rep(NA, k) # time points at which the chain transitions
s[1] = sample(1:2, 1) # initial distribution c(0.5, 0.5)
# exponentially distributed dwell time in the initial state
trans_times[1] = rexp(1, -Q[s[1], s[1]])
# observation times from a Poisson process with rate 1 within the stay
n_arrivals = rpois(1, trans_times[1])
obs_times = sort(runif(n_arrivals, 0, trans_times[1]))
# state-dependent observations
x = rnorm(n_arrivals, mu[s[1]], sigma[s[1]])

for(t in 2:k){
  s[t] = c(1,2)[-s[t-1]] # with 2 states, every transition is a state switch
  # exponentially distributed dwell time
  trans_times[t] = trans_times[t-1] + rexp(1, -Q[s[t], s[t]])
  # observation times and observations within the stay
  n_arrivals = rpois(1, trans_times[t] - trans_times[t-1])
  obs_times = c(obs_times,
                sort(runif(n_arrivals, trans_times[t-1], trans_times[t])))
  x = c(x, rnorm(n_arrivals, mu[s[t]], sigma[s[t]]))
}
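As a quick sanity check, not part of the original code, the empirical mean dwell times should be close to the reciprocal rates, i.e. roughly $1/0.5 = 2$ for state 1 and $1/1 = 1$ for state 2:

dwell_times = diff(c(0, trans_times)) # dwell time of each of the k stays
tapply(dwell_times, s, mean) # empirical mean dwell times by state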
Let’s visualise the simulated continuous-time HMM:
color = c("orange", "deepskyblue")
n = length(obs_times)
plot(obs_times[1:50], x[1:50], pch = 16, bty = "n", xlab = "observation times",
     ylab = "x", ylim = c(-5,25))
segments(x0 = c(0, trans_times[1:48]), x1 = trans_times[1:49],
         y0 = rep(-5, 49), y1 = rep(-5, 49), col = color[s[1:49]], lwd = 4)
legend("topright", lwd = 2, col = color,
       legend = c("state 1", "state 2"), box.lwd = 0)
The likelihood of a continuous-time HMM for observations $x_{t_1}, \dots, x_{t_T}$ at irregular time points $t_1, \dots, t_T$ has the exact same structure as the regular HMM likelihood:
$$L(\theta) = \delta^{(1)} P(x_{t_1}) \Gamma(t_1, t_2) P(x_{t_2}) \Gamma(t_2, t_3) P(x_{t_3}) \dots \Gamma(t_{T-1}, t_T) P(x_{t_T}) 1^t,$$
where $\delta^{(1)}$, $P$ and $1^t$ are as usual and $\Gamma(t_k, t_{k+1})$ is computed as explained above. Thus, we can fit such models using the standard implementation of the general forward algorithm forward_g() with time-varying transition probability matrices. We can use the generator() function to compute the infinitesimal generator matrix from an unconstrained parameter vector and tpm_cont() to compute all matrix exponentials.
nll = function(par, timediff, x, N){
  mu = par[1:N] # state-dependent means
  sigma = exp(par[N + 1:N]) # state-dependent sds
  Q = generator(par[2*N + 1:(N*(N-1))]) # generator matrix
  Pi = stationary_cont(Q) # stationary dist of CT Markov chain
  Qube = tpm_cont(Q, timediff) # this computes exp(Q * timediff)
  allprobs = matrix(1, nrow = length(x), ncol = N)
  ind = which(!is.na(x))
  for(j in 1:N){
    allprobs[ind, j] = dnorm(x[ind], mu[j], sigma[j])
  }
  -forward_g(Pi, Qube, allprobs)
}
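With this negative log-likelihood, fitting the 2-state model simulated above only requires starting values and a numerical optimiser. The following sketch is an illustration, assuming LaMa is loaded; the starting values are assumptions:

# hypothetical starting values for the 2-state model (assumption)
par0 = c(mu = c(4, 14), # state-dependent means
         logsigma = log(c(1, 1)), # state-dependent sds on log scale
         qs = c(0, 0)) # off-diagonal entries of Q on log scale
timediff = diff(obs_times) # time intervals between successive observations
mod = nlm(nll, par0, timediff = timediff, x = x, N = 2)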
For a 3-state example, the simulation is very similar, but we now also have to draw which state to transition to, as explained in the beginning.
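For the chunk below to run, a 3-state generator matrix and state-dependent parameters are needed. The values here are assumptions, chosen to be consistent with the starting values used for fitting further down:

# assumed true parameters for the 3-state example (illustrative)
mu = c(5, 10, 25) # state-dependent means
sigma = c(2, 2, 6) # state-dependent standard deviations
Q = matrix(c(-1.0,  0.5,  0.5,
              1.0, -2.0,  1.0,
              0.5,  0.5, -1.0), nrow = 3, byrow = TRUE) # generator matrix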
set.seed(123)

k = 200 # number of state switches
trans_times = s = rep(NA, k) # time points at which the chain transitions
s[1] = sample(1:3, 1) # uniform initial distribution
# exponentially distributed dwell time in the initial state
trans_times[1] = rexp(1, -Q[s[1], s[1]])
n_arrivals = rpois(1, trans_times[1])
obs_times = sort(runif(n_arrivals, 0, trans_times[1]))
x = rnorm(n_arrivals, mu[s[1]], sigma[s[1]])

for(t in 2:k){
  # the off-diagonal elements of the s[t-1] row of Q, divided by the negative
  # diagonal element, give the probabilities of the next state
  s[t] = sample(c(1:3)[-s[t-1]], 1, prob = Q[s[t-1], -s[t-1]] / -Q[s[t-1], s[t-1]])
  # exponentially distributed dwell time
  trans_times[t] = trans_times[t-1] + rexp(1, -Q[s[t], s[t]])
  n_arrivals = rpois(1, trans_times[t] - trans_times[t-1])
  obs_times = c(obs_times,
                sort(runif(n_arrivals, trans_times[t-1], trans_times[t])))
  x = c(x, rnorm(n_arrivals, mu[s[t]], sigma[s[t]]))
}
par = c(mu = c(5, 10, 25), # state-dependent means
        logsigma = c(log(2), log(2), log(6)), # state-dependent sds on log scale
        qs = rep(0, 6)) # off-diagonal entries of Q on log scale

timediff = diff(obs_times) # time intervals between successive observations
system.time(
mod2 <- nlm(nll, par, timediff = timediff, x = x, N = 3, stepmax = 10)
)
#> user system elapsed
#> 0.338 0.008 0.347
# without restricting stepmax, we run into numerical problems
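The estimates can then be back-transformed to their natural scale. As a sketch, the indexing below mirrors the parameter layout inside nll with N = 3, and generator() rebuilds the estimated generator matrix from the unconstrained estimates:

(mu_hat = mod2$estimate[1:3]) # estimated state-dependent means
(sigma_hat = exp(mod2$estimate[4:6])) # estimated sds
(Q_hat = generator(mod2$estimate[7:12])) # estimated generator matrix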