This vignette illustrates the basic functionality of the SuperGauss package by simulating a few stochastic processes and estimating their parameters from regularly spaced data.
A one-dimensional fractional Brownian motion (fBM) Xt=X(t) is a continuous Gaussian process with E[Xt]=0 and cov(Xt,Xs)=12(|t|2H+|s|2H−|t−s|2H), for 0<H<1. fBM is not stationary but has stationary increments, such that (Xt+Δt−Xt)D=(Xs+Δt−Xs) for any s,t. As such, its covariance function is completely determined its mean squared displacement (MSD) MSDX(t)=E[(Xt−X0)2]=|t|2H. When the Hurst parameter H=12, fBM reduces to ordinary Brownian motion.
The following R code generates 5 independent fBM realizations of length N=5000 with H=0.3. The timing of the “superfast” method (Wood and Chan 1994) provided in this package is compared to that of a “fast” method (e.g., Brockwell and Davis 1991) and to the usual method (Cholesky decomposition of an unstructured variance matrix).
require(SuperGauss)
5000 # number of observations
N <- 1/60 # time between observations (seconds)
dT <- .3 # Hurst parameter
H <-
(0:N)*dT # times at which to sample fBM
tseq <- 5 # number of fBM paths to generate
npaths <-
# to generate fbm, generate its increments, which are stationary
fbm_msd(tseq = tseq[-1], H = H)
msd <- msd2acf(msd = msd) # convert msd to acf
acf <-
# superfast method
system.time({
rnormtz(n = npaths, acf = acf, fft = TRUE)
dX <- })
## user system elapsed
## 0.011 0.002 0.014
# fast method (about 3x as slow)
system.time({
rnormtz(n = npaths, acf = acf, fft = FALSE)
})
## user system elapsed
## 0.057 0.000 0.057
# unstructured variance method (much slower)
system.time({
matrix(rnorm(N*npaths), npaths, N) %*% chol(toeplitz(acf))
})
## user system elapsed
## 17.708 0.276 18.047
# convert increments to position measurements
apply(rbind(0, dX), 2, cumsum)
Xt <-
# plot
c("black", "red", "blue", "orange", "green2")
clrs <-par(mar = c(4.1,4.1,.5,.5))
plot(0, type = "n", xlim = range(tseq), ylim = range(Xt),
xlab = "Time (s)", ylab = "Position (m)")
for(ii in 1:npaths) {
lines(tseq, Xt[,ii], col = clrs[ii], lwd = 2)
}
Suppose that \boldsymbol{X}= (N_{X},\ldots,N_{)} are equally spaced observations of an fBM process with X_i = X(i \Delta t), and let \Delta\boldsymbol{X}= (N-1_{\Delta X},\ldots,N-1_{)} denote the corresponding increments, \Delta X_i = X_{i+1} - X_i. Then the loglikelihood function for H is \ell(H \mid \Delta\boldsymbol{X}) = -\tfrac 1 2 \big(\Delta\boldsymbol{X}' \boldsymbol{V}_H^{-1} \Delta\boldsymbol{X}+ \log |\boldsymbol{V}_H|\big), where V_H is a Toeplitz matrix, \boldsymbol{V}_H= [\mathrm{cov}(\Delta X_i, \Delta X_j)]_{0 \le i,j < N} = \begin{bmatrix} \gamma_0 & \gamma_1 & \cdots & \gamma_{N-1} \\ \gamma_1 & \gamma_0 & \cdots & \gamma_{N-2} \\ \vdots & \vdots & \ddots & \vdots \\ \gamma_{N-1} & \gamma_{N-2} & \cdots & \gamma_0 \end{bmatrix}. Thus, each evaluation of the loglikelihood requires the inverse and log-determinant of a Toeplitz matrix, which scales as \mathcal O(N^2) with the Durbin-Levinson algorithm. The SuperGauss package implements an extended version of the Generalized Schur algorithm of Ammar and Gragg (1988), which scales these computations as \mathcal O(N \log^2 N). With careful memory management and extensive use of the FFTW library (Frigo and Johnson 2005), the SuperGauss implementation crosses over Durbin-Levinson at around N = 300.
Toeplitz
Matrix ClassThe bulk of the likelihood calculations in SuperGauss are handled by the Toeplitz
matrix class. A Toeplitz
object is created as follows:
# allocate and assign in one step
Toeplitz$new(acf = acf)
Tz <- Tz
## Toeplitz matrix of size 5000
## acf: 0.0857 -0.0208 -0.00421 -0.00228 -0.0015 -0.00109 ...
# allocate memory only
Toeplitz$new(N = N)
Tz <- Tz
## Toeplitz matrix of size 5000
## acf: NULL
$set_acf(acf = acf) # assign later Tz
Its primary methods are illustrated below:
all(acf == Tz$get_acf()) # extract acf
## [1] TRUE
# matrix multiplication
rnorm(N)
z <- toeplitz(acf) %*% z # regular way
x1 <- Tz$prod(z) # with Toeplitz class
x2 <- Tz %*% z # with Toeplitz class overloading the `%*%` operator
x3 <-range(x1-x2)
## [1] -1.526557e-15 1.665335e-15
range(x2-x3)
## [1] 0 0
# system of equations
solve(toeplitz(acf), z) # regular way
y1 <- Tz$solve(z) # with Toeplitz class
y2 <- solve(Tz, z) # same thing but overloading `solve()`
y2 <-range(y1-y2)
## [1] -1.280398e-11 8.704149e-12
# log-determinant
determinant(toeplitz(acf))$mod
ld1 <- Tz$log_det() # with Toeplitz class
ld2 <- determinant(Tz) # same thing but overloading `determinant()`
ld2 <-# note: no $mod
c(ld1, ld2)
## [1] -12737.89 -12737.89
The following code shows how to obtain the maximum likelihood of H and its standard error for a given fBM path. The log-PDF of the Gaussian with Toeplitz variance matrix is obtained either with SuperGauss::dnormtz()
, or using the NormalToeplitz
class. The advantage of the latter is that it does not reallocate memory for the underlying Toeplitz
object at every likelihood evaulation.
For speed comparisons, the optimization underlying the MLE calculation is done both using the superfast Generalized Schur algorithm and the fast Durbin-Levinson algorithm.
diff(Xt[,1]) # obtain the increments of a given path
dX <- length(dX)
N <-
# autocorrelation of fBM increments
function(H) {
fbm_acf <- fbm_msd(1:N*dT, H = H)
msd <-msd2acf(msd)
}
# loglikelihood using generalized Schur algorithm
NormalToeplitz$new(N = N) # pre-allocate memory
NTz <- function(H) {
loglik_GS <-$logdens(z = dX, acf = fbm_acf(H))
NTz
}
# loglikelihood using Durbin-Levinson algorithm
function(H) {
loglik_DL <-dnormtz(X = dX, acf = fbm_acf(H), method = "ltz", log = TRUE)
}
# superfast method
system.time({
optimize(loglik_GS, interval = c(.01, .99), maximum = TRUE)
GS_mle <- })
## user system elapsed
## 0.067 0.003 0.070
# fast method (about 10x slower)
system.time({
optimize(loglik_DL, interval = c(.01, .99), maximum = TRUE)
DL_mle <- })
## user system elapsed
## 0.851 0.006 0.857
c(GS = GS_mle$max, DL = DL_mle$max)
## GS DL
## 0.2950746 0.2950746
# standard error calculation
require(numDeriv)
## Loading required package: numDeriv
GS_mle$max
Hmle <- -hessian(func = loglik_GS, x = Hmle) # observed Fisher Information
Hse <- sqrt(1/Hse[1])
Hse <-c(mle = Hmle, se = Hse)
## mle se
## 0.295074650 0.002584653
R6
ClassesIn order to effectively manage memory in the underlying C++ code, the Toeplitz
class is implemented using R6 classes. Among other things, this means that when a Toeplitz
object is passed to a function, the function does not make a copy of it: all modifications to the object inside the object are reflected on the object outside the function as well, as in the following example:
Toeplitz$new(N = N)
T1 <- T1 # shallow copy: both of these point to the same memory location
T2 <-
# affects both objects
$set_acf(fbm_acf(.5))
T1 T1
## Toeplitz matrix of size 5000
## acf: 0.0167 0 1.73e-18 -3.47e-18 0 6.94e-18 ...
T2
## Toeplitz matrix of size 5000
## acf: 0.0167 0 1.73e-18 -3.47e-18 0 6.94e-18 ...
function(H) {
fbm_logdet <-$set_acf(acf = fbm_acf(H))
T1$log_det()
T1
}
# affects both objects
fbm_logdet(H = .3)
## [1] -12737.89
T1
## Toeplitz matrix of size 5000
## acf: 0.0857 -0.0208 -0.00421 -0.00228 -0.0015 -0.00109 ...
T2
## Toeplitz matrix of size 5000
## acf: 0.0857 -0.0208 -0.00421 -0.00228 -0.0015 -0.00109 ...
To avoid this behavior, it is necessary to make a deep copy of the object:
T1$clone(deep = TRUE)
T3 <- T1
## Toeplitz matrix of size 5000
## acf: 0.0857 -0.0208 -0.00421 -0.00228 -0.0015 -0.00109 ...
T3
## Toeplitz matrix of size 5000
## acf: 0.0857 -0.0208 -0.00421 -0.00228 -0.0015 -0.00109 ...
# only affect T1
fbm_logdet(H = .7)
## [1] -29326.33
T1
## Toeplitz matrix of size 5000
## acf: 0.00324 0.00104 0.000612 0.000474 0.000397 0.000347 ...
T3
## Toeplitz matrix of size 5000
## acf: 0.0857 -0.0208 -0.00421 -0.00228 -0.0015 -0.00109 ...
In addition to the superfast algorithm for Gaussian likelihood evaluations, SuperGauss provides such algorithms for the loglikelihood gradient and Hessian functions, leading to superfast versions of many inference algorithms such as Newton-Raphson and Hamiltonian Monte Carlo. These are provided by the NormalToeplitz$grad()
and NormalToeplitz$hess()
methods. Both of these methods optionally return the lower order derivatives as well, reusing common computations to improve performance. An example of Newton-Raphson is given below using the two-parameter exponential autocorrelation model
\mathrm{\scriptsize ACF}_X(t \mid \lambda, \sigma) = \sigma^2 \exp(- |t/\lambda|).
The example uses stats::nlm()
for optimization, which requires the derivatives to be passsed as attributes to the (negative) loglikelihood.
# autocorrelation function
function(t, lambda, sigma) sigma^2 * exp(-abs(t/lambda))
exp_acf <-# gradient, returned as a 2-column matrix
function(t, lambda, sigma) {
exp_acf_grad <- exp_acf(t, lambda, 1)
ea <-cbind(abs(t)*(sigma/lambda)^2 * ea, # d_acf/d_lambda
2*sigma * ea) # d_acf/d_sigma
}# Hessian, returned as an array of size length(t) x 2 x 2
function(t, lambda, sigma) {
exp_acf_hess <- exp_acf(t, lambda, 1)
ea <- sigma/lambda^2
sl2 <- array(NA, dim = c(length(t), 2, 2))
hess <-1,1] <- sl2^2*(t^2 - 2*abs(t)*lambda) * ea # d2_acf/d_lambda^2
hess[,1,2] <- 2*sl2 * abs(t) * ea # d2_acf/(d_lambda d_sigma)
hess[,2,1] <- hess[,1,2] # d2_acf/(d_sigma d_lambda)
hess[,2,2] <- 2 * ea # d2_acf/d_sigma^2
hess[,
hess
}
# simulate data
runif(1, .5, 2)
lambda <- runif(1, .5, 2)
sigma <- (1:N-1)*dT
tseq <- exp_acf(t = tseq, lambda = lambda, sigma = sigma)
acf <- rnormtz(acf = acf)
Xt <-
NormalToeplitz$new(N = N) # storage space
NTz <-
# negative loglikelihood function of theta = (lambda, sigma)
# include attributes for gradient and Hessian
function(theta) {
exp_negloglik <- theta[1]
lambda <- theta[2]
sigma <-# acf, its gradient, and Hessian
exp_acf(tseq, lambda, sigma)
acf <- exp_acf_grad(tseq, lambda, sigma)
dacf <- exp_acf_hess(tseq, lambda, sigma)
d2acf <-# derivatives of NormalToeplitz up to order 2
NTz$hess(z = Xt,
derivs <-dz = matrix(0, N, 2),
d2z = array(0, dim = c(N, 2, 2)),
acf = acf,
dacf = dacf,
d2acf = d2acf,
full_out = TRUE)
# negative loglikelihood with derivatives as attributes
-1 * derivs$ldens
nll <-attr(nll, "gradient") <- -1 * derivs$grad
attr(nll, "hessian") <- -1 * derivs$hess
nll
}
# optimization
system.time({
nlm(f = exp_negloglik, p = c(1,1), hessian = TRUE)
mle_fit <- })
## user system elapsed
## 0.431 0.009 0.439
# display estimates with standard errors
rbind(true = c(lambda = lambda, sigma = sigma),
est = mle_fit$estimate,
se = sqrt(diag(solve(mle_fit$hessian))))
## lambda sigma
## true 1.3914065 0.71412532
## est 1.7019320 0.79757410
## se 0.3478682 0.08112464
Ammar, G.S. and Gragg, W.B., 1988. Superfast solution of real positive definite Toeplitz systems. SIAM Journal on Matrix Analysis and Applications, 9 (1), 61–76.
Brockwell, P.J. and Davis, R.A., 1991. Time series: Theory and methods. Springer: New York.
Frigo, M. and Johnson, S.G., 2005. The design and implementation of FFTW3. Proceedings of the IEEE, 93 (2), 216–231.
Wood, A.T. and Chan, G., 1994. Simulation of stationary gaussian processes in [0, 1]^d. Journal of Computational and Graphical Statistics, 3 (4), 409–432.