% Generated by roxygen2: do not edit by hand
% Please edit documentation in R/dglmstarma.R
\name{dglmstarma}
\alias{dglmstarma}
\title{Fit STARMA Models Based on Double Generalized Linear Models}
\usage{
dglmstarma(
  ts,
  mean_model = list(),
  dispersion_model = list(),
  mean_family = NULL,
  dispersion_link = c("log", "identity", "inverse"),
  wlist = NULL,
  mean_covariates = list(),
  dispersion_covariates = list(),
  pseudo_observations = c("deviance", "pearson"),
  wlist_past_mean = NULL,
  wlist_covariates = NULL,
  wlist_pseudo_obs = NULL,
  wlist_past_dispersion = NULL,
  wlist_covariates_dispersion = NULL,
  control = list()
)
}
\arguments{
\item{ts}{Multivariate time series. Rows indicate the locations and columns the time.}

\item{mean_model}{A named list specifying the model orders of the linear predictor. It can contain the following elements:
\itemize{
\item \code{intercept} : (Optional) character
\itemize{
\item \code{'homogenous'} (default) for a homogenous model, i.e. the same intercept for all components
\item \code{'inhomogenous'} for inhomogenous models, i.e. fitting an individual intercept for each component
}
\item \code{past_obs} : (Optional)
\itemize{
\item Integer vector with the maximal spatial orders for the time lags in \code{past_obs_time_lags}.
\item Alternatively: a binary matrix, with the entry in row \eqn{i} and column \eqn{j} indicating whether the \eqn{(i - 1)}-spatial lag for the \eqn{j}-th time lag is included in the model.
}
\item \code{past_obs_time_lags} : (Optional) integer vector
\itemize{
\item Indicates the time lags for \code{past_obs}. Defaults to \code{seq(length(past_obs))} (for a vector) or \code{seq(ncol(past_obs))} (for a matrix).
}
\item \code{past_mean} : (Optional)
\itemize{
\item Spatial orders for the regression on past values of the (latent) linear process.
\item Values can be entered in the same format as in \code{past_obs}. If not specified, no regression on the feedback process is performed.
}
\item \code{past_mean_time_lags} : (Optional) integer vector
\itemize{
\item Time lags for the regression on the (latent) linear process. Values can be entered in the same format as in \code{past_obs_time_lags}.
}
\item \code{covariates} : (Optional)
\itemize{
\item Spatial orders for the covariate processes passed in the argument \code{mean_covariates}. The values can be passed as in \code{past_obs} and \code{past_mean}, where the \eqn{j}-th entry or column represents the \eqn{j}-th covariate.
\item The default is spatial order 0 for all covariates, which corresponds to the first matrix in the argument \code{wlist_covariates}.
}
}}

\item{dispersion_model}{a named list specifying the model orders of the dispersion linear predictor, which can have the same elements as the \code{mean_model} argument.}

\item{mean_family}{A list generated by one of the family functions of this package, see \code{\link{stfamily}}. This argument specifies the marginal conditional distributions of the observations and the type of model fitted for the mean linear predictor.}

\item{dispersion_link}{Link function that is used for the dispersion model. Available options are "log" (default), "identity" and "inverse".}

\item{wlist}{A list of square matrices describing the spatial dependencies. The dimension of each matrix must equal the number of rows of the time series. Row-normalized matrices are recommended. See Details.}

\item{mean_covariates}{List of covariates for the mean linear predictor, containing matrices of the same dimension as \code{ts} or objects returned by the covariate functions of this package (see also \code{\link{TimeConstant}}, \code{\link{SpatialConstant}}).}

\item{dispersion_covariates}{List of covariates for the dispersion linear predictor, containing matrices of the same dimension as \code{ts} or objects returned by the covariate functions of this package (see also \code{\link{TimeConstant}}, \code{\link{SpatialConstant}}).}

\item{pseudo_observations}{(character vector) Defines how pseudo observations for the past dispersion values are calculated. Options are "deviance" (default) and "pearson". See Details.}

\item{wlist_past_mean}{(Optional) List of matrices that describe spatial dependencies for the values of the linear predictor. If this is \code{NULL}, the matrices from \code{wlist} are used.}

\item{wlist_covariates}{(Optional) List of matrices that describe spatial dependencies for the covariates. If this is \code{NULL}, the matrices from \code{wlist} are used.}

\item{wlist_pseudo_obs}{(Optional) List of matrices that describe spatial dependencies for past values of the pseudo observations. If this is \code{NULL}, the matrices from \code{wlist} are used.}

\item{wlist_past_dispersion}{(Optional) List of matrices that describe spatial dependencies for the past dispersion values (latent process). If this is \code{NULL}, the matrices from \code{wlist} are used.}

\item{wlist_covariates_dispersion}{(Optional) List of matrices that describe spatial dependencies for the covariates in the dispersion model. If this is \code{NULL}, the matrices from \code{wlist} are used.}

\item{control}{A list of parameters for controlling the fitting process. This list is passed to \code{\link{dglmstarma.control}}.}
}
\value{
The function returns an object of class \code{dglmstarma}, which includes
\itemize{
\item \code{mean} A list containing information about the mean model, see \code{\link{glmstarma}} for details. Additionally, it contains:
\itemize{
\item \code{param_history} The sequence of parameter estimates of the mean model during the fitting process.
\item \code{log_likelihood_history} The sequence of log-likelihood evaluations of the mean model during the fitting process.
}
\item \code{dispersion} Information about the dispersion model, see \code{\link{glmstarma}} for details. In \code{ts} it stores the final pseudo-observations. Additionally, it contains:
\itemize{
\item \code{pseudo_type} Type of pseudo observations used ("deviance" or "pearson").
\item \code{param_history} The sequence of parameter estimates of the dispersion model during the fitting process.
\item \code{log_likelihood_history} The sequence of log-likelihood evaluations of the dispersion model during the fitting process.
}
\item \code{target_dim} Number of parameters in the model.
\item \code{algorithm_info} Information about the fitting algorithm for each iteration of the inner fitting loop.
\item \code{convergence_info} Information about the convergence of the inner fitting loop.
\item \code{total_log_likelihood_history} Evolution of the log-likelihood during the fitting process.
\item \code{total_log_likelihood} The final log-likelihood of the fitted model.
\item \code{aic} AIC of the (full) model based on the log-likelihood, see \link{information_criteria}.
\item \code{bic} BIC of the (full) model based on the log-likelihood, see \link{information_criteria}.
\item \code{qic} QIC of the (full) model based on the log-likelihood, see \link{QIC}.
\item \code{call} The function call.
\item \code{control} The control parameters used for fitting the model.
}
}
\description{
The function \code{dglmstarma} estimates a multivariate time series model based on double generalized linear models (DGLM) introduced by Smyth (1989). The primary application is spatio-temporal data, but other applications, such as network data, are also feasible.
Conditionally on the past, each component of the multivariate time series is assumed to follow a distribution from the exponential dispersion family, see Jørgensen (1987).
In contrast to standard generalized linear models, the dispersion parameter of the distribution is allowed to vary.
The model framework links the mean of the time series, conditional on the past, to a linear predictor. This linear predictor allows regression on past observations, past values of the linear predictor and covariates, as described in the Details.
Additionally, the dispersion parameter of the distribution is modeled with an additional linear predictor, which can also include spatial and temporal dependencies as well as covariates.
Various distributions with several link-functions are available.
}
\details{
For a multivariate time series \eqn{\{Y_t = (Y_{1,t}, \ldots, Y_{p,t})'\}}, we assume that, conditional on the past \eqn{\mathcal{F}_{t-1}}, each component \eqn{Y_{i,t}} follows a distribution that is a member of the exponential dispersion family.
The joint multivariate distribution of \eqn{Y_t \mid \mathcal{F}_{t-1}} is assumed to be generated by a process involving copulas. Under these distributional assumptions, the conditional mean is \eqn{\mathbf{\mu}_t := \mathbb{E}(Y_t \mid \mathcal{F}_{t-1})} and the conditional variance is \eqn{\mathrm{Var}(Y_{i,t} \mid \mathcal{F}_{t-1}) = \phi_{i,t} V(\mu_{i,t})}, where \eqn{V(\cdot)} is the variance function of the chosen distribution and \eqn{\phi_{i,t}} is the dispersion parameter for location \eqn{i} at time \eqn{t}.
The conditional mean is linked to a linear process by the link-function, i.e. \eqn{g(\mathbf{\mu}_t) = \mathbf{\psi}_t}, which is applied elementwise.
A second linear process \eqn{\mathbf{\zeta}_t} is linked to the dispersion parameters of the distributions via a second link-function \eqn{g_d}, i.e. \eqn{g_d(\mathbf{\phi}_t) = \mathbf{\zeta}_t}.
The linear predictor for the mean process is defined by regression on past observations, past values of the linear predictor and covariates. It has the following structure:
\deqn{\mathbf{\psi}_t = \mathbf{\delta} + \sum_{i = 1}^{q} \sum_{\ell = 0}^{a_i} \alpha_{i, \ell} W_{\alpha}^{(\ell)} h(\mathbf{\psi}_{t - i}) + \sum_{j = 1}^{r} \sum_{\ell = 0}^{b_j} \beta_{j, \ell} W_{\beta}^{(\ell)} \tilde{h}(\mathbf{Y}_{t - j}) + \sum_{k = 1}^{m} \sum_{\ell = 0}^{c_k} \gamma_{k, \ell} W_{\gamma}^{(\ell)} \mathbf{X}_{k, t},}
where the matrices \eqn{W_{\alpha}^{(\ell)}}, \eqn{W_{\beta}^{(\ell)}}, and \eqn{W_{\gamma}^{(\ell)}} are taken from the lists \code{wlist_past_mean}, \code{wlist}, and \code{wlist_covariates}, respectively, and \eqn{\ell} denotes the spatial order.
If \eqn{\mathbf{\delta} = \delta_0 \mathbf{1}} with a scalar \eqn{\delta_0}, the model is called homogenous with respect to the intercept; otherwise, it is inhomogenous.
Spatial orders, intercept structure and time lags for the mean model are specified in the argument \code{mean_model}. If \code{past_mean} is specified, it is also required that \code{past_obs} is specified for identifiability.
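
As an illustration of the notation (a sketch, assuming the defaults described for \code{mean_model}): with \code{mean_model = list(past_obs = c(1, 1))} and neither \code{past_mean} nor covariates, time lags 1 and 2 each enter with spatial orders 0 and 1, so the linear predictor reduces to
\deqn{\mathbf{\psi}_t = \mathbf{\delta} + \sum_{j = 1}^{2} \left( \beta_{j, 0} W_{\beta}^{(0)} + \beta_{j, 1} W_{\beta}^{(1)} \right) \tilde{h}(\mathbf{Y}_{t - j}),}
with \eqn{\mathbf{\delta} = \delta_0 \mathbf{1}} under the default homogenous intercept.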

The linear process of the dispersion model is defined similarly, but instead of direct observations it includes pseudo observations \eqn{\mathbf{d}_t}, which are either defined based on deviance or Pearson residuals.
The linear process of the dispersion model has the following structure:
\deqn{\mathbf{\zeta}_t = \mathbf{\tilde{\delta}} + \sum_{i = 1}^{\tilde{q}} \sum_{\ell = 0}^{\tilde{a}_i} \tilde{\alpha}_{i, \ell} W_{\alpha, \phi}^{(\ell)} \mathbf{\zeta}_{t - i} + \sum_{j = 1}^{\tilde{r}} \sum_{\ell = 0}^{\tilde{b}_j} \tilde{\beta}_{j, \ell} W_{\beta, \phi}^{(\ell)} \tilde{h}_{\phi}(\mathbf{d}_{t - j}) + \sum_{k = 1}^{\tilde{m}} \sum_{\ell = 0}^{\tilde{c}_k} \tilde{\gamma}_{k, \ell} W_{\gamma, \phi}^{(\ell)} \mathbf{\tilde{X}}_{k, t},}
where the model orders and neighborhood structures are specified analogously to the mean model via the argument \code{dispersion_model} and the \code{wlist_} arguments.
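
As a sketch of the two options for \code{pseudo_observations}, assuming the standard double generalized linear model definitions of Smyth (1989) (the exact implementation in this package may differ in detail, and the superscripts are notation introduced only for this illustration), the pseudo observation for location \eqn{i} at time \eqn{t} is either the unit deviance or the squared Pearson residual:
\deqn{d_{i,t}^{\mathrm{dev}} = 2 \int_{\mu_{i,t}}^{Y_{i,t}} \frac{Y_{i,t} - u}{V(u)} \, du, \qquad d_{i,t}^{\mathrm{pear}} = \frac{(Y_{i,t} - \mu_{i,t})^2}{V(\mu_{i,t})}.}
For a variance function \eqn{V(\mu) = \mu}, for example, the unit deviance is \eqn{2 \{ Y_{i,t} \log(Y_{i,t} / \mu_{i,t}) - (Y_{i,t} - \mu_{i,t}) \}}. Since \eqn{\mathrm{Var}(Y_{i,t} \mid \mathcal{F}_{t-1}) = \phi_{i,t} V(\mu_{i,t})}, the squared Pearson residual has conditional expectation \eqn{\phi_{i,t}}, which is why the pseudo observations can serve as a response for the dispersion model.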

The unknown parameters of the model are estimated with an iterative procedure, which alternates between estimating the mean and dispersion model until convergence.
Within each step, a quasi-maximum likelihood approach is used. For the mean model, the quasi-log-likelihood of the observations resulting from the \code{mean_family} argument is maximized. For the dispersion model, the quasi-likelihood resulting from a Gamma density with a fixed dispersion parameter of 2 is maximized.

For the negative binomial family, the pseudo-observations are always calculated based on Pearson residuals using the Poisson variance function. The dispersion model is defined on these pseudo-observations, which have expectation \eqn{1 + \phi_{i, t} \mu_{i,t}}.
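
To see where this expectation comes from (a short derivation, assuming the usual negative binomial variance \eqn{\mu_{i,t} + \phi_{i,t} \mu_{i,t}^2}, which is not spelled out above): with the Poisson variance function \eqn{V(\mu) = \mu}, the Pearson pseudo-observation is \eqn{d_{i,t} = (Y_{i,t} - \mu_{i,t})^2 / \mu_{i,t}}, so that
\deqn{\mathbb{E}(d_{i,t} \mid \mathcal{F}_{t-1}) = \frac{\mathrm{Var}(Y_{i,t} \mid \mathcal{F}_{t-1})}{\mu_{i,t}} = \frac{\mu_{i,t} + \phi_{i,t} \mu_{i,t}^2}{\mu_{i,t}} = 1 + \phi_{i,t} \mu_{i,t}.}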
}
\examples{
\donttest{
dat <- load_data("chickenpox", directory = tempdir())
chickenpox <- dat$chickenpox
population_hungary <- dat$population_hungary
W_hungary <- dat$W_hungary

model_autoregressive <- list(past_obs = rep(1, 7))
dglmstarma(chickenpox, model_autoregressive, dispersion_model = list(past_obs = 1),
           mean_covariates = list(population = population_hungary),
           wlist = W_hungary, mean_family = vquasipoisson("log"))
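
# A further sketch (an illustration added here, not a recommended model):
# the same data with a sparser lag structure, given as a binary matrix as
# described for `past_obs`. matrix() fills column-wise, so column 1 is (1, 1)
# and column 2 is (1, 0): the first time lag uses spatial orders 0 and 1,
# the second time lag uses spatial order 0 only.
# The time lags 1 and 7 and the object names are illustrative choices.
lag_structure <- matrix(c(1, 1,
                          1, 0), nrow = 2)
model_sparse <- list(past_obs = lag_structure, past_obs_time_lags = c(1, 7))
fit_sparse <- dglmstarma(chickenpox, model_sparse,
                         dispersion_model = list(past_obs = 1),
                         pseudo_observations = "pearson",
                         wlist = W_hungary, mean_family = vquasipoisson("log"))
# Assuming the returned object can be accessed like a list (see Value),
# this should report the type of pseudo observations that was used.
fit_sparse$dispersion$pseudo_type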
}
}
\references{
\itemize{
\item Jørgensen, B. (1987), Exponential Dispersion Models. Journal of the Royal Statistical Society: Series B (Methodological), 49: 127-145. \doi{10.1111/j.2517-6161.1987.tb01685.x}
\item Smyth, G.K. (1989), Generalized Linear Models with Varying Dispersion. Journal of the Royal Statistical Society: Series B (Methodological), 51: 47-60. \doi{10.1111/j.2517-6161.1989.tb01747.x}
}
}
\seealso{
\code{\link{stfamily}}, \code{\link{glmstarma}}, \code{\link{glmstarma.control}}, \code{\link{dglmstarma.control}}, \code{\link{TimeConstant}}, \code{\link{SpatialConstant}}
}
