% Generated by roxygen2: do not edit by hand
% Please edit documentation in R/mscmt.r
\name{mscmt}
\alias{mscmt}
\title{Multivariate SCM Using Time Series}
\usage{
mscmt(data, treatment.identifier = NULL, controls.identifier = NULL,
  times.dep = NULL, times.pred = NULL, agg.fns = NULL, placebo = FALSE,
  placebo.with.treated = FALSE, univariate = FALSE,
  univariate.with.dependent = FALSE, check.global = TRUE,
  inner.optim = "wnnlsOpt", inner.opar = list(), outer.optim = "DEoptC",
  outer.par = list(), outer.opar = list(), std.v = c("sum", "mean", "min",
  "max"), alpha = NULL, beta = NULL, gamma = NULL, return.ts = TRUE,
  single.v = FALSE, verbose = TRUE, debug = FALSE, seed = NULL,
  cl = NULL, times.pred.training = NULL, times.dep.validation = NULL,
  v.special = integer(), cv.alpha = 0)
}
\arguments{
\item{data}{Typically, a list of matrices with rows corresponding to times 
and columns corresponding to units for all relevant features (dependent as
well as predictor variables, identified by the list elements' names).
This might be the result of converting from a 
\code{\link[base]{data.frame}}
by using function \code{\link{listFromLong}}. 

For convenience, \code{data} may alternatively be the 
result of function \code{\link[Synth]{dataprep}} of package 
\code{'Synth'}. In this case, the parameters \code{treatment.identifier},
\code{controls.identifier}, \code{times.dep}, \code{times.pred}, 
and \code{agg.fns} are ignored, as these input parameters are generated
automatically from \code{data}. The parameters \code{univariate}, 
\code{alpha}, \code{beta}, and \code{gamma} are also ignored; they are fixed 
to their defaults.
Using results of \code{\link[Synth]{dataprep}} is experimental, because
the automatic generation of input parameters may fail due to lack of 
information contained in results of \code{\link[Synth]{dataprep}}.}

\item{treatment.identifier}{A character scalar containing the name of the 
treated unit. 
Must be contained in the column names of the matrices in \code{data}.}

\item{controls.identifier}{A character vector containing the names of at 
least two control units.
Entries must be contained in the column names of the matrices in \code{data}.}

\item{times.dep}{A matrix with two rows (containing start times in
the first and end times in the second row) and one column for each dependent
variable, where the column names must exactly match the names of the
corresponding dependent variables. 
A sequence of dates with the given start and end times of
\itemize{
\item annual dates, if the format of start/end time is "dddd", e.g. "2016",
\item quarterly dates, if the format of start/end time is "ddddQd", e.g. 
"2016Q1",
\item monthly dates, if the format of start/end time is "dddd?dd", e.g. 
"2016/03" or "2016-10",
}
will be constructed; these dates are looked for in the row names of
the respective matrices in \code{data}. In applications with 
cross-validation, \code{times.dep} belongs to the main period.}

\item{times.pred}{A matrix with two rows (containing start times in
the first and end times in the second row) and one column for each predictor
variable, where the column names must exactly match the names of the
corresponding predictor variables.
A sequence of dates with the given start and end times of
\itemize{
\item annual dates, if the format of start/end time is "dddd", e.g. "2016",
\item quarterly dates, if the format of start/end time is "ddddQd", e.g. 
"2016Q1",
\item monthly dates, if the format of start/end time is "dddd?dd", e.g. 
"2016/03" or "2016-10",
}
will be constructed; these dates are looked for in the row names of
the respective matrices in \code{data}. In applications with 
cross-validation, \code{times.pred} belongs to the main period.}

\item{agg.fns}{Either \code{NULL} (default) or a character vector containing
one name of an aggregation function for each predictor variable (i.e., each
column of \code{times.pred}). The character string "id" may be used as a
"no-op" aggregation. Each aggregation function must accept a numeric vector
and return either a numeric scalar ("classical" MSCM) or a numeric vector 
(leading to MSCM\emph{T} if the length of the vector is at least two).}

\item{placebo}{A logical scalar. If \code{TRUE}, a placebo study is 
performed where, apart from the treated unit, each control unit is considered
as treated unit in separate optimizations. Defaults to \code{FALSE}. 
Depending on the number of control units and the complexity of the problem, 
placebo studies may take a long time to finish.}

\item{placebo.with.treated}{A logical scalar. If \code{TRUE}, the treated
unit is included as control unit (for other treated units in placebo 
studies). Defaults to \code{FALSE}.}

\item{univariate}{A logical scalar. If \code{TRUE}, a series of univariate
SCMT optimizations is done (instead of one MSCMT optimization) even if
there is more than one dependent variable. Defaults to \code{FALSE}.}

\item{univariate.with.dependent}{A logical scalar. If \code{TRUE} (and if
\code{univariate} is also \code{TRUE}), all dependent variables (contained
in the column names of \code{times.dep}) apart from the current (real) 
dependent variable are included as predictors in the series of univariate
SCMT optimizations. Defaults to \code{FALSE}.}

\item{check.global}{A logical scalar. If \code{TRUE} (default), a check for
the feasibility of the unrestricted outer optimum (where actually no 
restrictions are imposed by the predictor variables) is made before 
starting the actual optimization procedure.}

\item{inner.optim}{A character scalar containing the name of the optimization
method for the inner optimization. Defaults to \code{"wnnlsOpt"}, which
(currently) is the only supported implementation, because it outperforms
all other inner optimizers we are aware of. 
\code{"ipopOpt"}, which uses \code{\link[kernlab]{ipop}}, and 
\code{"LowRankQPOpt"}, which uses \code{\link[LowRankQP]{LowRankQP}} as inner
optimizer, have experimental support for benchmark purposes.}

\item{inner.opar}{A list containing further parameters for the inner 
optimizer. Defaults to the empty list. (For \code{"wnnlsOpt"}, there are no
meaningful further parameters.)}

\item{outer.optim}{A character vector containing the name(s) of the 
optimization method(s) for the outer optimization. Defaults to 
\code{"DEoptC"}, which (currently) is the recommended global optimizer. 
The optimizers currently supported can be found in the documentation of
parameter \code{outer.opar}, where the default control parameters for
the various optimizers are listed.
If \code{outer.optim} has length greater
than 1, one optimization is invoked for each outer optimizer (and, 
potentially, each random seed, see below), and the best result is used.}

\item{outer.par}{A list containing further parameters for the outer 
optimization procedure. Defaults to the empty list. Entries in this list may 
override the following hard-coded general defaults:
\itemize{
\item \code{lb=1e-8}, corresponding to the lower bound for the ratio of
predictor weights,
\item \code{opt.separate=TRUE}, corresponding
to an improved outer optimization where each predictor is treated as the 
(potentially) most important predictor (i.e. with maximal weight) in 
separate optimizations (one for each predictor), see [1].
}}

\item{outer.opar}{A list (or a list of lists, if \code{outer.optim} has
length greater than 1) containing further parameters for the outer 
optimizer(s). Defaults to the empty list. Entries in this list may override
the following hard-coded defaults for the individual optimizers, which
are chosen to keep the computing time modest. 
\code{dim} is a variable holding the problem dimension, 
typically the number of predictors minus one.
\tabular{lll}{
\bold{Optimizer}  \tab \bold{Package}     \tab \bold{Default parameters} \cr 
\code{DEoptC}     \tab \code{MSCMT}       \tab \code{nG=500}, \code{nP=20*dim}, \code{waitgen=100}, \cr
                  \tab                    \tab \code{minimpr=1e-14}, \code{F=0.5}, \code{CR=0.9} \cr
\code{cma_es}     \tab \code{cmaes}       \tab \code{maxit=2500} \cr
\code{crs}        \tab \code{nloptr}      \tab \code{maxeval=2.5e4}, \code{xtol_rel=1e-14}, \cr
                  \tab                    \tab \code{population=20*dim}, \code{algorithm="NLOPT_GN_CRS2_LM"} \cr
\code{DEopt}      \tab \code{NMOF}        \tab \code{nG=100}, \code{nP=20*dim} \cr
\code{DEoptim}    \tab \code{DEoptim}     \tab \code{nP=20*dim} \cr
\code{ga}         \tab \code{GA}          \tab \code{maxiter=50}, \code{monitor=FALSE}, \cr
                  \tab                    \tab \code{popSize=20*dim} \cr
\code{genoud}     \tab \code{rgenoud}     \tab \code{print.level=0}, \code{max.generations=70}, \cr
                  \tab                    \tab \code{solution.tolerance=1e-12}, \code{pop.size=20*dim}, \cr
                  \tab                    \tab \code{wait.generations=dim}, \code{boundary.enforcement=2}, \cr
                  \tab                    \tab \code{gradient.check=FALSE}, \code{MemoryMatrix=FALSE} \cr
\code{GenSA}      \tab \code{GenSA}       \tab \code{max.call=1e7}, \code{max.time=25/dim},  \cr
                  \tab                    \tab \code{trace.mat=FALSE} \cr
\code{hydroPSO}   \tab \code{hydroPSO}    \tab \code{maxit=300}, \code{reltol=1e-14}, \code{npart=3*dim} \cr
\code{isres}      \tab \code{nloptr}      \tab \code{maxeval=2e4}, \code{xtol_rel=1e-14}, \cr
                  \tab                    \tab \code{population=20*dim}, \code{algorithm="NLOPT_GN_ISRES"} \cr
\code{malschains} \tab \code{Rmalschains} \tab \code{popsize=20*dim}, \code{maxEvals=25000} \cr
\code{nlminbOpt}  \tab \code{MSCMT/stats} \tab \code{nrandom=30} \cr
\code{optimOpt}   \tab \code{MSCMT/stats} \tab \code{nrandom=25} \cr
\code{PSopt}      \tab \code{NMOF}        \tab \code{nG=100}, \code{nP=20*dim} \cr
\code{psoptim}    \tab \code{pso}         \tab \code{maxit=700} \cr
\code{soma}       \tab \code{soma}        \tab \code{nMigrations=100} 
}
If \code{outer.opar} is a list of lists, its names must correspond to (a 
subset of) the outer optimizers chosen in \code{outer.optim}.}

\item{std.v}{A character scalar containing one of the function names
"sum", "mean", "min", or "max" for the standardization of the predictor 
weights (weights are divided by \code{std.v(weights)} before reporting). 
Defaults to "sum", partial matching allowed.}

\item{alpha}{A numerical vector with weights for the dependent variables
in an MSCMT optimization or \code{NULL} (default). If not \code{NULL},
the length of \code{alpha} must agree with the number of dependent
variables; \code{NULL} is equivalent to weight 1 for all dependent 
variables.}

\item{beta}{Either \code{NULL} (default), a numerical vector, or a list.
If \code{beta} is a numerical vector or a list, its length must agree
with the number of dependent variables. 
\itemize{
\item If \code{beta} is a numerical vector,
the \code{i}th dependent variable is discounted with discount factor 
\code{beta[i]} (the observations of the dependent variables must thus be 
in chronological order!). 
\item If \code{beta} is a list, the components of \code{beta} must be 
numerical vectors with lengths corresponding to the numbers of observations 
for the individual dependent variables. These observations are then 
multiplied with the corresponding component of \code{beta}.
}}

\item{gamma}{Either \code{NULL} (default), a numerical vector, or a list.
If \code{gamma} is a numerical vector or a list, its length must agree
with the number of predictor variables. 
\itemize{
\item If \code{gamma} is a numerical vector,
the output of \code{agg.fns[i]} applied to the \code{i}th predictor variable
is discounted with discount factor \code{gamma[i]} (the output of 
\code{agg.fns[i]} must therefore be in chronological order!). 
\item If \code{gamma} is a list, the components of \code{gamma} must be 
numerical vectors with lengths corresponding to the lengths of the output of 
\code{agg.fns} for the individual predictor variables. The output of 
\code{agg.fns} is then multiplied with the corresponding component of 
\code{gamma}.
}}

\item{return.ts}{A logical scalar. If \code{TRUE} (default), most results are
converted to time series.}

\item{single.v}{A logical scalar. If \code{FALSE} (default), a selection
of feasible (optimal!) predictor weight vectors is generated. If \code{TRUE}, 
the one optimal weight vector which has maximal order statistics is generated 
to facilitate cross-validation studies.}

\item{verbose}{A logical scalar. If \code{TRUE} (default), output is verbose.}

\item{debug}{A logical scalar. If \code{TRUE}, output is very verbose. 
Defaults to \code{FALSE}.}

\item{seed}{A numerical vector or \code{NULL} (default). If not \code{NULL}, 
the random number generator is initialized via \code{set.seed} 
(see \link[base]{Random}) with an element of \code{seed} before the optimizer 
is called. If \code{seed} has length greater than 1, one optimization is 
performed for each element of \code{seed} and the best result is kept. 
If not \code{NULL}, the seeds \code{int.seed} (default: 53058) and 
\code{unif.seed} (default: 812821) for \code{\link[rgenoud]{genoud}} are 
also initialized to the corresponding element of \code{seed}, but this can 
be overridden with the list elements \code{int.seed} and \code{unif.seed} 
of (the corresponding element of) \code{outer.opar}.}

\item{cl}{\code{NULL} (default) or an object of class \code{cluster}
obtained by \code{\link[parallel]{makeCluster}} of package \code{parallel}. 
Repeated estimations (see \code{outer.optim} and \code{seed}) and
placebo studies will make use of the cluster \code{cl} (if not \code{NULL}).}

\item{times.pred.training}{A matrix with two rows (containing start times in
the first and end times in the second row) and one column for each predictor
variable, where the column names must exactly match the names of the
corresponding predictor variables (or \code{NULL} by default).
If not \code{NULL}, \code{times.pred.training} defines training periods
for cross-validation applications. For the format of the start and end times,
see the documentation of parameter \code{times.pred}.}

\item{times.dep.validation}{A matrix with two rows (containing start times in
the first and end times in the second row) and one column for each dependent
variable, where the column names must exactly match the names of the
corresponding dependent variables (or \code{NULL} by default). 
If not \code{NULL}, \code{times.dep.validation} defines validation period(s)
for cross-validation applications. For the format of the start and end times,
see the documentation of parameter \code{times.dep}.}

\item{v.special}{An integer vector containing indices of important predictors
with special treatment (see \code{cv.alpha}). Defaults to the empty set.}

\item{cv.alpha}{A numeric scalar containing the minimal proportion (of the
maximal feasible weight) for the weights of the predictors selected by 
\code{v.special}. Defaults to \code{0}.}
}
\value{
An object of class \code{"mscmt"}, which is essentially a list
containing the results of the estimation and, if applicable, the placebo
study.
The most important list elements are 
\itemize{
\item the weight vector \code{w} for the control units,
\item a matrix \code{v} with weight vectors for the predictors in its 
columns,
\item scalars \code{loss.v} and \code{rmspe} with the dependent loss and its 
square root,
\item a vector \code{loss.w} with the predictor losses corresponding to the
various weight vectors in the columns of \code{v},
\item a matrix \code{predictor.table} containing aggregated statistics of
predictor values (similar to list element \code{tab.pred} of 
function \code{\link[Synth]{synth.tab}} of package \code{'Synth'}),
\item a list of multivariate time series \code{combined} containing, 
for each dependent and predictor variable, a multivariate time series 
with elements \code{treated} for the actual values of the treated unit,
\code{synth} for the synthesized values, and \code{gaps} for the differences.
}
Placebo studies produce a list containing individual results for each 
unit (as treated unit), starting with the original treated unit, as well
as a list element named \code{placebo} with aggregated results for each
dependent and predictor variable.

If \code{times.pred.training} and \code{times.dep.validation} are not 
\code{NULL}, cross-validation is performed and a list with two elements is 
returned: \code{cv}, with the results of the cross-validation period, and 
\code{main}, with the results of the main period.
}
\description{
\code{mscmt} performs the Multivariate Synthetic Control Method Using Time 
Series.
}
\details{
\code{mscmt} combines, if necessary, the preparation of the raw data (which 
is expected to be in "list" format, possibly after conversion from a 
\code{\link[base]{data.frame}}
with function \code{\link{listFromLong}}) and the call to the appropriate
MSCMT optimization procedures (depending on the input parameters).
For details on the input parameters \code{alpha}, \code{beta}, and 
\code{gamma}, see [1]. For details on cross-validation, see [2].
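
As a minimal sketch of the "list" format (the variable name \code{gdp} and the
unit names \code{A}, \code{B}, and \code{C} are hypothetical placeholders, not 
taken from any shipped data set), \code{data} might be built along these lines:
\preformatted{
## one matrix per variable: rows are times, columns are units
data <- list(
  gdp = matrix(c(1.0, 1.1, 2.0, 2.2, 3.0, 3.3), nrow = 2,
               dimnames = list(c("2015", "2016"), c("A", "B", "C")))
)
}
Here the row names carry the (annual) dates that \code{times.dep} and 
\code{times.pred} refer to, and the column names carry the unit names used in
\code{treatment.identifier} and \code{controls.identifier}.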
}
\examples{
\dontrun{
## for examples, see the package vignettes:
browseVignettes(package="MSCMT")
}
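
## A minimal sketch of how the period and optimizer arguments described
## above might be assembled. The variable names "gdp", "invest", and
## "school" as well as the unit names are hypothetical placeholders, not
## part of any shipped data set.

## one column per dependent variable: start time in row 1, end time in row 2
times.dep  <- cbind("gdp" = c("1990", "1999"))

## one column per predictor variable, same layout
times.pred <- cbind("invest" = c("1990", "1999"),
                    "school" = c("1995", "1999"))

## one aggregation function name per column of times.pred
## ("id" is the no-op aggregation)
agg.fns <- c("mean", "id")

## per-optimizer overrides of the defaults listed for 'outer.opar';
## the names must correspond to entries of 'outer.optim'
outer.optim <- c("DEoptC", "genoud")
outer.opar  <- list(DEoptC = list(nG = 1000),
                    genoud = list(max.generations = 100))

## basic consistency checks on the assembled arguments
stopifnot(nrow(times.dep) == 2, nrow(times.pred) == 2,
          length(agg.fns) == ncol(times.pred),
          all(is.element(names(outer.opar), outer.optim)))

\dontrun{
## hypothetical call: 'mydata' is assumed to be a list of matrices named
## "gdp", "invest", and "school" with units "A", "B", and "C" as columns
res <- mscmt(mydata, treatment.identifier = "A",
             controls.identifier = c("B", "C"),
             times.dep = times.dep, times.pred = times.pred,
             agg.fns = agg.fns, outer.optim = outer.optim,
             outer.opar = outer.opar, seed = 1)
}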
}
\references{
[1] \insertRef{FastReliable}{MSCMT} 

[2] \insertRef{CV}{MSCMT}
}
