% Generated by roxygen2: do not edit by hand
% Please edit documentation in R/eicm.r
\name{eicm}
\alias{eicm}
\title{Fit and select an Explicit Interaction Community Model (EICM)}
\usage{
eicm(
  occurrences,
  env = NULL,
  traits = NULL,
  intercept = TRUE,
  n.latent = 0,
  rotate.latents = FALSE,
  scale.latents = TRUE,
  forbidden = NULL,
  allowed = NULL,
  mask.sp = NULL,
  exclude.prevalence = 0,
  regularization = c(ifelse(n.latent > 0, 6, 0.5), 1),
  regularization.type = "hybrid",
  penalty = 4,
  theta.threshold = 0.5,
  latent.lambda = 1,
  fit.all.with.latents = TRUE,
  popsize.sel = 2,
  n.cores = parallel::detectCores(),
  parallel = FALSE,
  true.model = NULL,
  do.selection = TRUE,
  do.plots = TRUE,
  fast = FALSE,
  refit.selected = TRUE
)
}
\arguments{
\item{occurrences}{a binary (0/1) sample x species matrix, possibly including NAs.}

\item{env}{an optional sample x environmental variable matrix, for the known environmental predictors.}

\item{traits}{an optional species x trait matrix. Currently, it is only used for excluding
species interactions \emph{a priori}.}

\item{intercept}{logical specifying whether to add a column for the species-level intercepts.}

\item{n.latent}{the number of latent variables to estimate.}

\item{rotate.latents}{logical. Rotate the estimated latent variable values (the values of the
latents at each sample) in the first step with PCA? Defaults to FALSE.}

\item{scale.latents}{logical. Standardize the estimated latent variable values (the values of the
latents at each sample) in the first step? Defaults to TRUE.}

\item{forbidden}{a formula (or a list of formulas) defining which species interactions are not to be estimated. See details.
This constraint is cumulative with other constraints (\code{mask.sp} and \code{exclude.prevalence}).}

\item{allowed}{a formula (or a list of formulas) defining which species interactions are to be estimated. See details.
This constraint is cumulative with other constraints (\code{mask.sp} and \code{exclude.prevalence}).}

\item{mask.sp}{a scalar or a binary square species x species matrix defining which species interactions to exclude
(0) or include (1) \emph{a priori}. If a scalar (0 or 1), 0 excludes all interactions, 1 allows all interactions.
If a matrix, species in the columns affect species in the rows, so, setting \code{mask.sp[3, 8] <- 0}
means that species #8 is assumed \emph{a priori} to not affect species #3.
This constraint is cumulative with other constraints (\code{forbidden} and \code{exclude.prevalence}).}

\item{exclude.prevalence}{exclude species interactions caused by species
with prevalence equal to or lower than this value. This constraint is cumulative with
other constraints (\code{forbidden} and \code{mask.sp}).}

\item{regularization}{a two-element numeric vector defining the regularization lambdas used for
environmental coefficients and for species interactions respectively. See details.}

\item{regularization.type}{one of "lasso", "ridge" or "hybrid", defining the type of penalty to apply.
Type "hybrid" applies a ridge penalty to environmental coefficients and a LASSO penalty to interaction coefficients.}

\item{penalty}{the penalty applied to the number of species interactions to include, during variable selection.}

\item{theta.threshold}{exclude species interactions (from network selection) whose preliminary coefficient (in absolute value)
is lower than this value. This exclusion criterion is cumulative with the other user-defined exclusions.}

\item{latent.lambda}{the regularization applied to latent variables and respective coefficients
when estimating their values in samples.}

\item{fit.all.with.latents}{logical. Whether to use the previously estimated latent variables
when estimating the preliminary species interactions.}

\item{popsize.sel}{the population size for the genetic algorithm, expressed as a multiplier
of the recommended minimum. Ignored if \code{do.selection=FALSE}.}

\item{n.cores}{the number of CPU cores to use in the variable selection stage and in the optimization.}

\item{parallel}{logical. Whether to use \code{\link[optimParallel]{optimParallel}} during optimizations
instead of \code{\link[stats]{optim}}.}

\item{true.model}{for validation purposes only: the true model that has generated the data, to which
the estimated coefficients will be compared in each selection algorithm iteration.}

\item{do.selection}{logical. Conduct the variable selection stage, over species interaction network topology?}

\item{do.plots}{logical. Plot diagnostic and trace plots?}

\item{fast}{a logical defining whether to do a fast - but less accurate - estimation, or a normal estimation.}

\item{refit.selected}{logical. Refit with exact estimates the best model after network selection? Note that,
for performance reasons, the models fit during the network selection stage use an approximate likelihood.}
}
\value{
An \code{eicm.list} with the following components:
\describe{
  \item{true.model:}{a copy of the \code{true.model} argument.}
  \item{latents.only:}{the model with only the latent variables estimated.}
  \item{fitted.model:}{the model with only the species interactions estimated.}
  \item{selected.model:}{the final model with all coefficients estimated, after network topology selection.
                         This is the "best" model given the selection criterion (which depends on
                         \code{regularization} and \code{penalty}).}
}
When accessing the results, remember to pick the model you want (usually, \code{selected.model}).
\code{\link{plot}} automatically picks \code{selected.model} or, if that is NULL, \code{fitted.model}.
}
\description{
Given species occurrence data and (optionally) measured environmental predictors,
fits and selects an EICM that models species occurrence probability as a function of
measured predictors, unmeasured predictors (latent variables) and direct species interactions.
}
\details{
An Explicit Interaction Community Model (EICM) is a simultaneous equation linear model in which each
species model integrates all the other species as predictors, along with measured and latent variables.

This is the main function for fitting EICMs, and is preferred over using \code{\link{eicm.fit}} directly.

This function conducts the fitting and network topology selection workflow, which includes three stages:
1) estimate latent variable values; 2) make preliminary estimates for species interactions;
3) conduct network topology selection over a reduced model (based on the preliminary estimates).

The selection stage is optional. If not conducted, the species interactions are estimated
(all or a subset according to the user-provided constraints), but not selected.
See \code{vignette("eicm")} for commented examples of excluding interactions \emph{a priori}.

Missing data in the response matrix is allowed.
}
\examples{
# refer to the vignette for a more detailed explanation
\donttest{
# This can take some time to run

# Load the included parameterized model
data(truemodel)

# make one realization of the model
occurrences <- predict(truemodel, nrepetitions=1)

# Fit and select a model with 2 latent variables to be estimated and all
# interactions possible
m <- eicm(occurrences, n.latent=2, penalty=4, theta.threshold=0.5, n.cores=2)
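
# A minimal sketch of picking a model from the returned eicm.list, assuming
# its components can be accessed with `$` as in a regular list.
# `final` is an illustrative name, not part of the package.
# Use the selected model, falling back to the fitted model if it is NULL.
final <- if(is.null(m$selected.model)) m$fitted.model else m$selected.model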

plot(m)
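
# A sketch of excluding a single interaction a priori with mask.sp, assuming
# the data has at least 8 species. `n.species`, `mask` and `m.masked` are
# illustrative names, not part of the package.
# Species in the columns affect species in the rows.
n.species <- ncol(occurrences)
mask <- matrix(1, nrow=n.species, ncol=n.species)
mask[3, 8] <- 0   # species #8 is assumed not to affect species #3
m.masked <- eicm(occurrences, n.latent=2, mask.sp=mask, n.cores=2)

# A sketch of skipping the selection stage: interactions are estimated but
# not selected, so the model of interest is expected to be in fitted.model.
# `m.noselection` is an illustrative name.
m.noselection <- eicm(occurrences, n.latent=2, do.selection=FALSE, n.cores=2)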
}
}
\seealso{
\code{\link{eicm-package}}, \code{\link{eicm.fit}}, \code{\link{plot.eicm}}
}
