% Generated by roxygen2: do not edit by hand
% Please edit documentation in R/phylosem.R
\name{phylosem}
\alias{phylosem}
\title{Fit phylogenetic structural equation model}
\usage{
phylosem(
  sem,
  tree,
  data,
  family = rep("fixed", ncol(data)),
  covs = colnames(data),
  estimate_ou = FALSE,
  estimate_lambda = FALSE,
  estimate_kappa = FALSE,
  data_labels = rownames(data),
  quiet = FALSE,
  newtonsteps = 1,
  tmb_inputs = NULL,
  run_model = TRUE,
  ...
)
}
\arguments{
\item{sem}{structural equation model structure, passed to either \code{\link[sem]{specifyModel}}
or \code{\link[sem]{specifyEquations}} and then parsed to control
the set of path coefficients and variance-covariance parameters}

\item{tree}{phylogenetic structure, as an object of class \code{phylo} (see \code{\link[ape]{as.phylo}})}

\item{data}{data-frame providing the variables being modeled.  Missing values are coded
as NA.  If an SEM includes a latent variable (i.e., a variable with no available measurements),
it must still be included as a column of \code{data} with entirely NA values.}

\item{family}{Character-vector listing the distribution used for each column of \code{data}, where
each element must be \code{fixed}, \code{normal}, \code{binomial}, or \code{poisson}.
\code{family="fixed"} is default behavior and assumes that a given variable is measured exactly.
Other options correspond to different specifications of measurement error.}

\item{covs}{optional: a character vector of one or more elements, with each element
  	giving a string of variable names, separated by commas. Variances and covariances
  	among all variables in each such string are added to the model. For confirmatory
  	factor analysis models specified via \code{cfa}, \code{covs} defaults to all of
  	the factors in the model, thus specifying all variances and covariances among these factors.
  	\emph{Warning}: \code{covs="x1, x2"} and \code{covs=c("x1", "x2")} are \emph{not}
  	equivalent: \code{covs="x1, x2"} specifies the variance of \code{x1}, the variance
  	of \code{x2}, \emph{and} their covariance, while \code{covs=c("x1", "x2")} specifies
  	the variance of \code{x1} and the variance of \code{x2} \emph{but not} their covariance.}

\item{estimate_ou}{Boolean indicating whether to estimate an autoregressive (Ornstein-Uhlenbeck)
process using additional parameter \code{lnalpha},
corresponding to the \code{model="OUrandomRoot"} parameterization from \pkg{phylolm}
as listed in \doi{10.1093/sysbio/syu005}}

\item{estimate_lambda}{Boolean indicating whether to estimate additional branch lengths for
phylogenetic tips (a.k.a. the Pagel-lambda term) using additional parameter \code{logitlambda}}

\item{estimate_kappa}{Boolean indicating whether to estimate a nonlinear scaling of branch
lengths (a.k.a. the Pagel-kappa term) using additional parameter \code{lnkappa}}

\item{data_labels}{Vector giving, for each row of \code{data}, the corresponding name in
\code{tree$tip.label}.  Defaults to \code{rownames(data)}.}

\item{quiet}{if \code{FALSE}, the default, then the number of input lines is reported and
    a message is printed suggesting that \code{specifyEquations} or \code{cfa} be used.}

\item{newtonsteps}{Integer specifying the number of extra Newton steps to take
after optimization (alternative to \code{loopnum}).
Each Newton step requires calculating the Hessian matrix and is therefore slow.
For well-behaved models, however, each Newton step will typically
decrease the maximum gradient of the log-likelihood with respect to each fixed effect,
so this option can be used to achieve an arbitrarily low final gradient
given sufficient time.  For poorly-behaved models, e.g.,
when fixed effects are at upper or lower bounds, this option may behave unpredictably.}

\item{tmb_inputs}{optional tagged list that overrides the default constructor
for TMB inputs (use at your own risk)}

\item{run_model}{Boolean indicating whether to estimate parameters (the default), or
instead to return the model inputs and compiled TMB object without running the model.}

\item{...}{Additional parameters passed to \code{\link{fit_tmb}}}
}
\value{
An object (list) of class \code{phylosem}. Elements include:
\describe{
\item{data}{Copy of argument \code{data}}
\item{SEM_model}{SEM model parsed from \code{sem} using \code{\link[sem]{specifyModel}} or \code{\link[sem]{specifyEquations}}}
\item{obj}{TMB object from \code{\link[TMB]{MakeADFun}}}
\item{tree}{Copy of argument \code{tree}}
\item{tmb_inputs}{The list of inputs passed to \code{\link[TMB]{MakeADFun}}}
\item{opt}{The output from \code{\link{fit_tmb}}}
\item{report}{The output from \code{obj$report()}}
\item{parhat}{The output from \code{obj$env$parList()} containing maximum likelihood estimates and empirical Bayes predictions}
}
}
\description{
Fits a phylogenetic structural equation model
}
\details{
Note that parameters \code{logitlambda}, \code{lnkappa}, and \code{lnalpha}, if estimated, each have a single value
     that applies to all modeled variables.
     This differs from default behavior in \pkg{phylolm}, where these parameters only apply to the "response" and not "predictor" variables.
     This also differs from default behavior in \pkg{phylopath}, where a different value is estimated
     in each call to \pkg{phylolm} during the d-separation estimate of path coefficients. However, it is
     consistent with default behavior in \pkg{Rphylopars}, and estimates should be comparable in that case.
     These additional parameters are estimated with unbounded support, which differs somewhat from default
     bounded estimates in \pkg{phylolm}, although parameters should match if overriding \pkg{phylolm} defaults
     to use unbounded support.  Finally, \code{phylosem} allows these three parameters to be estimated in any
     combination, which is expanded functionality relative to the single-option functionality in \pkg{phylolm}.

Also note that \pkg{phylopath} by default uses standardized coefficients.  To achieve matching parameter estimates between
     \pkg{phylosem} and \pkg{phylopath}, standardize each variable to have a standard deviation of 1.0 prior to fitting with \pkg{phylosem}.
}
\examples{
# Load data set
data(rhino, rhino_tree, package="phylopath")

# Run phylosem
model = "
  DD -> RS, p1
  BM -> LS, p2
  BM -> NL, p3
  NL -> DD, p4
"
psem = phylosem( sem = model,
          data = rhino[,c("BM","NL","DD","RS","LS")],
          tree = rhino_tree )
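
# Hedged sketch (not part of the original example): the Details section suggests
# standardizing each variable to a standard deviation of 1.0 so that estimates
# are comparable with phylopath.  The object names `rhino_scaled` and
# `psem_scaled` are illustrative only.
rhino_scaled = rhino[,c("BM","NL","DD","RS","LS")]
# Divide each column by its standard deviation; `[]` keeps the data-frame
# structure and row names, which supply the default `data_labels`
rhino_scaled[] = lapply( rhino_scaled, function(x) x / sd(x, na.rm = TRUE) )
psem_scaled = phylosem( sem = model,
          data = rhino_scaled,
          tree = rhino_tree )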

# Convert and plot using phylopath
library(phylopath)
my_fitted_DAG = as_fitted_DAG(psem)
coef_plot( my_fitted_DAG )
plot( my_fitted_DAG )

# Convert to phylo4d
my_phylo4d = as_phylo4d(psem)

# Convert to sem
library(sem)
my_sem = as_sem(psem)
pathDiagram( model = my_sem,
                  style = "traditional",
                  edge.labels = "values" )
effects( my_sem )

# Convert and plot using semPlot
library(semPlot)
myplot = semPlotModel( my_sem )
semPaths( my_sem,
                   nodeLabels = myplot@Vars$name )

# Convert and plot using phylosignal
library(phylosignal)
dotplot( my_phylo4d )
gridplot( my_phylo4d )

# Cluster based on phylogeny and traits
gC = graphClust( my_phylo4d,
                 lim.phylo = 5,
                 lim.trait = 5,
                 scale.lim = FALSE)
plot(gC, which = "graph", ask = FALSE)
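
# Hedged sketch (not part of the original example): estimate Pagel's lambda
# jointly with the path coefficients.  The object name `psem_lambda` is
# illustrative, and the element name `logitlambda` within `parhat` is assumed
# to match the parameter name given in the argument documentation.
psem_lambda = phylosem( sem = model,
          data = rhino[,c("BM","NL","DD","RS","LS")],
          tree = rhino_tree,
          estimate_lambda = TRUE )
# Back-transform the estimate from the logit scale to the (0,1) scale
plogis( psem_lambda$parhat$logitlambda )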

}
\references{
\strong{Introducing the package, its features, and comparison with other software
(to cite when using phylosem):}

Thorson, J. T., & van der Bijl, W. (In revision). phylosem: A fast and simple
R package for phylogenetic inference and trait imputation using phylogenetic
structural equation models.

\emph{Statistical methods for phylogenetic structural equation models:}

Thorson, J. T., Maureaud, A. A., Frelat, R., Merigot, B., Bigman, J. S., Friedman,
S. T., Palomares, M. L. D., Pinsky, M. L., Price, S. A., & Wainwright, P. (2023).
Identifying direct and indirect associations among traits by merging phylogenetic
comparative methods and structural equation models. Methods in Ecology and Evolution,
14(5), 1259-1275. \doi{10.1111/2041-210X.14076}

\emph{Earlier development of computational methods, originally used for phylogenetic factor analysis:}

Thorson, J. T. (2020). Predicting recruitment density dependence and intrinsic growth rate for all fishes
worldwide using a data-integrated life-history model. Fish and Fisheries, 21(2),
237-251. \doi{10.1111/faf.12427}

Thorson, J. T., Munch, S. B., Cope, J. M., & Gao, J. (2017). Predicting life
history parameters for all fishes worldwide. Ecological Applications, 27(8),
2262-2276. \doi{10.1002/eap.1606}

\emph{Earlier development of phylogenetic path analysis:}

van der Bijl, W. (2018). phylopath: Easy phylogenetic path analysis in
R. PeerJ, 6, e4718. \doi{10.7717/peerj.4718}

von Hardenberg, A., & Gonzalez-Voyer, A. (2013). Disentangling
evolutionary cause-effect relationships with phylogenetic confirmatory
path analysis. Evolution; International Journal of Organic Evolution,
67(2), 378-387. \doi{10.1111/j.1558-5646.2012.01790.x}

\emph{Interface involving SEM "arrow notation" is repurposed from:}

Fox, J., Nie, Z., & Byrnes, J. (2020). sem: Structural equation models.
R package version 3.1-11. \url{https://CRAN.R-project.org/package=sem}

\emph{Coercing output to phylo4d depends upon:}

Bolker, B., Butler, M., Cowan, P., de Vienne, D., Eddelbuettel, D., Holder, M.,
Jombart, T., Kembel, S., Michonneau, F., & Orme, B. (2015). phylobase:
Base package for phylogenetic structures and comparative data. R Package Version 0.8.0.
\url{https://CRAN.R-project.org/package=phylobase}

\emph{Laplace approximation for parameter estimation depends upon:}

Kristensen, K., Nielsen, A., Berg, C. W., Skaug, H., & Bell, B. M. (2016).
TMB: Automatic differentiation and Laplace approximation. Journal of Statistical Software,
70(5), 1-21. \doi{10.18637/jss.v070.i05}
}
