% Generated by roxygen2: do not edit by hand
% Please edit documentation in R/semLearn.R
\name{SEMdag}
\alias{SEMdag}
\title{Estimate the optimal DAG from an input graph}
\usage{
SEMdag(
  graph,
  data,
  LO = "TO",
  beta = 0,
  lambdas = NA,
  penalty = TRUE,
  verbose = FALSE,
  ...
)
}
\arguments{
\item{graph}{An igraph object.}

\item{data}{A matrix with n rows corresponding to subjects, and p columns
corresponding to graph nodes (variables).}

\item{LO}{A character string selecting the linear order method. If LO="TO"
(default), the topological order of the input DAG is used; if LO="TD", the
data-driven top-down minimum conditional variance method is performed.}

\item{beta}{Numeric value. Minimum absolute LASSO beta coefficient for
a new interaction to be retained in the final model. By default,
\code{beta = 0}.}

\item{lambdas}{A vector of LASSO regularization (lambda) values. If lambdas is
NULL, the \code{\link[glmnet]{glmnet}} default, with lambdas chosen by
cross-validation, is enabled. If lambdas is NA (default), a tuning-free scheme
is enabled by fixing \code{lambdas = sqrt(log(p)/n)} when LO="TO", as suggested
by Janková and van de Geer (2015), or
\code{lambdas = 2/sqrt(n) * qnorm(1 - 0.2/(2*p*(p-1)))} when LO="TD", as
suggested by Shojaie and Michailidis (2010). Fixing lambdas both reduces
computational time and provides the same result at each run.}

\item{penalty}{A logical value. Separate penalty factors can be applied to
each coefficient: each factor is a number that multiplies lambda to allow
differential shrinkage. A factor of 0 implies no shrinkage, so the
corresponding variable is always included in the model. If TRUE (default),
weights are based on the graph edges: 0 for an edge present in the input graph
and 1 for a missing edge, which ensures that the input edges are retained in
the final model. If FALSE, the \code{\link[glmnet]{glmnet}} default is enabled
(all weights equal to 1). Note: the penalty factors are internally rescaled to
sum to p (the number of variables).}

\item{verbose}{A logical value. If FALSE (default), the processed graphs
will not be plotted to screen.}

\item{...}{Currently ignored.}
}
\value{
A list of 3 igraph objects:
\enumerate{
\item "dag", the estimated DAG;
\item "dag.new", new estimated connections;
\item "dag.old", connections preserved from the input graph.
}
}
\description{
Extract the optimal DAG from an input graph, using a topological-order or
top-down order search combined with a LASSO-based algorithm, as implemented in
\code{\link[glmnet]{glmnet}}.
}
\details{
The optimal DAG is estimated using the order search approach. First,
a linear order of the p nodes is determined; given this order, the DAG is
learned using successive penalized (L1) regressions (Shojaie and Michailidis,
2010). The estimated linear order is obtained either from the \emph{a priori}
topological order (TO) of the input graph, or with a data-driven
high-dimensional top-down (TD) approach (best subset regression), assuming
a SEM whose error terms have equal variances (Peters and Bühlmann, 2014;
Chen et al., 2019). If the input graph is not acyclic, a warning message is
raised and a cycle-breaking algorithm is applied (see
\code{\link[SEMgraph]{graph2dag}} for details).
Output DAG edges are colored in gray if they were present in the input graph,
and in green if they are new edges generated by LASSO screening.
}
\examples{

# DAG estimation
G <- SEMdag(graph = sachs$graph, data = log(sachs$pkc), beta = 0.05)
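
# Alternative linear order (sketch): the data-driven top-down (TD) method,
# selected through the documented LO argument. The name G.td is only
# illustrative; results may differ from the default LO = "TO" run above.
G.td <- SEMdag(graph = sachs$graph, data = log(sachs$pkc), LO = "TD", beta = 0.05)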

# Model fitting
sem <- SEMrun(graph = G$dag, data = log(sachs$pkc), group = sachs$group)

# Graphs
old.par <- par(no.readonly = TRUE)
par(mfrow=c(2,2), mar=rep(1,4))
plot(sachs$graph, layout=layout.circle, main = "Input graph")
plot(G$dag, layout=layout.circle, main = "Output DAG")
plot(G$dag.old, layout=layout.circle, main = "Inferred old edges")
plot(G$dag.new, layout=layout.circle, main = "Inferred new edges")
par(old.par)
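
# Tuning-free lambdas (sketch): the values that lambdas = NA is documented
# to fix, recomputed by hand. This is not the internal SEMdag code; it
# assumes n and p are the row and column counts of the data passed in.
X <- log(sachs$pkc)
n <- nrow(X)
p <- ncol(X)
lambda.TO <- sqrt(log(p)/n)                           # formula given for LO = "TO"
lambda.TD <- 2/sqrt(n) * qnorm(1 - 0.2/(2*p*(p-1)))   # formula given for LO = "TD"
c(TO = lambda.TO, TD = lambda.TD)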

}
\references{
Tibshirani R, Bien J, Friedman J, Hastie T, Simon N, Taylor J,
Tibshirani RJ (2012). Strong rules for discarding predictors in
lasso type problems. Journal of the Royal Statistical Society: Series B
(Statistical Methodology), 74(2): 245-266.
<https://doi.org/10.1111/j.1467-9868.2011.01004.x>

Shojaie A, Michailidis G (2010). Penalized likelihood methods for
estimation of sparse high-dimensional directed acyclic graphs.
Biometrika, 97(3): 519-538. <https://doi.org/10.1093/biomet/asq038>

Janková J, van de Geer S (2015). Confidence intervals for high-dimensional
inverse covariance estimation. Electronic Journal of Statistics,
9(1): 1205-1229. <https://doi.org/10.1214/15-EJS1031>

Peters J, Bühlmann P (2014). Identifiability of Gaussian structural equation
models with equal error variances. Biometrika, 101(1): 219-228.

Chen W, Drton M, Wang YS (2019). On Causal Discovery with an Equal-Variance
Assumption. Biometrika, 106(4): 973-980.
}
\seealso{
\code{\link[SEMgraph]{modelSearch}}
}
\author{
Mario Grassi \email{mario.grassi@unipv.it}
}
