% Generated by roxygen2: do not edit by hand
% Please edit documentation in R/ggm_compare_bf.default.R
\name{ggm_compare_explore}
\alias{ggm_compare_explore}
\title{GGM Compare: Exploratory Hypothesis Testing}
\usage{
ggm_compare_explore(
  ...,
  formula = NULL,
  type = "continuous",
  mixed_type = NULL,
  analytic = FALSE,
  prior_sd = 0.2,
  iter = 5000,
  progress = TRUE,
  seed = 1
)
}
\arguments{
\item{...}{At least two matrices (or data frames) of dimensions \emph{n} (observations) by \emph{p} (variables).}

\item{formula}{An object of class \code{\link[stats]{formula}}. This allows for including
control variables in the model (e.g., \code{~ gender}).}

\item{type}{Character string. The type of data in \code{Y}. The options include \code{continuous},
\code{binary}, \code{ordinal}, or \code{mixed}. See the note for further details.}

\item{mixed_type}{Numeric vector. An indicator of length \emph{p} for which variables should be treated as ranks
(1 for rank and 0 to assume normality). The default is currently (dev version) to treat all integer variables
as ranks when \code{type = "mixed"} and \code{NULL} otherwise. See note for further details.}

\item{analytic}{Logical. Should the analytic solution be computed (default is \code{FALSE})? See note for details.}

\item{prior_sd}{Numeric. The scale of the prior distribution (centered at zero), in reference to a beta distribution.
The default is \code{0.20}. See note for further details.}

\item{iter}{Numeric. The number of iterations (posterior samples; defaults to 5000).}

\item{progress}{Logical. Should a progress bar be included (defaults to \code{TRUE})?}

\item{seed}{An integer for the random seed.}
}
\value{
The returned object of class \code{ggm_compare_explore} contains a lot of information that
        is used for printing and plotting the results. For users of \strong{BGGM}, the following
        are the useful objects:

\itemize{

\item \code{BF_01} A \emph{p} by \emph{p} matrix including
                    the Bayes factor for the null hypothesis.

\item \code{pcor_diff} A \emph{p} by \emph{p} matrix including
                       the difference in partial correlations (only for two groups).

\item \code{samp} A list containing the fitted models (of class \code{explore}) for each group.

}
}
\description{
Compare Gaussian graphical models with exploratory hypothesis testing using the matrix-F prior
distribution \insertCite{Mulder2018}{BGGM}. Each partial correlation in the model is tested, for any number
of groups. This provides evidence for the null hypothesis of no difference and for the alternative hypothesis
of a difference. With more than two groups, the test is for \emph{all} groups simultaneously (i.e., the relation
is the same or different in all groups). This method was introduced in \insertCite{williams2020comparing;textual}{BGGM}.
For confirmatory hypothesis testing see \code{confirm_groups}.
}
\details{
\strong{Controlling for Variables}:

When controlling for variables, it is assumed that \code{Y} includes \emph{only}
the nodes in the GGM and the control variables. Internally, \emph{only} the predictors
that are included in \code{formula} are removed from \code{Y}. This is not the behavior of, say,
\code{\link{lm}}, but was adopted to ensure users do not have to write out each variable that
should be included in the GGM. An example is provided below.

\strong{Mixed Type}:

 The term "mixed" is somewhat of a misnomer, because the method can be used for data including \emph{only}
 continuous or \emph{only} discrete variables. This is based on the ranked likelihood which requires sampling
 the ranks for each variable (i.e., the data is not merely transformed to ranks). This is computationally
 expensive when there are many levels. For example, with continuous data, there are as many ranks
 as data points!

 The option \code{mixed_type} allows the user to determine which variables should be treated as ranks;
 the "empirical" distribution is used otherwise. This is accomplished by specifying an indicator
 vector of length \emph{p}. A one indicates to use the ranks, whereas a zero indicates to "ignore"
 that variable. By default all integer variables are handled as ranks.

\strong{Dealing with Errors}:

An error is most likely to arise when \code{type = "ordinal"}. There are two common errors (although still rare):

\itemize{

\item The first is due to sampling the thresholds, especially when the data is heavily skewed.
      This can result in an ill-defined matrix. If this occurs, we recommend first
      decreasing \code{prior_sd} (i.e., a more informative prior). If that does not work, then
      change the data type to \code{type = "mixed"}, which then estimates a copula GGM
      (this method can be used for data containing \strong{only} ordinal variables). This should
      work without a problem.

\item  The second is due to how the ordinal data are categorized. For example, if the error states
       that the index is out of bounds, this indicates that the first category is a zero. This is not allowed, as
       the first category must be one. This is addressed by adding one (e.g., \code{Y + 1}) to the data matrix.

}
}
\note{
\strong{"Default" Prior}:

 In Bayesian statistics, a default Bayes factor needs to have several properties. We refer
 interested users to \insertCite{@section 2.2 in @dablander2020default;textual}{BGGM}. In
 \insertCite{Williams2019_bf;textual}{BGGM}, some of these properties were investigated, such
 as model selection consistency. That said, we would not consider this a "default" Bayes factor and
 thus we encourage users to perform sensitivity analyses by varying the scale of the prior
 distribution.

 Furthermore, it is important to note there is no "correct" prior and, also, there is no need
 to entertain the possibility of a "true" model. Rather, the Bayes factor can be interpreted as
 which hypothesis best (relative to each other) predicts the observed data
 \insertCite{@Section 3.2 in @Kass1995}{BGGM}.

\strong{Interpretation of Conditional (In)dependence Models for Latent Data}:

See \code{\link{BGGM-package}} for details about interpreting GGMs based on latent data
(i.e., all data types besides \code{"continuous"}).
}
\examples{

\donttest{
# note: iter = 250 for demonstrative purposes

# data
Y <- bfi

# males and females
Ymale <- subset(Y, gender == 1,
                   select = -c(gender,
                               education))[,1:10]

Yfemale <- subset(Y, gender == 2,
                     select = -c(gender,
                                 education))[,1:10]
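
# The ordinal model requires the first category to be one (see Details).
# A quick check, added here as a sketch: the smallest observed response
# in each group. If a group were coded from zero, adding one to that
# data matrix (e.g., Ymale + 1) would be needed before fitting.
min_category <- c(male = min(Ymale, na.rm = TRUE),
                  female = min(Yfemale, na.rm = TRUE))
min_category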

##########################
### example 1: ordinal ###
##########################

# fit model
fit <- ggm_compare_explore(Ymale,  Yfemale,
                           type = "ordinal",
                           iter = 250,
                           progress = FALSE)
# summary
summ <- summary(fit)

# edge set
E <- select(fit)
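
#################################
### sketch: prior sensitivity ###
#################################

# A minimal sketch of the sensitivity analysis encouraged in the note;
# it is not part of the original examples. The model from example 1 is
# refit under a few prior scales. The values in prior_scales are
# arbitrary illustrations, not recommendations.
prior_scales <- c(0.1, 0.2, 0.3)

fits <- lapply(prior_scales, function(s) {
  ggm_compare_explore(Ymale, Yfemale,
                      type = "ordinal",
                      prior_sd = s,
                      iter = 250,
                      progress = FALSE)
})

# Bayes factor for the null hypothesis (BF_01) for the relation between
# the first two nodes, one value per prior scale
bf_12 <- sapply(fits, function(x) x$BF_01[1, 2])
names(bf_12) <- paste0("prior_sd = ", prior_scales)
bf_12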
}
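
\donttest{
# A hedged sketch of controlling for a variable (see Details); it is not
# part of the original examples. Each group keeps the first ten items
# plus the control variable, and formula names the control variable only.
# The object names keep, Ymale_ed, Yfemale_ed, and fit_ed are
# illustrative. Continuous data is assumed here to keep the sketch simple.
Y <- bfi
keep <- c(colnames(Y)[1:10], "education")

Ymale_ed <- subset(Y, gender == 1)[, keep]
Yfemale_ed <- subset(Y, gender == 2)[, keep]

# education is removed from the nodes internally and used as a predictor
fit_ed <- ggm_compare_explore(Ymale_ed, Yfemale_ed,
                              formula = ~ education,
                              type = "continuous",
                              iter = 250,
                              progress = FALSE)

# summary
summary(fit_ed)
}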

}
\references{
\insertAllCited{}
}
