% Generated by roxygen2: do not edit by hand
% Please edit documentation in R/model.R
\name{prepareComputePlan}
\alias{prepareComputePlan}
\title{Return a suitable compute plan for a genome-wide association study}
\usage{
prepareComputePlan(
  model,
  snpData,
  out = "out.log",
  ...,
  SNP = NULL,
  startFrom = 1L
)
}
\arguments{
\item{model}{A fully specified \link[OpenMx:MxModel-class]{MxModel} object that can be fit to each SNP.}

\item{snpData}{A character vector specifying the pathname of a file where the SNP data is stored.}

\item{out}{A character vector containing the pathname where the results of fitted models shall be written.}

\item{...}{Not used.  Forces remaining arguments to be specified by name.}

\item{SNP}{A vector of SNP indices to include in the analysis (e.g., \code{101:200} will run SNPs at offsets 101 to 200, counting from the beginning of the file); \code{NULL} is interpreted as all available SNPs.}

\item{startFrom}{The index of the first SNP to process when \code{SNP=NULL}.}
}
\value{
The given model with an appropriate compute plan.
}
\description{
\lifecycle{maturing}
Instead of using OpenMx's default model processing sequence (i.e.,
\link[OpenMx]{omxDefaultComputePlan}), it is more efficient and
convenient to assemble a compute plan tailored for a genome-wide
association study.  This function returns a compute plan that loads
the data for one SNP into \code{model}, fits the model, writes the
results to \code{out}, and repeats this procedure for all SNPs.
}
\details{
You can request a specific list of SNPs using the \code{SNP}
argument. The numbers provided in \code{SNP} refer to offsets in
the \code{snpData} file. For example, \code{SNP=c(100,200)} will
process the 100th and 200th SNPs. The first SNP in the
\code{snpData} file is at offset 1. When \code{SNP} is omitted,
all available SNPs are processed.
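
As a rough illustration of the two selection styles described above,
the following sketch reuses the example data shipped with the package.
The model name \code{"test"} and the particular offsets are arbitrary
choices made for illustration, not recommended settings.

\preformatted{
library(OpenMx)
library(gwsem)

dir <- system.file("extdata", package = "gwsem")
pgen <- file.path(dir, "example.pgen")
m <- mxModel("test", mxFitFunctionWLS())

# Explicit list: only the SNPs at offsets 3 and 5 (arbitrary choice)
planSome <- prepareComputePlan(m, pgen, SNP = c(3, 5))

# No list: SNP stays NULL and processing begins at offset 10
planRest <- prepareComputePlan(m, pgen, startFrom = 10L)
}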

The suffix of the \code{snpData} filename determines the on-disk
format of the SNP data. Suffixes
\sQuote{pgen}, \sQuote{bed}, and \sQuote{bgen} are supported.
Per-SNP descriptions are found in different places depending on the
suffix. For \sQuote{bgen}, both the SNP data and description are
built into the same file. In the case of \sQuote{pgen}, an
associated file with suffix \sQuote{pvar} is expected to exist (see
the
\href{https://www.cog-genomics.org/plink/2.0/formats#pvar}{spec}
for details). In the case of \sQuote{bed}, an associated
\sQuote{bim} file is expected to exist (see the
\href{https://www.cog-genomics.org/plink2/formats#bim}{spec} for
details). The chromosome, base-pair coordinate, and variant ID are
added to each line of \code{out}.
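
The suffix rules above can be summarized in a few lines of base R.
The helper \code{descriptionFile} is hypothetical and is not part of
this package. It assumes that the companion file shares its stem with
\code{snpData}, and it only illustrates where the per-SNP descriptions
are expected to be found.

\preformatted{
# Hypothetical helper (not provided by gwsem): map a SNP data file to
# the file that is expected to hold its per-SNP descriptions.
descriptionFile <- function(snpData) {
  stem <- tools::file_path_sans_ext(snpData)
  switch(tools::file_ext(snpData),
         pgen = paste0(stem, ".pvar"),  # separate 'pvar' file
         bed  = paste0(stem, ".bim"),   # separate 'bim' file
         bgen = snpData,                # descriptions are built in
         stop("Unsupported suffix: ", snpData))
}

descriptionFile("example.pgen")  # "example.pvar"
descriptionFile("example.bed")   # "example.bim"
}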

The implementation of \code{method='pgen'} is based on the PLINK 2.0
alpha code. PLINK's \sQuote{bed} file format is supported in addition
to \sQuote{pgen}. Data are coerced appropriately depending on the
type of the destination column. For a numeric column, data are
recorded as the values NA, 0, 1, or 2. An ordinal column must have
exactly 3 levels.

For \code{method='bgen'}, the file \code{path+".bgi"} must also
exist. If not available, generate this index file with the
\href{https://bitbucket.org/gavinband/bgen/wiki/bgenix}{bgenix}
tool.
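
A quick check for the index file can be sketched in base R before
building the compute plan. The path \code{"study.bgen"} is a
hypothetical placeholder. The sketch only reports a missing index and
does not create one.

\preformatted{
bgen <- "study.bgen"           # hypothetical path to a bgen file
index <- paste0(bgen, ".bgi")  # index file expected next to the data

if (!file.exists(index)) {
  message("Missing index file ", index,
          "; generate it with bgenix before fitting")
}
}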

For \sQuote{bgen} and \sQuote{pgen} formats, the numeric column can be
populated with a dosage (sum of probabilities multiplied by genotypes)
if these data are available.

A compute plan does not do anything by itself. You'll need to combine
the compute plan with a model (such as returned by \link{buildOneFac})
to perform a GWAS.
}
\examples{
m1 <- mxModel("test", mxFitFunctionWLS())
dir <- system.file("extdata", package = "gwsem")
m1 <- prepareComputePlan(m1, file.path(dir,"example.pgen"))
m1$compute
}
\seealso{
\link{GWAS}
}
