% Generated by roxygen2: do not edit by hand
% Please edit documentation in R/Sequoia_Main.R
\name{sequoia}
\alias{sequoia}
\title{Pedigree Reconstruction}
\usage{
sequoia(GenoM = NULL, LifeHistData = NULL, SeqList = NULL,
  MaxSibIter = 10, Err = 1e-04, MaxMismatch = 3, Tfilter = -2,
  Tassign = 0.5, MaxSibshipSize = 100, DummyPrefix = c("F", "M"),
  Complex = "full", FindMaybeRel = TRUE, CalcLLR = TRUE, quiet = FALSE)
}
\arguments{
\item{GenoM}{numeric matrix with genotype data: One row per individual, and
one column per SNP, coded as 0, 1, 2 or -9 (missing). Use
\code{\link{GenoConvert}} to convert genotype files created in PLINK using
--recodeA or in Colony's 2-column format to this format.}

\item{LifeHistData}{Dataframe with 3 columns:
\itemize{
\item{ID: }{max. 30 characters long,}
\item{Sex: }{1 = females, 2 = males, other = unknown, except 4 = hermaphrodite,}
\item{BY: }{(birth or hatching year) Negative numbers are
  interpreted as missing values.}}
If the species has multiple generations per year, use an integer coding
 such that the candidate parents' `Birth year' is at least one smaller than
 that of their putative offspring.}

\item{SeqList}{list with output from a previous run, containing the elements
`Specs', `AgePriors' and/or `PedigreePar', as described below, to be used
 in the current run. If SeqList$Specs is provided, all other input
 parameter values except MaxSibIter are ignored.}

\item{MaxSibIter}{Number of iterations of sibship clustering, including
assignment of grandparents to sibships and avuncular relationships between
sibships. Set to 0 to not (yet) perform this step, which is by far the
most time-consuming and may take several hours for large datasets.
Clustering is iterated until convergence or until MaxSibIter is reached.}

\item{Err}{Estimated genotyping error rate. The error model aims to deal
with scoring errors typical for SNP arrays.}

\item{MaxMismatch}{Maximum number of loci at which candidate parent and
offspring are allowed to be opposite homozygotes.}

\item{Tfilter}{Threshold log10-likelihood ratio (LLR) between a proposed
relationship versus unrelated, to select candidate relatives. Typically a
negative value, related to the fact that unconditional likelihoods are
calculated during the filtering steps. More negative values may decrease
non-assignment, but will increase computational time.}

\item{Tassign}{Minimum LLR required for acceptance of a proposed
relationship, relative to the next most likely relationship. Higher
values result in more conservative assignments. Must be zero or positive.}

\item{MaxSibshipSize}{Maximum number of offspring for a single individual
(a generous safety margin is advised).}

\item{DummyPrefix}{character vector of length 2 with prefixes for dummy
dams (mothers) and sires (fathers); maximum 20 characters each.}

\item{Complex}{Either "full" (default), "simp" (simplified, no explicit
consideration of inbred relationships; not fully implemented yet) or
"mono" (monogamous).}

\item{FindMaybeRel}{Identify pairs of non-assigned likely relatives after
pedigree reconstruction. Can be time-consuming in large datasets.}

\item{CalcLLR}{Calculate log-likelihood ratios for all assignments. Can be
time-consuming in large datasets.}

\item{quiet}{suppress messages.}
}
\value{
A list with some or all of the following components:
\item{AgePriors}{Matrix with age-difference based prior probabilities.}
\item{DummyIDs}{Dataframe with pedigree for dummy individuals, as well as
  their sex, estimated birth year (point estimate, upper and lower bound of
  95\% confidence interval), number of offspring, and offspring IDs.}
\item{DupGenoID}{Dataframe, row numbers of duplicated IDs in genotype file.
  Remove or relabel these to avoid downstream confusion.}
\item{DupGenotype}{Dataframe, duplicated genotypes (with or without
  identical IDs). The specified number of maximum mismatches is allowed,
  and this dataframe may include pairs of closely related individuals.}
\item{DupLifeHistID}{Dataframe, row numbers of duplicated IDs in life
  history dataframe.}
\item{LifeHist}{Provided dataframe with sex and birth year data.}
\item{MaybeParent}{Dataframe with pairs of individuals who are more likely
  parent-offspring than unrelated, but which could not be phased due to
  unknown age difference (coded as 999) or sex, or for whom LLR did not pass
  Tassign.}
\item{MaybeRel}{Dataframe with pairs of individuals who are more likely
  to be first or second degree relatives than unrelated, but which could not
  be assigned.}
\item{NoLH}{Vector, IDs in genotype data for which no life history data is
 provided.}
\item{Pedigree}{Dataframe with assigned genotyped and dummy parents from
  Sibship step; entries for dummy individuals are added at the bottom.}
\item{PedigreePar}{Dataframe with assigned parents from Parentage step.}
\item{Specs}{Named vector with parameter values.}
\item{TotLikParents}{Numeric vector, total likelihood of the genotype data
  at initiation and after each iteration during Parentage.}
\item{TotLikSib}{Numeric vector, total likelihood of the genotype data
  at initiation and after each iteration during Sibship clustering.}


List elements PedigreePar and Pedigree both have the following columns:
 \item{id}{Individual ID}
 \item{dam}{Assigned mother, or NA}
 \item{sire}{Assigned father, or NA}
 \item{LLRdam}{Log10-Likelihood Ratio (LLR) of this female being the mother,
   versus the next most likely relationship between the focal individual and
   this female (see Details for relationships considered)}
 \item{LLRsire}{idem, for male parent}
 \item{LLRpair}{LLR for the parental pair, versus the next most likely
  configuration between the three individuals (with one or neither parent
  assigned)}
In addition, PedigreePar has the columns
 \item{OHdam}{Number of loci at which the offspring and mother are
   opposite homozygotes}
 \item{OHsire}{idem, for father}
}
\description{
Perform pedigree reconstruction based on SNP data, including parentage
assignment and sibship clustering.
}
\details{
For each pair of candidate relatives, the likelihoods are calculated of them
 being parent-offspring (PO), full siblings (FS), half siblings (HS),
 grandparent-grandoffspring (GG), full avuncular (niece/nephew - aunt/uncle;
 FA), half avuncular/great-grandparental/cousins (HA), or unrelated (U).
 Assignments are made if the log10-likelihood ratio (LLR) between the focal
 relationship and the most likely alternative exceeds the threshold Tassign.

Further explanation of the various options and interpretation of the output
 is provided in the vignette.
}
\examples{
data(SimGeno_example, LH_HSg5, package="sequoia")
head(SimGeno_example[,1:10])
head(LH_HSg5)
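# A minimal sketch of the LifeHistData layout described under Arguments.
# LH_toy and its values are made up for illustration and are not part of the
# package data: Sex is 1 = female, 2 = male, other = unknown, and a negative
# BY is interpreted as missing.
LH_toy <- data.frame(ID  = c("a01", "a02", "a03"),
                     Sex = c(1, 2, 3),
                     BY  = c(2001, 2001, -1),
                     stringsAsFactors = FALSE)
LH_toy
# IDs should be at most 30 characters long:
all(nchar(LH_toy$ID) <= 30)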
SeqOUT <- sequoia(GenoM = SimGeno_example,
                  LifeHistData = LH_HSg5, MaxSibIter = 0)
names(SeqOUT)
SeqOUT$PedigreePar[34:42, ]
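# A small sketch of how the parentage output might be summarised, assuming
# PedigreePar carries the columns documented under Value (dam, sire, OHdam).
# Number of individuals with a dam, respectively a sire, assigned:
colSums(!is.na(SeqOUT$PedigreePar[, c("dam", "sire")]))
# Distribution of opposite-homozygote counts in the OHdam column:
table(SeqOUT$PedigreePar$OHdam, useNA = "ifany")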
\dontrun{
SeqOUT2 <- sequoia(GenoM = SimGeno_example,
                  LifeHistData = LH_HSg5, MaxSibIter = 10)
SeqOUT2$Pedigree[34:42, ]
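
# Hedged variation: a more conservative parentage-only run, assuming a higher
# genotyping error rate (Err) and requiring stronger support for each
# assignment (Tassign). The values 0.001 and 1 are illustrative only, not
# recommendations, and SeqOUT_strict is a name chosen for this sketch.
SeqOUT_strict <- sequoia(GenoM = SimGeno_example, LifeHistData = LH_HSg5,
                         MaxSibIter = 0, Err = 0.001, Tassign = 1)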

# reading in data from text files:
GenoM <- as.matrix(read.table("MyGenoData.txt", row.names=1, header=FALSE))
LH <- read.table("MyLifeHistData.txt", header=TRUE)
MySeqOUT <- sequoia(GenoM = GenoM, LifeHistData = LH)
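
# Hedged sketch of re-using a previous run through SeqList: the parentage-only
# output SeqOUT from above is passed back in, so that its Specs, AgePriors and
# PedigreePar are used and sibship clustering is added on top. As described
# for the SeqList argument, MaxSibIter is the one parameter still taken from
# this call when SeqList$Specs is present. Passing LifeHistData again is an
# assumption made here to be on the safe side.
SeqOUT3 <- sequoia(GenoM = SimGeno_example, LifeHistData = LH_HSg5,
                   SeqList = SeqOUT, MaxSibIter = 10)
head(SeqOUT3$DummyIDs)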
}
}
\author{
Jisca Huisman, \email{jisca.huisman@gmail.com}
}
\references{
Huisman, J. Pedigree reconstruction from SNP data: Parentage
  assignment, sibship clustering, and beyond. (accepted manuscript) Molecular
  Ecology Resources
}
\seealso{
\code{\link{GenoConvert}, \link{SimGeno}, \link{PedCompare}}, vignette("sequoia")
}

