% Generated by roxygen2: do not edit by hand
% Please edit documentation in R/pre_proc_data.R
\name{pre_proc_data}
\alias{pre_proc_data}
\title{Pre-process Data}
\usage{
pre_proc_data(dat1, dat2, norm = TRUE, log = TRUE, center = TRUE)
}
\arguments{
\item{dat1}{The data for the first condition with samples (cells) as columns
and features (genes) as rows.}

\item{dat2}{The data for the second condition with samples (cells) as
columns and features (genes) as rows.}

\item{norm}{This parameter controls whether the data are normalized for
sequencing depth by dividing each column by the total number of reads for
that sample. We recommend that users apply one of the many methods developed
for normalizing scRNA-Seq data beforehand and then set this to \code{FALSE}.
The default value is \code{TRUE}.}

\item{log}{This parameter controls whether the data are transformed using
\code{log(x + 1)}. The default value is \code{TRUE}.}

\item{center}{This parameter controls whether the data are centered on a
gene-by-gene basis. We recommend that all users center their data before
applying SparseDC; only experienced users should set this to \code{FALSE}.
The default value is \code{TRUE}.}
}
\value{
This function returns the two pre-processed datasets stored as a list, with
the first condition as the first element and the second condition as the
second element.
}
\description{
This function pre-processes the data so that SparseDC can be applied.
SparseDC requires data that have been normalized for sequencing depth,
log-transformed and centered on a gene-by-gene basis. For the sequencing
depth normalization we recommend that users apply one of the many methods
developed for normalizing scRNA-Seq data before using SparseDC and then
set \code{norm = FALSE}. When \code{norm = TRUE}, this function normalizes
the data by dividing each column by the total number of reads for that
sample. When \code{log = TRUE}, the data are log-transformed by applying
\code{log(x + 1)} to each of the data sets. By far the most important
pre-processing step for SparseDC is centering the data. Centered data are a
core requirement of the SparseDC algorithm and are necessary both for
accurate clustering of the cells and for identifying marker genes. We
therefore recommend that all users center their data using this function
and that only experienced users set \code{center = FALSE}.
}
\examples{
set.seed(10)
# Select small dataset for example
data_test <- data_biase[1:100, ]
# Split data into condition A and B
data_A <- data_test[ , which(condition_biase == "A")]
data_B <- data_test[ , which(condition_biase == "B")]
# Pre-process the data
pre_data <- pre_proc_data(data_A, data_B, norm = FALSE, log = TRUE,
                          center = TRUE)
# Extract the pre-processed data for each condition
pdata_A <- pre_data[[1]]
pdata_B <- pre_data[[2]]
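
# Illustrative sketch only (not the package's internal code): the three
# steps named in the description, written in base R on a hypothetical toy
# count matrix. The names toy_A, toy_B and the intermediate objects are
# made up for this example.
# Assumption: centering subtracts each gene's mean taken over the pooled
# cells of both conditions; the exact rule used by pre_proc_data may differ.
toy_A <- matrix(c(1, 0, 3, 2, 5, 1), nrow = 3)  # 3 genes x 2 cells
toy_B <- matrix(c(0, 2, 4, 1, 1, 6), nrow = 3)  # 3 genes x 2 cells
# Step 1: sequencing depth normalization, dividing each column by its total
norm_A <- sweep(toy_A, 2, colSums(toy_A), "/")
norm_B <- sweep(toy_B, 2, colSums(toy_B), "/")
# Step 2: log(x + 1) transformation
log_A <- log(norm_A + 1)
log_B <- log(norm_B + 1)
# Step 3: gene-by-gene centering (a length-3 vector is recycled down each
# column, so row i has gene_means[i] subtracted)
gene_means <- rowMeans(cbind(log_A, log_B))
cen_A <- log_A - gene_means
cen_B <- log_B - gene_means
# Under the pooled-mean assumption each gene now averages to zero
rowMeans(cbind(cen_A, cen_B))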

}
