| Title: | Quick ATAC-Seq QC |
| Version: | 1.0.4 |
| Description: | A wrapper around the 'quaqc' program described in Tremblay and Questa (2024) <doi:10.1093/bioinformatics/btae649>. 'quaqc' allows for assay for transposase-accessible chromatin using sequencing (ATAC-seq) specific quality control and read filtering of next-generation sequencing (NGS) data with minimal processing time and extremely low memory overhead. Any number of samples can be processed, using multiple threads if desired. 'quaqc' outputs a comprehensive set of aligned read metrics, including alignment size, fragment size, percent duplicates, mapq scores, read depth, GC content, and others. Although designed for ATAC-seq data, 'quaqc' can also be used for other unspliced DNA sequencing experiments (such as chromatin immunoprecipitation sequencing, or ChIP-seq) as many of the metrics are related to general sequencing quality. This R package also provides additional utilities for custom analyses and plotting of 'quaqc' results. |
| URL: | https://github.com/bjmt/quaqcr |
| BugReports: | https://github.com/bjmt/quaqcr/issues |
| License: | GPL (≥ 3) |
| Encoding: | UTF-8 |
| Depends: | R (≥ 3.6.0) |
| Imports: | methods, utils, jsonlite |
| RoxygenNote: | 7.3.3 |
| Suggests: | testthat (≥ 3.0.0) |
| Config/testthat/edition: | 3 |
| SystemRequirements: | quaqc (https://github.com/bjmt/quaqc) |
| NeedsCompilation: | no |
| Packaged: | 2026-05-15 10:04:43 UTC; gok24jef |
| Author: | Benjamin Jean-Marie Tremblay
|
| Maintainer: | Benjamin Jean-Marie Tremblay <benjmtremblay@gmail.com> |
| Repository: | CRAN |
| Date/Publication: | 2026-05-19 09:30:02 UTC |
quaqcr: Quick ATAC-seq QC in R
Description
A wrapper around the 'quaqc' program alongside additional utilities.
Author(s)
Maintainer: Benjamin Jean-Marie Tremblay benjmtremblay@gmail.com (ORCID)
See Also
Useful links:
Get transcription factor footprints with quaqc.
Description
The TSS pileup feature of quaqc can instead be used to get single base resolution transcription factor footprints from ATAC-seq data. This function provides a convenient wrapper around such functionality. Target regions can be compared to unbound or background regions. (Note that no Tn5 bias correction is applied.)
Usage
footprint(target.motifs, bam.files, bkg.motifs = NULL, normalize = c("bkg",
"rpm", "no"), tss.size = 501, tss.qlen = 1, tss.tn5 = TRUE,
nfr = TRUE, verbose = 0, ...)
Arguments
target.motifs |
Either (1) a filename of a BED file containing
target motif positions, or (2) a |
bam.files |
Character vector of BAM file names. Must be coordinate
sorted. If no index file can be found, |
bkg.motifs |
Either (1) a filename of a BED file containing
target motif positions, or (2) a |
normalize |
"bkg": Converts read density into values relative to the background (the first 25% of the window). "rpm": Conver to reads per million. "no": Return as the average number of reads per window. |
tss.size |
Integer, size of the TSS region for pileup. |
tss.qlen |
Integer, resize reads (centered on the 5-prime end for pileup. |
tss.tn5 |
Logical, shift 5-prime end coordinates +4/-5 bases for pileup. |
nfr |
Logical, turn on NFR mode. |
verbose |
Integer, a value from 0 to 2 for the level of program verbosity. |
... |
See |
Value
A data.frame containing read pileup data.
Author(s)
Benjamin Jean-Marie Tremblay, benjmtremblay@gmail.com
See Also
Examples
bam <- "Sample.bam"
if (nzchar(Sys.which("quaqc")) && file.exists(bam)) {
TATA_peaks <- system.file("extdata", "tata_p.bed.gz", package = "quaqcr")
TATA_bkg <- system.file("extdata", "tata_n.bed.gz", package = "quaqcr")
footprint(TATA_peaks, bam, bkg.motifs = TATA_bkg)
}
Melt sections of a quaqc report into a data.frame.
Description
The quaqc report class type in R is divided into lists of lists,
which can require additional manipulation. This function will
"melt" these individual sections into data.frame objects.
Usage
melt_reports(report, section = c("bam_stats", "overview_unfilt",
"overview_filt", "nucl_stats", "nucl_addn", "peak_stats", "tss_stats",
"tss_pileup", "aln_hist", "frag_hist", "gc_hist", "depth_hist", "genome"),
use.basename = FALSE, normalize.tss = c("no", "bkg", "rpm"),
normalize.hist = c("no", "proportion", "max"))
Arguments
report |
A |
section |
The |
use.basename |
Whether to use the |
normalize.tss |
How to normalize the TSS pileup. "no": Keep the signal as the average number of reads per window. "bkg": Calculate the signal relative to the background (the first 25% of the window). "rpm": Convert to reads per million. |
normalize.hist |
How to normalize the alignment size, fragment size, GC percent, and read depth histograms. "no": Keep as the total number of reads per bin. "proportion": Divide by the sum of reads across all windows. "max": Divide by the max bin count. |
Value
A data.frame with varying columns based on the
section being melted.
Author(s)
Benjamin Jean-Marie Tremblay, benjmtremblay@gmail.com
See Also
Examples
report.file <- system.file("extdata", "report.json.gz", package = "quaqcr")
report <- parse_quaqc_file(report.file)
melt_reports(report, "overview_filt")
Parse a JSON quaqc report file.
Description
Parse the output of quaqc --json into a easier to use
quaqc-class object within R.
Usage
parse_quaqc(json.text)
parse_quaqc_file(json.file)
Arguments
json.text |
The JSON report as a character vector. |
json.file |
The JSON report filename. |
Details
A quaqc object is a higher level format encompassing the
quaqc run parameters (accessible via $metadata) and the actual
individual reports for each sample (accessible via $reports). The
reports are themselves quaqc_report-class objects with
multiple list slots, including:
-
$sample: The filename of the sample. -
$success: Whether the sample was successfully analyzed. -
$params: Values for all quaqc parameters used to analyze this sample. -
$genome: Data about the genome taken from the BAM header. -
$unfiltered: Basic stats about the total number of reads before filtering. -
$filtered: Contains the majority of the data output by quaqc.
This final $filtered slot itself is broken down into several sub-lists:
-
$overview: Average values for several stats such as fragment size. -
$nuclear$stats: Further breakdown of the previous stats for nuclear reads. -
$nuclear$stats.warn: Whether any quaqc parameters prevented it from accurately collecting some data. -
$nuclear$addn.stats: Genome coverage and the number of alignments without a MAPQ score. -
$nuclear$histograms: Raw histogram data for alignment size, fragment size, GC percent, and read depth. -
$nuclear$peaks: The number of peaks, the fraction of the effective genome covered by them, and the FRIP score. -
$nuclear$tss: The read pileup around TSSs as well as the TSS enrichment score.
Note that the word 'effective' refers to reads which are visible to quaqc within target regions or outside blacklisted regions, as well as reads associated with any specified target read groups.
Value
A quaqc-class object.
Author(s)
Benjamin Jean-Marie Tremblay, benjmtremblay@gmail.com
See Also
Examples
report.file <- system.file("extdata", "report.json.gz", package = "quaqcr")
## Option 1: parse a report already read into R
f <- gzfile(report.file, "rt")
json <- jsonlite::fromJSON(readLines(f), simplifyDataFrame = FALSE)
close(f)
report <- parse_quaqc(json)
## Option 2: parse a report directly from a file
report <- parse_quaqc_file(report.file)
Generate read pileups from BAMs with quaqc.
Description
quaqc maintains an internal TSS pileup in order to calcualte a TSS enrichment score. This function takes advantage of this feature to instead generate read pileups for arbitrary sets of regions.
Usage
pileup(target.regions, bam.files, bkg.regions = NULL, normalize = c("rpm",
"bkg", "no"), region.size = 5001, qlen = 0, verbose = 0, ...)
Arguments
target.regions |
Either (1) a filename of a BED file containing
target region positions, or (2) a |
bam.files |
Character vector of BAM file names. Must be coordinate
sorted. If no index file can be found, |
bkg.regions |
Either (1) a filename of a BED file containing
target region positions, or (2) a |
normalize |
"bkg": Converts read density into values relative to the background (the first 25% of the window). "rpm": Conver to reads per million. "no": Return as the average number of reads per window. |
region.size |
The input regions will be uniformly resized to a single size. |
qlen |
The size of the reads when they are included in the
pileup. A |
verbose |
Integer, a value from 0 to 2 for the level of program verbosity. |
... |
See |
Value
A data.frame containing read pileup data.
Author(s)
Benjamin Jean-Marie Tremblay, benjmtremblay@gmail.com
See Also
Examples
bam <- "Sample.bam"
if (nzchar(Sys.which("quaqc")) && file.exists(bam)) {
peaks <- system.file("extdata", "peaks.bed.gz", package = "quaqcr")
pileup(peaks, bam)
}
Run quaqc from within R.
Description
Interactive wrapper for running quaqc from within R. For a detailed
description of the program, see the manual: execute man quaqc from the
command line or open doc/quaqc.1.md in the program folder. For a brief
description of the command parameters, as well as to see default values,
call quaqc() without any arguments.
Usage
quaqc(bam.files, mitochondria = NULL, plastids = NULL, peaks = NULL,
tss = NULL, target.names = NULL, target.list = NULL,
blacklist = NULL, rg.names = NULL, rg.list = NULL,
use.secondary = FALSE, use.nomate = FALSE, use.dups = FALSE,
use.chimeric = FALSE, use.dovetails = FALSE, no.se = FALSE,
mapq = NULL, min.qlen = NULL, min.flen = NULL, max.qlen = NULL,
max.flen = NULL, use.all = FALSE, max.depth = NULL, max.qhist = NULL,
max.fhist = NULL, tss.size = NULL, tss.qlen = NULL, tss.tn5 = FALSE,
omit.gc = FALSE, omit.depth = FALSE, fast = FALSE, lenient = FALSE,
strict = FALSE, nfr = FALSE, nbr = FALSE, footprint = FALSE,
chip = FALSE, output.dir = NULL, output.ext = NULL, no.output = TRUE,
json = "-", keep = FALSE, keep.dir = NULL, keep.ext = NULL,
threads = NULL, title = NULL, continue = FALSE, verbose = 1,
timeout = 0, env = character(), stderr.file = "",
bin = getOption("quaqc.bin"))
Arguments
bam.files |
Character vector of BAM file names. Must be coordinate
sorted. If no index file can be found, |
mitochondria |
Character vector of mitochondria names. Provide |
plastids |
Character vector of plastid names. Provide |
peaks |
Either (1) a filename of a BED file containing peaks, or (2) a
|
tss |
Either (1) a filename of a BED file containing TSSs, or (2) a
|
target.names |
Character vector of sequence names to restrict |
target.list |
Either (1) a filename of a BED file containing ranges
to restrict |
blacklist |
Either (1) a filename of a BED file containing blacklist
ranges, or (2) a |
rg.names |
A character vector of read group (RG) names to restrict
|
rg.list |
Filename of a text file containing read group (RG) names to
restrict |
use.secondary |
Logical, allow secondary alignments. |
use.nomate |
Logical, allow PE reads when the mate does not align properly. |
use.dups |
Logical, allow duplicate reads. |
use.chimeric |
Logical, allow supplemental or chimeric alignments. |
use.dovetails |
Logical, allow dovetailing PE reads. |
no.se |
Logical, discard SE reads. |
mapq |
Integer, min MAPQ score. |
min.qlen |
Integer, min alignment length. |
min.flen |
Integer, min fragment length. |
max.qlen |
Integer, max alignment length. |
max.flen |
Integer, max fragment length. |
use.all |
Logical, discard all filters and keep all reads. |
max.depth |
Integer, max base depth for read depth histogram. |
max.qhist |
Integer, max alignment length for histogram. |
max.fhist |
Integer, max fragment length for histogram. |
tss.size |
Integer, size of the TSS region for pileup. |
tss.qlen |
Integer, resize reads (centered on the 5-prime end for pileup. |
tss.tn5 |
Logical, shift 5-prime end coordinates +4/-5 bases for pileup. |
omit.gc |
Logical, omit calculation of read GC content. |
omit.depth |
Logical, omit calculation of read depths. |
fast |
Logical, turn on fast mode. |
lenient |
Logical, turn on lenient mode. |
strict |
Logical, turn on strict mode. |
nfr |
Logical, turn on NFR mode. |
nbr |
Logical, turn on NBR mode. |
footprint |
Logical, turn on footprinting mode. |
chip |
Logical, turn on ChIP-seq mode. |
output.dir |
Name of directory to save QC report if not that of input. |
output.ext |
Filename extension for output files. |
no.output |
Logical, suppress creation of output QC reports. Note that option is turned on by default when run from quaqcr. |
json |
Filename of JSON file to save combined QC results to. Set to
|
keep |
Logical, save passing nuclear reads to a new BAM file. |
keep.dir |
Directory name to save filtered BAMs. |
keep.ext |
Extension of filtered BAMs. |
threads |
Integer, number of worker threads. Max one per sample. |
title |
Assign a title to run. |
continue |
Logical, do not return an error and instead continue running if samples trigger program errors. |
verbose |
Integer, a value from 0 to 2 for the level of program verbosity. |
timeout |
Integer, number of seconds before stopping quaqc. By default
it is allowed to run indefinitely. See |
env |
A character vector of environment variables to set when
running quaqc. See |
stderr.file |
Filename to save quaqc messages. By default they are printed in the console. |
bin |
Path to quaqc binary. Alternatively, set |
Value
If nothing is provided to bam.files, then the help message is
printed to the console and returned as a character vector, invisibly.
Alternatively: if json = NULL then NULL, otherwise the JSON
output as parsed by jsonlite.
Author(s)
Benjamin Jean-Marie Tremblay, benjmtremblay@gmail.com
See Also
base::system2(), parse_quaqc(), parse_quaqc_file()
Examples
if (nzchar(Sys.which("quaqc"))) {
## To check that you are properly linking to the binary and view help:
quaqc()
}