Help for package quaqcr

Title:

Quick ATAC-Seq QC

Version:

1.0.4

Description:

A wrapper around the 'quaqc' program described in Tremblay and Questa (2024) <doi:10.1093/bioinformatics/btae649>. 'quaqc' allows for assay for transposase-accessible chromatin using sequencing (ATAC-seq) specific quality control and read filtering of next-generation sequencing (NGS) data with minimal processing time and extremely low memory overhead. Any number of samples can be processed, using multiple threads if desired. 'quaqc' outputs a comprehensive set of aligned read metrics, including alignment size, fragment size, percent duplicates, mapq scores, read depth, GC content, and others. Although designed for ATAC-seq data, 'quaqc' can also be used for other unspliced DNA sequencing experiments (such as chromatin immunoprecipitation sequencing, or ChIP-seq) as many of the metrics are related to general sequencing quality. This R package also provides additional utilities for custom analyses and plotting of 'quaqc' results.

URL:

https://github.com/bjmt/quaqcr

BugReports:

https://github.com/bjmt/quaqcr/issues

License:

GPL (≥ 3)

Encoding:

UTF-8

Depends:

R (≥ 3.6.0)

Imports:

methods, utils, jsonlite

RoxygenNote:

7.3.3

Suggests:

testthat (≥ 3.0.0)

Config/testthat/edition:

SystemRequirements:

quaqc (https://github.com/bjmt/quaqc)

NeedsCompilation:

Packaged:

2026-05-15 10:04:43 UTC; gok24jef

Author:

Benjamin Jean-Marie Tremblay

[aut, cre]

Maintainer:

Benjamin Jean-Marie Tremblay <benjmtremblay@gmail.com>

Repository:

CRAN

Date/Publication:

2026-05-19 09:30:02 UTC

quaqcr: Quick ATAC-seq QC in R

Description

A wrapper around the 'quaqc' program alongside additional utilities.

Author(s)

Maintainer: Benjamin Jean-Marie Tremblay benjmtremblay@gmail.com (ORCID)

Get transcription factor footprints with quaqc.

Description

The TSS pileup feature of quaqc can instead be used to get single base resolution transcription factor footprints from ATAC-seq data. This function provides a convenient wrapper around such functionality. Target regions can be compared to unbound or background regions. (Note that no Tn5 bias correction is applied.)

Usage

footprint(target.motifs, bam.files, bkg.motifs = NULL, normalize = c("bkg",
  "rpm", "no"), tss.size = 501, tss.qlen = 1, tss.tn5 = TRUE,
  nfr = TRUE, verbose = 0, ...)

Arguments

target.motifs

Either (1) a filename of a BED file containing target motif positions, or (2) a GRanges object from the GenomicRanges package.

bam.files

Character vector of BAM file names. Must be coordinate sorted. If no index file can be found for a BAM file, quaqc will generate one.

bkg.motifs

Either (1) a filename of a BED file containing background motif positions, or (2) a GRanges object from the GenomicRanges package. (Optional.)

normalize

"bkg": Convert read density into values relative to the background (the first 25% of the window). "rpm": Convert to reads per million. "no": Return as the average number of reads per window.

tss.size

Integer, size of the TSS region for pileup.

tss.qlen

Integer, resize reads (centered on the 5-prime end) for pileup.

tss.tn5

Logical, shift 5-prime end coordinates +4/-5 bases for pileup.

nfr

Logical, turn on NFR mode.

verbose

Integer, a value from 0 to 2 for the level of program verbosity.

...

See quaqc().
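
The tss.tn5 shift can be pictured with a small base R sketch. It assumes the common ATAC-seq convention that forward-strand 5-prime ends move +4 bases and reverse-strand 5-prime ends move -5 bases; the argument description only states "+4/-5". The helper shift_tn5() and the read coordinates are hypothetical and are not part of quaqcr.

```r
# Hypothetical helper: shift read 5-prime ends to approximate the Tn5 cut site.
# Assumption: "+" strand reads shift by +4, "-" strand reads shift by -5.
shift_tn5 <- function(five_prime, strand) {
  stopifnot(length(five_prime) == length(strand),
            all(strand %in% c("+", "-")))
  five_prime + ifelse(strand == "+", 4L, -5L)
}

# Made-up 5-prime end coordinates for three reads.
reads <- data.frame(five_prime = c(100L, 250L, 400L),
                    strand = c("+", "-", "+"),
                    stringsAsFactors = FALSE)
reads$cut_site <- shift_tn5(reads$five_prime, reads$strand)
reads
```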

Value

A data.frame containing read pileup data.

Author(s)

Benjamin Jean-Marie Tremblay, benjmtremblay@gmail.com

Examples


bam <- "Sample.bam"
if (nzchar(Sys.which("quaqc")) && file.exists(bam)) {
  TATA_peaks <- system.file("extdata", "tata_p.bed.gz", package = "quaqcr")
  TATA_bkg <- system.file("extdata", "tata_n.bed.gz", package = "quaqcr")
  footprint(TATA_peaks, bam, bkg.motifs = TATA_bkg)
}

Melt sections of a quaqc report into a data.frame.

Description

The quaqc report class type in R is divided into lists of lists, which can require additional manipulation. This function will "melt" these individual sections into data.frame objects.

Usage

melt_reports(report, section = c("bam_stats", "overview_unfilt",
  "overview_filt", "nucl_stats", "nucl_addn", "peak_stats", "tss_stats",
  "tss_pileup", "aln_hist", "frag_hist", "gc_hist", "depth_hist", "genome"),
  use.basename = FALSE, normalize.tss = c("no", "bkg", "rpm"),
  normalize.hist = c("no", "proportion", "max"))

Arguments

report

A quaqc object.

section

The quaqc object subsection to melt.

use.basename

Whether to use the base::basename() function on the sample names.

normalize.tss

How to normalize the TSS pileup. "no": Keep the signal as the average number of reads per window. "bkg": Calculate the signal relative to the background (the first 25% of the window). "rpm": Convert to reads per million.

normalize.hist

How to normalize the alignment size, fragment size, GC percent, and read depth histograms. "no": Keep as the total number of reads per bin. "proportion": Divide by the sum of reads across all windows. "max": Divide by the max bin count.
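
As a rough illustration of the normalize.hist options and the "bkg" option of normalize.tss, the base R sketch below applies them to made-up vectors. It assumes the background level is the mean signal over the first 25% of positions (the help text only says "the first 25% of the window"). The "rpm" option is left out because it needs the total read count. Both helper names are hypothetical and may not match what quaqcr does internally.

```r
# Hypothetical helpers mirroring the documented options; not quaqcr internals.
normalize_hist <- function(counts, method = c("no", "proportion", "max")) {
  method <- match.arg(method)
  switch(method,
         no = counts,
         proportion = counts / sum(counts),
         max = counts / max(counts))
}

# Assumption: the background level is the mean signal over the first 25%
# of positions in the window.
normalize_bkg <- function(signal) {
  n_bkg <- max(1L, floor(length(signal) * 0.25))
  signal / mean(signal[seq_len(n_bkg)])
}

frag_counts <- c(10, 40, 30, 20)            # made-up histogram bin counts
normalize_hist(frag_counts, "proportion")   # values sum to 1
normalize_hist(frag_counts, "max")          # largest bin becomes 1

tss_signal <- c(2, 2, 2, 2, 4, 8, 4, 2)     # made-up pileup values
normalize_bkg(tss_signal)                   # signal relative to background
```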

Value

A data.frame with varying columns based on the section being melted.

Author(s)

Benjamin Jean-Marie Tremblay, benjmtremblay@gmail.com

Examples

report.file <- system.file("extdata", "report.json.gz", package = "quaqcr")
report <- parse_quaqc_file(report.file)
melt_reports(report, "overview_filt")
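
To show what "melting" a list of lists means, here is a minimal base R sketch on a mock nested list. The field names (frag_size, dup_pct), the values, and the helper melt_list() are invented for illustration; the columns returned by melt_reports() may differ.

```r
# Mock nested list: one entry per sample, each holding named stats.
mock_reports <- list(
  "a.bam" = list(frag_size = 180, dup_pct = 12.5),
  "b.bam" = list(frag_size = 210, dup_pct = 8.0)
)

# Melt into a long data.frame with one row per sample/stat pair.
melt_list <- function(x) {
  rows <- lapply(names(x), function(sample) {
    data.frame(sample = sample,
               stat = names(x[[sample]]),
               value = unlist(x[[sample]], use.names = FALSE),
               stringsAsFactors = FALSE)
  })
  do.call(rbind, rows)
}

melted <- melt_list(mock_reports)
melted
```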

Parse a JSON quaqc report file.

Description

Parse the output of quaqc --json into an easier-to-use quaqc-class object within R.

Usage

parse_quaqc(json.text)

parse_quaqc_file(json.file)

Arguments

json.text

The JSON report as a character vector.

json.file

The JSON report filename.

Details

A quaqc object is a higher level format encompassing the quaqc run parameters (accessible via $metadata) and the actual individual reports for each sample (accessible via $reports). The reports are themselves quaqc_report-class objects with multiple list slots, including:

$sample: The filename of the sample.
$success: Whether the sample was successfully analyzed.
$params: Values for all quaqc parameters used to analyze this sample.
$genome: Data about the genome taken from the BAM header.
$unfiltered: Basic stats about the total number of reads before filtering.
$filtered: Contains the majority of the data output by quaqc.

This final $filtered slot itself is broken down into several sub-lists:

$overview: Average values for several stats such as fragment size.
$nuclear$stats: Further breakdown of the previous stats for nuclear reads.
$nuclear$stats.warn: Whether any quaqc parameters prevented it from accurately collecting some data.
$nuclear$addn.stats: Genome coverage and the number of alignments without a MAPQ score.
$nuclear$histograms: Raw histogram data for alignment size, fragment size, GC percent, and read depth.
$nuclear$peaks: The number of peaks, the fraction of the effective genome covered by them, and the FRIP score.
$nuclear$tss: The read pileup around TSSs as well as the TSS enrichment score.

Note that the word 'effective' refers to reads which are visible to quaqc within target regions or outside blacklisted regions, as well as reads associated with any specified target read groups.
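
The nesting described above can be explored with ordinary list indexing. The sketch below builds a mock object with the same slot names so it runs without quaqc. All values are placeholders, the title entry under metadata is invented, and indexing the reports by position is an assumption about how the list is laid out.

```r
# Mock object imitating the documented layout; all values are placeholders.
mock <- list(
  metadata = list(title = "example run"),
  reports = list(
    list(
      sample = "a.bam",
      success = TRUE,
      filtered = list(
        overview = list(),
        nuclear = list(stats = list(), histograms = list(),
                       peaks = list(), tss = list())
      )
    )
  )
)

# Walk down to the nuclear sub-lists of the first sample.
first <- mock$reports[[1]]
first$sample
names(first$filtered$nuclear)

# Keep only the samples that were analyzed successfully.
ok <- Filter(function(r) isTRUE(r$success), mock$reports)
length(ok)
```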

Value

A quaqc-class object.

Author(s)

Benjamin Jean-Marie Tremblay, benjmtremblay@gmail.com

Examples

report.file <- system.file("extdata", "report.json.gz", package = "quaqcr")

## Option 1: parse a report already read into R
f <- gzfile(report.file, "rt")
json <- jsonlite::fromJSON(readLines(f), simplifyDataFrame = FALSE)
close(f)
report <- parse_quaqc(json)

## Option 2: parse a report directly from a file
report <- parse_quaqc_file(report.file)

Generate read pileups from BAMs with quaqc.

Description

quaqc maintains an internal TSS pileup in order to calculate a TSS enrichment score. This function takes advantage of this feature to instead generate read pileups for arbitrary sets of regions.

Usage

pileup(target.regions, bam.files, bkg.regions = NULL, normalize = c("rpm",
  "bkg", "no"), region.size = 5001, qlen = 0, verbose = 0, ...)

Arguments

target.regions

Either (1) a filename of a BED file containing target region positions, or (2) a GRanges object from the GenomicRanges package.

bam.files

Character vector of BAM file names. Must be coordinate sorted. If no index file can be found for a BAM file, quaqc will generate one.

bkg.regions

Either (1) a filename of a BED file containing background region positions, or (2) a GRanges object from the GenomicRanges package. (Optional.)

normalize

"bkg": Convert read density into values relative to the background (the first 25% of the window). "rpm": Convert to reads per million. "no": Return as the average number of reads per window.

region.size

The input regions will be uniformly resized to a single size.

qlen

The size of the reads when they are included in the pileup. A qlen of 0 means preserving the original read sizes; otherwise the reads are resized from their 5-prime ends.

verbose

Integer, a value from 0 to 2 for the level of program verbosity.

...

See quaqc().
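
The effect of region.size and qlen can be sketched in base R on made-up coordinates. This is only an illustration under stated assumptions: regions are resized around their midpoint (the help text does not say how they are anchored), and reverse-strand reads have their 5-prime end at the read end coordinate. The helper names are hypothetical and quaqc's own implementation may differ.

```r
# Resize a region to a fixed odd width around its midpoint (assumed behaviour).
resize_region <- function(start, end, size) {
  mid <- (start + end) %/% 2L
  half <- (size - 1L) %/% 2L
  c(start = mid - half, end = mid + half)
}

# Resize reads from their 5-prime ends; qlen = 0 keeps the original reads.
resize_reads <- function(reads, qlen) {
  if (qlen == 0L) return(reads)
  fwd <- reads$strand == "+"
  reads$end[fwd] <- reads$start[fwd] + qlen - 1L
  reads$start[!fwd] <- reads$end[!fwd] - qlen + 1L
  reads
}

# Count how many reads cover each position of the region.
pileup_region <- function(region, reads) {
  pos <- seq(region[["start"]], region[["end"]])
  vapply(pos, function(p) sum(reads$start <= p & reads$end >= p), numeric(1))
}

region <- resize_region(start = 100L, end = 110L, size = 5L)  # covers 103..107
reads <- data.frame(start = c(101L, 104L, 106L),
                    end = c(105L, 108L, 110L),
                    strand = c("+", "+", "-"),
                    stringsAsFactors = FALSE)

whole <- pileup_region(region, reads)                     # original read sizes
ends_only <- pileup_region(region, resize_reads(reads, 1L))  # 5-prime ends only
rbind(whole, ends_only)
```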

Value

A data.frame containing read pileup data.

Author(s)

Benjamin Jean-Marie Tremblay, benjmtremblay@gmail.com

Examples


bam <- "Sample.bam"
if (nzchar(Sys.which("quaqc")) && file.exists(bam)) {
  peaks <- system.file("extdata", "peaks.bed.gz", package = "quaqcr")
  pileup(peaks, bam)
}

Run quaqc from within R.

Description

Interactive wrapper for running quaqc from within R. For a detailed description of the program, see the manual: execute man quaqc from the command line or open doc/quaqc.1.md in the program folder. For a brief description of the command parameters, as well as to see default values, call quaqc() without any arguments.

Usage

quaqc(bam.files, mitochondria = NULL, plastids = NULL, peaks = NULL,
  tss = NULL, target.names = NULL, target.list = NULL,
  blacklist = NULL, rg.names = NULL, rg.list = NULL,
  use.secondary = FALSE, use.nomate = FALSE, use.dups = FALSE,
  use.chimeric = FALSE, use.dovetails = FALSE, no.se = FALSE,
  mapq = NULL, min.qlen = NULL, min.flen = NULL, max.qlen = NULL,
  max.flen = NULL, use.all = FALSE, max.depth = NULL, max.qhist = NULL,
  max.fhist = NULL, tss.size = NULL, tss.qlen = NULL, tss.tn5 = FALSE,
  omit.gc = FALSE, omit.depth = FALSE, fast = FALSE, lenient = FALSE,
  strict = FALSE, nfr = FALSE, nbr = FALSE, footprint = FALSE,
  chip = FALSE, output.dir = NULL, output.ext = NULL, no.output = TRUE,
  json = "-", keep = FALSE, keep.dir = NULL, keep.ext = NULL,
  threads = NULL, title = NULL, continue = FALSE, verbose = 1,
  timeout = 0, env = character(), stderr.file = "",
  bin = getOption("quaqc.bin"))

Arguments

bam.files

Character vector of BAM file names. Must be coordinate sorted. If no index file can be found for a BAM file, quaqc will generate one.

mitochondria

Character vector of mitochondria names. Provide "" to clear the defaults.

plastids

Character vector of plastid names. Provide "" to clear the defaults.

peaks

Either (1) a filename of a BED file containing peaks, or (2) a GRanges object from the GenomicRanges package.

tss

Either (1) a filename of a BED file containing TSSs, or (2) a GRanges object from the GenomicRanges package.

target.names

Character vector of sequence names to restrict quaqc.

target.list

Either (1) a filename of a BED file containing ranges to restrict quaqc, or (2) a GRanges object from the GenomicRanges package.

blacklist

Either (1) a filename of a BED file containing blacklist ranges, or (2) a GRanges object from the GenomicRanges package.

rg.names

A character vector of read group (RG) names to restrict quaqc.

rg.list

Filename of a text file containing read group (RG) names to restrict quaqc, one name per line.

use.secondary

Logical, allow secondary alignments.

use.nomate

Logical, allow PE reads when the mate does not align properly.

use.dups

Logical, allow duplicate reads.

use.chimeric

Logical, allow supplemental or chimeric alignments.

use.dovetails

Logical, allow dovetailing PE reads.

no.se

Logical, discard SE reads.

mapq

Integer, min MAPQ score.

min.qlen

Integer, min alignment length.

min.flen

Integer, min fragment length.

max.qlen

Integer, max alignment length.

max.flen

Integer, max fragment length.

use.all

Logical, discard all filters and keep all reads.

max.depth

Integer, max base depth for read depth histogram.

max.qhist

Integer, max alignment length for histogram.

max.fhist

Integer, max fragment length for histogram.

tss.size

Integer, size of the TSS region for pileup.

tss.qlen

Integer, resize reads (centered on the 5-prime end) for pileup.

tss.tn5

Logical, shift 5-prime end coordinates +4/-5 bases for pileup.

omit.gc

Logical, omit calculation of read GC content.

omit.depth

Logical, omit calculation of read depths.

fast

Logical, turn on fast mode.

lenient

Logical, turn on lenient mode.

strict

Logical, turn on strict mode.

nfr

Logical, turn on NFR mode.

nbr

Logical, turn on NBR mode.

footprint

Logical, turn on footprinting mode.

chip

Logical, turn on ChIP-seq mode.

output.dir

Name of directory to save QC report if not that of input.

output.ext

Filename extension for output files.

no.output

Logical, suppress creation of output QC reports. Note that this option is turned on by default when run from quaqcr.

json

Filename of JSON file to save combined QC results to. Set to NULL to suppress this. The default is to pipe the JSON output directly to R and not save to a file.

keep

Logical, save passing nuclear reads to a new BAM file.

keep.dir

Directory name to save filtered BAMs.

keep.ext

Extension of filtered BAMs.

threads

Integer, number of worker threads. Max one per sample.

title

Assign a title to run.

continue

Logical, do not return an error and instead continue running if samples trigger program errors.

verbose

Integer, a value from 0 to 2 for the level of program verbosity.

timeout

Integer, number of seconds before stopping quaqc. By default it is allowed to run indefinitely. See base::system2().

env

A character vector of environment variables to set when running quaqc. See base::system2().

stderr.file

Filename to save quaqc messages. By default they are printed in the console.

bin

Path to quaqc binary. Alternatively, set options(quaqc.bin). If the binary is present in the current working directory, provide "./quaqc". The default, set when quaqcr is loaded, is to assume the binary is present in your PATH.
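
One possible way to set the quaqc.bin option before calling quaqc() is sketched below using only base R. It looks for the binary on the PATH and records its location if found. The variable name quaqc_path is illustrative.

```r
# Look for quaqc on the PATH; Sys.which() returns "" when it is not found.
quaqc_path <- unname(Sys.which("quaqc"))

if (nzchar(quaqc_path)) {
  # Record the location so later calls can pick it up via getOption().
  options(quaqc.bin = quaqc_path)
} else {
  message("quaqc not found on PATH; set options(quaqc.bin = ...) manually")
}

getOption("quaqc.bin")
```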

Value

If nothing is provided to bam.files, the help message is printed to the console and returned invisibly as a character vector. Otherwise, NULL if json = NULL, or else the JSON output as parsed by jsonlite.

Author(s)

Benjamin Jean-Marie Tremblay, benjmtremblay@gmail.com

Examples


if (nzchar(Sys.which("quaqc"))) {
  ## To check that you are properly linking to the binary and view help:
  quaqc()
}
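
A fuller run might look like the guarded sketch below, which only executes when quaqcr, the quaqc binary, and the input file are all available. "Sample.bam" is a placeholder file name, the mapq value of 30 is an arbitrary choice, and passing the jsonlite-parsed result to parse_quaqc() is an assumption based on the parse_quaqc() example rather than documented behaviour.

```r
bam <- "Sample.bam"  # placeholder file name

if (requireNamespace("quaqcr", quietly = TRUE) &&
    nzchar(Sys.which("quaqc")) && file.exists(bam)) {
  # Set a minimum MAPQ score of 30 and silence program messages.
  res <- quaqcr::quaqc(bam, mapq = 30, verbose = 0)
  # Assumption: the jsonlite-parsed result can be passed to parse_quaqc().
  report <- quaqcr::parse_quaqc(res)
  print(quaqcr::melt_reports(report, "overview_filt"))
}
```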
