% Generated by roxygen2: do not edit by hand
% Please edit documentation in R/get_data_extracts.R
\name{get_data_extracts}
\alias{get_data_extracts}
\title{Collect data extracts from a validation step}
\usage{
get_data_extracts(agent, i = NULL)
}
\arguments{
\item{agent}{\emph{The pointblank agent object}

\verb{obj:<ptblank_agent>} // \strong{required}

A \strong{pointblank} \emph{agent} object, typically created with the
\code{\link[=create_agent]{create_agent()}} function. It should already have had \code{\link[=interrogate]{interrogate()}} called on
it, so that the validation steps have been carried out and any rows collected
from non-passing validations are available in the object.}

\item{i}{\emph{A validation step number}

\verb{scalar<integer>} // \emph{default:} \code{NULL} (\code{optional})

The validation step number, which is assigned to each validation step by
\strong{pointblank} in the order of definition. If \code{NULL} (the default), all
data extract tables will be provided in a list object.}
}
\value{
A list of tables if \code{i} is not provided, or a single table if
\code{i} is given.
}
\description{
In an agent-based workflow (i.e., initiating with \code{\link[=create_agent]{create_agent()}}), after
interrogation with \code{\link[=interrogate]{interrogate()}}, we can extract the row data that didn't
pass row-based validation steps with the \code{get_data_extracts()} function.
There is one extract per row-based validation step. The amount of data
available in a particular extract depends on both the fraction of test
units that didn't pass the validation step and the level of sampling or
explicit collection from that set of units. These extracts can be collected
programmatically through \code{get_data_extracts()}, but they may also be
downloaded as CSV files from the HTML report generated by the agent's print
method or through the use of \code{\link[=get_agent_report]{get_agent_report()}}.

The availability of data extracts for each row-based validation step depends
on whether \code{extract_failed} is set to \code{TRUE} within the \code{\link[=interrogate]{interrogate()}} call
(it is by default). The number of \emph{fail} rows extracted depends on the
collection parameters in \code{\link[=interrogate]{interrogate()}}, and the default behavior is to
collect up to the first 5000 \emph{fail} rows.

Row-based validation steps are those created by validation functions of the
form \verb{col_vals_*()}, along with \code{\link[=conjointly]{conjointly()}} and \code{\link[=rows_distinct]{rows_distinct()}}.
Only functions from that combined set of validation functions can yield data
extracts.
}
\section{Examples}{


Create a series of two validation steps focused on testing row values for
part of the \code{small_table} object. Use \code{\link[=interrogate]{interrogate()}} right after that.

\if{html}{\out{<div class="sourceCode r">}}\preformatted{agent <-
  create_agent(
    tbl = small_table \%>\%
      dplyr::select(a:f),
    label = "`get_data_extracts()`"
  ) \%>\%
  col_vals_gt(d, value = 1000) \%>\%
  col_vals_between(
    columns = c,
    left = vars(a), right = vars(d),
    na_pass = TRUE
  ) \%>\%
  interrogate()
}\if{html}{\out{</div>}}

Using \code{get_data_extracts()} with its defaults returns a list of tables,
where each table is named by the number of the validation step that has an
extract available.

\if{html}{\out{<div class="sourceCode r">}}\preformatted{agent \%>\% get_data_extracts()
}\if{html}{\out{</div>}}

\preformatted{## $`1`
## # A tibble: 6 × 6
##       a b             c     d e     f
##   <int> <chr>     <dbl> <dbl> <lgl> <chr>
## 1     8 3-ldm-038     7  284. TRUE  low
## 2     7 1-knw-093     3  843. TRUE  high
## 3     3 5-bce-642     9  838. FALSE high
## 4     3 5-bce-642     9  838. FALSE high
## 5     4 2-dmx-010     7  834. TRUE  low
## 6     2 7-dmx-010     8  108. FALSE low
##
## $`2`
## # A tibble: 4 × 6
##       a b             c     d e     f
##   <int> <chr>     <dbl> <dbl> <lgl> <chr>
## 1     6 8-kdg-938     3 2343. TRUE  high
## 2     8 3-ldm-038     7  284. TRUE  low
## 3     7 1-knw-093     3  843. TRUE  high
## 4     4 5-boe-639     2 1036. FALSE low}
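
Because the list is named by step number, its names show which steps have an
extract available. The following is a minimal sketch that reuses the
\code{agent} object created above and checks for an extract before requesting
it by step number; the variable names \code{extracts} and \code{step} are
only illustrative.

\if{html}{\out{<div class="sourceCode r">}}\preformatted{# List of extract tables, named by validation step number
extracts <- get_data_extracts(agent)

# Step numbers (as character strings) that have an extract
names(extracts)

# Request a single extract only if one exists for that step
step <- 2
if (as.character(step) \%in\% names(extracts)) {
  get_data_extracts(agent, i = step)
}
}\if{html}{\out{</div>}}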



We can get an extract for a specific step by specifying it in the \code{i}
argument. Let's get the failing rows from the first validation step (the
\code{\link[=col_vals_gt]{col_vals_gt()}} one).

\if{html}{\out{<div class="sourceCode r">}}\preformatted{agent \%>\% get_data_extracts(i = 1)
}\if{html}{\out{</div>}}

\preformatted{## # A tibble: 6 × 6
##       a b             c     d e     f
##   <int> <chr>     <dbl> <dbl> <lgl> <chr>
## 1     8 3-ldm-038     7  284. TRUE  low
## 2     7 1-knw-093     3  843. TRUE  high
## 3     3 5-bce-642     9  838. FALSE high
## 4     3 5-bce-642     9  838. FALSE high
## 5     4 2-dmx-010     7  834. TRUE  low
## 6     2 7-dmx-010     8  108. FALSE low}
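
The number of rows in an extract is governed by the collection arguments of
\code{\link[=interrogate]{interrogate()}}. As a small sketch (the object name \code{agent_first_3} and
its label are only illustrative), setting \code{get_first_n = 3} should limit
each extract to at most the first three \emph{fail} rows.

\if{html}{\out{<div class="sourceCode r">}}\preformatted{agent_first_3 <-
  create_agent(
    tbl = small_table \%>\%
      dplyr::select(a:f),
    label = "Limited extracts"
  ) \%>\%
  col_vals_gt(d, value = 1000) \%>\%
  # Collect only the first three failing rows per step
  interrogate(get_first_n = 3)

# The extract for step 1 should hold no more than three rows
agent_first_3 \%>\% get_data_extracts(i = 1)
}\if{html}{\out{</div>}}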
}

\section{Function ID}{

8-2
}

\seealso{
Other Post-interrogation: 
\code{\link{all_passed}()},
\code{\link{get_agent_x_list}()},
\code{\link{get_sundered_data}()},
\code{\link{write_testthat_file}()}
}
\concept{Post-interrogation}
