% Generated by roxygen2: do not edit by hand
% Please edit documentation in R/impute_missing.R
\name{impute_missing}
\alias{impute_missing}
\title{Replace Missing Values}
\usage{
impute_missing(
  data,
  method = list(dplyr::where(is.numeric) ~ "mean", dplyr::where(is.character) ~ "mode",
    dplyr::where(is.factor) ~ "mode"),
  filter_by = NULL,
  drop_all_na = FALSE,
  verbose = TRUE
)
}
\arguments{
\item{data}{A data frame. The dataset in which missing values should be imputed.}

\item{method}{A list of two-sided formulas of the form \verb{<selector> ~ <value>}, where \verb{<selector>} is a tidyselect expression.
Supported \verb{<value>} options are:
\itemize{
\item \code{"mean"}: replace with the column mean (numeric columns only).
\item \code{"median"}: replace with the column median (numeric columns only).
\item \code{"mode"}: replace with the most frequent value (works for numeric, character, or factor).
\item A numeric constant: replace with that constant (numeric columns).
\item A character constant: replace with that value (character/factor columns).
\item A function: \verb{function(col)}, which receives the column and returns a single value used to replace \code{NA}.
}
The default is \code{list(dplyr::where(is.numeric) ~ "mean", dplyr::where(is.character) ~ "mode", dplyr::where(is.factor) ~ "mode")}.}

\item{filter_by}{Character vector of column names. If provided, only rows with non-missing values in \strong{all} of the specified columns are kept (applied \emph{before} imputation).}

\item{drop_all_na}{Logical; if \code{TRUE}, rows where \strong{all} columns are \code{NA} are removed \emph{before} imputation.}

\item{verbose}{Logical; if \code{TRUE} (default), prints a concise final summary of what was imputed. Set to \code{FALSE} to suppress messages.}
}
\value{
A tibble with missing values replaced according to the provided specifications.
}
\description{
Replaces missing values (\code{NA}) in a data frame, column by column, using a chosen method
(mean, median, mode, a constant, or a custom function).
}
\details{
You can remove rows that are entirely \code{NA} before imputation using
\code{drop_all_na}, or filter rows based on specific variables using \code{filter_by}.
\itemize{
\item The \code{method} argument uses \strong{tidyselect} helpers. For example, \code{where(is.numeric) ~ "median"}
imputes all numeric columns by their medians.
\item \code{"mode"} works for numeric, character and factor columns.
\item When imputing factors with a character constant, the constant is added as a new level if needed.
\item A custom function should return a single value; if it returns several, only the first is used (with a warning).
}
}
\examples{
# Use the defaults: numeric columns by their means, character and factor columns by their modes:
impute_missing(icu)

# Impute numeric columns by median:
impute_missing(
  icu,
  method = list(dplyr::where(is.numeric) ~ "median")
)
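
# A sketch of mixing value types in one call, using a small made-up data
# frame (`toy` is illustrative only and is not shipped with the package).
# Assumption: a custom function receives the column with its NAs still
# present, hence the explicit na.rm = TRUE.
toy <- data.frame(
  score = c(10, NA, 30),
  group = c("a", NA, "a"),
  stringsAsFactors = FALSE
)
impute_missing(
  toy,
  method = list(
    dplyr::where(is.numeric) ~ function(col) max(col, na.rm = TRUE),
    dplyr::where(is.character) ~ "unknown"
  )
)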

# Keep only rows where both "vent_mec_no_inv" and "vent_mec" are non-missing:
impute_missing(
  icu,
  filter_by = c("vent_mec_no_inv", "vent_mec")
)
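
# A sketch of the remaining arguments: drop rows that are entirely NA before
# imputing, and suppress the final summary message:
impute_missing(
  icu,
  drop_all_na = TRUE,
  verbose = FALSE
)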
}
