% Generated by roxygen2: do not edit by hand
% Please edit documentation in R/ModeFilter.R
\name{ModeFilter}
\alias{ModeFilter}
\alias{ModeFilter.default}
\alias{ModeFilter.formula}
\title{Mode Filter}
\usage{
\method{ModeFilter}{formula}(formula, data, ...)

\method{ModeFilter}{default}(x, type = "classical", noiseAction = "repair",
  epsilon = 0.05, maxIter = 100, alpha = 1, beta = 1,
  classColumn = ncol(x), ...)
}
\arguments{
\item{formula}{A formula describing the classification variable and the attributes to be used.}

\item{data, x}{Data frame containing the training dataset to be filtered.}

\item{...}{Optional parameters to be passed to other methods.}

\item{type}{Character indicating the scheme to be used. It can be 'classical', 'iterative' or 'weighted'.}

\item{noiseAction}{Character indicating what to do with noisy instances. It can be either 'remove' or 'repair'.}

\item{epsilon}{If the 'iterative' type is used, the loop stops when the proportion of modified instances
is less than or equal to this threshold.}

\item{maxIter}{Maximum number of iterations for the 'iterative' type.}

\item{alpha}{Parameter used in the computation of the similarity between two instances.}

\item{beta}{It regulates the influence of the similarity metric in the estimation
of a new label for an instance.}

\item{classColumn}{Positive integer indicating the column that contains the
class labels (a factor). By default, the last column is used.}
}
\value{
An object of class \code{filter}, which is a list with seven components:
\itemize{
   \item \code{cleanData} is a data frame containing the filtered dataset.
   \item \code{remIdx} is a vector of integers giving the indices of the
   removed instances (i.e. their row numbers in the original data frame).
   \item \code{repIdx} is a vector of integers giving the indices of the
   repaired/relabelled instances (i.e. their row numbers in the original data frame).
   \item \code{repLab} is a factor containing the new labels for repaired instances.
   \item \code{parameters} is a list containing the argument values.
   \item \code{call} contains the original call to the filter.
   \item \code{extraInf} is a character vector with additional
   information not covered by the previous items.
}
}
\description{
Similarity-based filter for removing or repairing label noise in a dataset as a
preprocessing step for classification. For more information, see the 'Details' and
'References' sections.
}
\details{
\code{ModeFilter} estimates the most appropriate class for each instance based on a similarity metric
and the provided label. Three schemes are available (argument \code{type}):

In the classical approach, all labels are tried for all instances, and the one maximizing a
similarity-based metric is chosen. In the iterative approach, the same scheme is repeated until the proportion
of modified instances is less than or equal to \emph{epsilon} or the maximum number of iterations \emph{maxIter}
is reached. The weighted approach extends the classical one by assigning each instance a weight that
quantifies the reliability of its label. This weight is used in the computation of the metric to be
maximized.
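
As a rough illustration of the iterative scheme, the sketch below repeats a relabelling pass
until the proportion of changed labels is at most \code{epsilon} or \code{maxIter} passes have been made.
It is not the implementation used by \code{ModeFilter}: \code{relabelOnce} and \code{iterateRelabel}
are hypothetical helpers, the similarity \code{exp(-alpha * d^2)} (with \code{d} the Euclidean distance)
is an assumption, and the influence of the provided label (argument \code{beta}) is ignored.
\preformatted{
# Hypothetical helper (not part of the package): one relabelling pass.
# Assumption: similarity between instances is exp(-alpha * d^2).
relabelOnce <- function(X, y, alpha = 1) {
  S <- exp(-alpha * as.matrix(dist(X))^2)  # pairwise similarities
  diag(S) <- 0                             # ignore self-similarity
  classes <- levels(y)
  # score[i, k]: total similarity of instance i to instances labelled k
  score <- sapply(classes, function(k) rowSums(S[, y == k, drop = FALSE]))
  # pick, for each instance, the class with the highest score
  factor(classes[max.col(score, ties.method = "first")], levels = classes)
}

# Hypothetical helper: repeat the pass until few labels change.
iterateRelabel <- function(X, y, epsilon = 0.05, maxIter = 100, alpha = 1) {
  for (iter in seq_len(maxIter)) {
    yNew <- relabelOnce(X, y, alpha)
    changed <- mean(yNew != y)   # proportion of modified instances
    y <- yNew
    if (changed <= epsilon) break
  }
  y
}

# Usage on iris with three labels deliberately corrupted
y <- iris$Species
y[c(1, 60, 120)] <- c("virginica", "setosa", "versicolor")
yClean <- iterateRelabel(scale(iris[, 1:4]), y)
table(original = iris$Species, relabelled = yClean)
}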
}
\examples{
# The following example is not run because it can be rather slow in some cases
\dontrun{
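   data(iris)
   # Filter iris with the classical scheme, dropping instances flagged as noisy
   out <- ModeFilter(Species ~ ., data = iris, type = "classical", noiseAction = "remove")
   print(out)
   # The cleaned data should be the original rows minus the removed ones
   identical(out$cleanData, iris[setdiff(seq_len(nrow(iris)), out$remIdx), ])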
}
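
# A further sketch, also not run: the iterative scheme with noiseAction = "repair".
# Only arguments and components documented above are used; that no rows are
# dropped when repairing is an assumption, not a documented guarantee.
\dontrun{
   data(iris)
   out2 <- ModeFilter(Species ~ ., data = iris, type = "iterative",
                      noiseAction = "repair", epsilon = 0.05, maxIter = 100)
   # Row numbers (in iris) of the relabelled instances and their new labels
   out2$repIdx
   out2$repLab
   # Compare the old and new labels of the repaired rows
   data.frame(old = iris$Species[out2$repIdx], new = out2$repLab)
}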
}
\references{
Du W., Urahama K. (2010, November): Error-correcting semi-supervised pattern
recognition with mode filter on graphs.
In \emph{Aware Computing (ISAC), 2010 2nd International Symposium on} (pp. 6-11). IEEE.
}

