% Generated by roxygen2: do not edit by hand
% Please edit documentation in R/RcppExports.R, R/S3_FowlkesMallowsIndex.R
\name{fmi.factor}
\alias{fmi.factor}
\alias{fmi.cmatrix}
\alias{fmi}
\title{Fowlkes-Mallows Index}
\usage{
\method{fmi}{factor}(actual, predicted, ...)

\method{fmi}{cmatrix}(x, ...)

## Generic S3 method
fmi(...)
}
\arguments{
\item{actual}{A vector of <\link{factor}> with \link{length} \eqn{n}, and \eqn{k} levels}

\item{predicted}{A vector of <\link{factor}> with \link{length} \eqn{n}, and \eqn{k} levels}

\item{...}{Arguments passed into other methods}

\item{x}{A confusion matrix created by \code{\link[=cmatrix]{cmatrix()}}}
}
\value{
A <\link{numeric}>-vector of \link{length} 1
}
\description{
The \code{\link[=fmi]{fmi()}}-function computes the \href{https://en.wikipedia.org/wiki/Fowlkes\%E2\%80\%93Mallows_index}{Fowlkes-Mallows Index} (FMI), a measure of the similarity between two clusterings,
from two vectors of observed and predicted \code{\link[=factor]{factor()}} values.
}
\section{Definition}{


For each class \eqn{k}, the metric is the geometric mean of precision and recall,

\deqn{
  \sqrt{\frac{\#TP_k}{\#TP_k + \#FP_k} \times \frac{\#TP_k}{\#TP_k + \#FN_k}}
}

Here, \eqn{\#TP_k}, \eqn{\#FP_k}, and \eqn{\#FN_k} denote the number of true positives, false positives, and false negatives for class \eqn{k}, respectively.
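
The formula can be evaluated by hand in base R. The sketch below uses made-up counts for three classes, and the names \code{tp}, \code{fp}, \code{fn} and \code{fmi_by_class} are illustrative only. It shows the arithmetic of the definition and is not a description of how \code{\link[=fmi]{fmi()}} is implemented.

\if{html}{\out{<div class="sourceCode r">}}\preformatted{## hypothetical per-class counts
## for the classes A, B and C
tp <- c(A = 4, B = 6, C = 9)
fp <- c(A = 12, B = 2, C = 0)
fn <- c(A = 0, B = 2, C = 7)

## precision and recall
## for each class
precision <- tp / (tp + fp)
recall    <- tp / (tp + fn)

## geometric mean of precision
## and recall for each class
fmi_by_class <- sqrt(precision * recall)
fmi_by_class
#>    A    B    C
#> 0.50 0.75 0.75
}\if{html}{\out{</div>}}

Class \code{A} has perfect recall but a precision of only 0.25, so its value is pulled down to 0.5.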
}

\section{Creating <\link{factor}>}{


Consider a classification problem with three classes: \code{A}, \code{B}, and \code{C}. The actual vector of \code{\link[=factor]{factor()}} values is defined as follows:

\if{html}{\out{<div class="sourceCode r">}}\preformatted{## set seed
set.seed(1903)

## actual
factor(
  x = sample(x = 1:3, size = 10, replace = TRUE),
  levels = c(1, 2, 3),
  labels = c("A", "B", "C")
)
#>  [1] B A B B A C B C C A
#> Levels: A B C
}\if{html}{\out{</div>}}

Here, the values 1, 2, and 3 are mapped to \code{A}, \code{B}, and \code{C}, respectively. Now, suppose your model does not predict any \code{B}'s. The predicted vector of \code{\link[=factor]{factor()}} values would be defined as follows:

\if{html}{\out{<div class="sourceCode r">}}\preformatted{## set seed
set.seed(1903)

## predicted
factor(
  x = sample(x = c(1, 3), size = 10, replace = TRUE),
  levels = c(1, 2, 3),
  labels = c("A", "B", "C")
)
#>  [1] C A C C C C C C A C
#> Levels: A B C
}\if{html}{\out{</div>}}

In both cases, \eqn{k = 3}, as determined by the \code{levels} argument rather than by the values that actually occur.
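
The effect of the unused level can be checked in base R. The sketch below writes out the two vectors printed above literally, so it does not depend on the random number generator, and cross-tabulates them with \code{table()}. The use of \code{table()} is an illustration only and is not claimed to be how \code{\link[=cmatrix]{cmatrix()}} builds its confusion matrix.

\if{html}{\out{<div class="sourceCode r">}}\preformatted{## actual and predicted values
## written out literally
actual <- factor(
  x = c("B", "A", "B", "B", "A", "C", "B", "C", "C", "A"),
  levels = c("A", "B", "C")
)
predicted <- factor(
  x = c("C", "A", "C", "C", "C", "C", "C", "C", "A", "C"),
  levels = c("A", "B", "C")
)

## both factors keep all three levels,
## although B is never predicted
nlevels(actual)
#> [1] 3
nlevels(predicted)
#> [1] 3

## the cross-tabulation is 3 x 3
## with an all-zero column for B
table(actual, predicted)
#>       predicted
#> actual A B C
#>      A 1 0 2
#>      B 0 0 4
#>      C 1 0 2
}\if{html}{\out{</div>}}

Because \code{B} is never predicted, \eqn{\#TP_B + \#FP_B = 0}, so the precision term of the per-class formula is \eqn{0/0} for that class.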
}

\examples{
# 1) recode Iris
# to binary classification
# problem
iris$species_num <- as.numeric(
  iris$Species == "virginica"
)

# 2) fit the logistic
# regression
model <- glm(
  formula = species_num ~ Sepal.Length + Sepal.Width,
  data    = iris,
  family  = binomial(
    link = "logit"
  )
)

# 3) generate predicted
# classes
predicted <- factor(
  as.numeric(
    predict(model, type = "response") > 0.5
  ),
  levels = c(1,0),
  labels = c("Virginica", "Others")
)

# 3.1) generate actual
# classes
actual <- factor(
  x = iris$species_num,
  levels = c(1,0),
  labels = c("Virginica", "Others")
)

# 4) evaluate model performance
# using the Fowlkes-Mallows Index
cat(
  "Fowlkes-Mallows Index",
  fmi(
    actual    = actual,
    predicted = predicted
  ),
  sep = "\n"
)
}
\seealso{
Other Classification: 
\code{\link{ROC.factor}()},
\code{\link{accuracy.factor}()},
\code{\link{baccuracy.factor}()},
\code{\link{ckappa.factor}()},
\code{\link{cmatrix.factor}()},
\code{\link{dor.factor}()},
\code{\link{entropy.matrix}()},
\code{\link{fbeta.factor}()},
\code{\link{fdr.factor}()},
\code{\link{fer.factor}()},
\code{\link{fpr.factor}()},
\code{\link{jaccard.factor}()},
\code{\link{logloss.factor}()},
\code{\link{mcc.factor}()},
\code{\link{nlr.factor}()},
\code{\link{npv.factor}()},
\code{\link{plr.factor}()},
\code{\link{pr.auc.matrix}()},
\code{\link{prROC.factor}()},
\code{\link{precision.factor}()},
\code{\link{recall.factor}()},
\code{\link{roc.auc.matrix}()},
\code{\link{specificity.factor}()},
\code{\link{zerooneloss.factor}()}
}
\concept{Classification}
\concept{Unsupervised Learning}
