Information-theoretical V-measure for Spatial Association

1. The principle of the information-theoretical V-measure

Let $A$ denote the area of the domain, and consider two different regionalizations of it. To keep the discussion clear, we refer to the first as a regionalization and to the second as a partition. The regionalization $R$ divides the domain into $n$ regions $r_i$, $i = 1, \ldots, n$. The partition $Z$ divides the domain into $m$ zones $z_j$, $j = 1, \ldots, m$. In practice, both $R$ and $Z$ are integer vectors of equal length.

$$h = 1 - \sum_{j=1}^{m} \frac{A_j}{A} \frac{S_j^R}{S^R}$$

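where $S^R = -\sum_{i=1}^{n} \frac{A_i}{A} \log \frac{A_i}{A}$ and $S_j^R = -\sum_{i=1}^{n} \frac{a_{i,j}}{A_j} \log \frac{a_{i,j}}{A_j}$. Here $a_{i,j}$ is the number of elements for which R == i and Z == j, $A_i$ is the number of elements for which R == i, $A_j$ is the number of elements for which Z == j, and $A$ is the total number of elements.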
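The definition of $h$ can be sketched in a few lines of base R. This is only an illustration under stated assumptions, not the implementation used by itmsa: the names `homogeneity` and `entropy` are hypothetical, terms with a zero count are taken to contribute nothing ($0 \log 0 = 0$), and the case $S^R = 0$ (a single region) is assumed to give $h = 1$.

```r
# Hypothetical helper: h of regionalization r with respect to partition z.
# r and z are integer (or factor) vectors of equal length.
homogeneity <- function(r, z) {
  stopifnot(length(r) == length(z), length(r) > 0)

  a   <- table(r, z)   # a[i, j]: number of elements with R == i and Z == j
  A   <- sum(a)        # total number of elements
  A_i <- rowSums(a)    # number of elements in each region i
  A_j <- colSums(a)    # number of elements in each zone j

  # Shannon entropy of a vector of counts; zero counts are dropped,
  # which is the usual 0 * log(0) = 0 convention.
  entropy <- function(counts) {
    p <- counts[counts > 0] / sum(counts)
    -sum(p * log(p))
  }

  S_R <- entropy(A_i)            # S^R
  if (S_R == 0) return(1)        # assumption: a single region gives h = 1
  S_Rj <- apply(a, 2, entropy)   # S_j^R, one value per zone j

  1 - sum(A_j / A * S_Rj / S_R)
}

# Zones reproduce the regions exactly, so h is 1
homogeneity(c(1, 1, 2, 2), c(1, 1, 2, 2))

# Every zone contains an even mix of both regions, so h is 0
homogeneity(c(1, 2, 1, 2), c(1, 1, 2, 2))
```

The base of the logarithm does not matter here, because it cancels in the ratio $S_j^R / S^R$.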

Here $h$ measures homogeneity. Swapping the roles of $R$ and $Z$ in the formula for $h$ gives the completeness $c$. Finally, the V-measure is calculated using the formula below:

$$V_\beta = \frac{(1 + \beta)\, h\, c}{(\beta h) + c}$$
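To see how $\beta$ balances the two terms, the formula can be written as a small R function. This is a minimal sketch, not code from itmsa: the function name `v_beta` and its argument names are hypothetical, and returning 0 when both $h$ and $c$ are 0 is an assumption made to avoid division by zero.

```r
# Hypothetical helper: combine h and c into V_beta.
v_beta <- function(homog, compl, beta = 1) {
  stopifnot(beta >= 0)
  denom <- beta * homog + compl
  if (denom == 0) return(0)   # assumption: V is 0 when h = c = 0
  (1 + beta) * homog * compl / denom
}

v_beta(0.8, 0.4)             # beta = 1: harmonic mean of h and c, about 0.533
v_beta(0.8, 0.4, beta = 2)   # 0.48, pulled toward c = 0.4
v_beta(0.8, 0.4, beta = 0)   # equals h when beta = 0
```

With $\beta = 1$ the expression reduces to the harmonic mean $2hc / (h + c)$. As $\beta$ grows, $V_\beta$ approaches $c$; as $\beta$ approaches 0, it approaches $h$.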

2. Example

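# Install itmsa and gdverse (gdverse provides the NTDs example data)
install.packages("itmsa", dependencies = TRUE)
install.packages("gdverse", dependencies = TRUE)
library(itmsa)
# Load the example data and discretize incidence into 5 classes
ntds <- gdverse::NTDs
ntds$incidence <- sdsfun::discretize_vector(ntds$incidence, 5)
# V-measure (method = "vm") between incidence and each explanatory variable
itm(incidence ~ watershed + elevation + soiltype,
    data = ntds, method = "vm")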
## # A tibble: 3 × 3
##   Variable     Iv    Pv
##   <chr>     <dbl> <dbl>
## 1 watershed 0.373     0
## 2 elevation 0.365     0
## 3 soiltype  0.213     0