sphunif
: Uniformity Tests on
the Circle, Sphere, and HypersphereSuppose that you want to test if a sample of circular data is uniformly distributed. For example, the following circular uniform sample in radians:
Then you can call the main function in the sphunif
package, unif_test
, specifying the type
of
test to be performed. For example, the Watson (omnibus) test:
library(sphunif)
#> Loading required package: Rcpp
unif_test(data = cir_data, type = "Watson") # An htest object
#>
#> Watson test of circular uniformity
#>
#> data: cir_data
#> statistic = 0.03701, p-value = 0.8584
#> alternative hypothesis: any alternative to circular uniformity
By default, the calibration of the test statistic is done by Monte
Carlo. This can be changed with p_value = "asymp"
to use
the asymptotic distribution:
unif_test(data = cir_data, type = "Watson", p_value = "MC") # Monte Carlo
#> Loading required package: foreach
#> Loading required package: future
#> Loading required package: rngtools
#>
#> Watson test of circular uniformity
#>
#> data: cir_data
#> statistic = 0.03701, p-value = 0.8662
#> alternative hypothesis: any alternative to circular uniformity
unif_test(data = cir_data, type = "Watson", p_value = "asymp") # Asymp. distr.
#>
#> Watson test of circular uniformity
#>
#> data: cir_data
#> statistic = 0.03701, p-value = 0.8584
#> alternative hypothesis: any alternative to circular uniformity
You can perform several tests within a single call
to unif_test
. Choose the available circular uniformity
tests from
avail_cir_tests
#> [1] "Ajne" "Bakshaev" "Bingham" "Cressie"
#> [5] "CCF09" "FG01" "Gine_Fn" "Gine_Gn"
#> [9] "Gini" "Gini_squared" "Greenwood" "Hermans_Rasson"
#> [13] "Hodges_Ajne" "Kuiper" "Log_gaps" "Max_uncover"
#> [17] "Num_uncover" "PAD" "PCvM" "Poisson"
#> [21] "PRt" "Pycke" "Pycke_q" "Range"
#> [25] "Rao" "Rayleigh" "Riesz" "Rothman"
#> [29] "Sobolev" "Softmax" "Vacancy" "Watson"
#> [33] "Watson_1976"
For example, you can use the Projected Anderson–Darling (García-Portugués, Navarro-Esteban, and Cuesta-Albertos (2020), also an omnibus test) test and the Watson test:
# A *list* of htest objects
unif_test(data = cir_data, type = c("PAD", "Watson"))
#> Series truncated from 10000 to 10000 terms (difference <= 0 with the HBE tail probability; last weight = 1.137e-08).
#> $PAD
#>
#> Projected Anderson-Darling test of circular uniformity
#>
#> data: cir_data
#> statistic = 0.54217, p-value = 0.8403
#> alternative hypothesis: any alternative to circular uniformity
#>
#>
#> $Watson
#>
#> Watson test of circular uniformity
#>
#> data: cir_data
#> statistic = 0.03701, p-value = 0.8584
#> alternative hypothesis: any alternative to circular uniformity
García-Portugués and Verdebout (2018) gives a review of different tests of uniformity on the circle. Section 5.1 in Pewsey and García-Portugués (2021) also gives an overview of recent advances.
Suppose now that you want to test if a sample of spherical data is uniformly distributed. Consider a non-uniformly-generated1 sample of directions in Cartesian coordinates, such as:
# Sample data on S^2
set.seed(987202226)
theta <- runif(n = 30, min = 0, max = 2 * pi)
phi <- runif(n = 30, min = 0, max = pi)
sph_data <- cbind(cos(theta) * sin(phi), sin(theta) * sin(phi), cos(phi))
The available spherical uniformity tests:
avail_sph_tests
#> [1] "Ajne" "Bakshaev" "Bingham" "CCF09" "CJ12"
#> [6] "Gine_Fn" "Gine_Gn" "PAD" "PCvM" "Poisson"
#> [11] "PRt" "Pycke" "Sobolev" "Softmax" "Stereo"
#> [16] "Rayleigh" "Rayleigh_HD" "Riesz"
See again García-Portugués and Verdebout (2018) for a review of tests of uniformity on the sphere and complementary Section 5.1 in Pewsey and García-Portugués (2021).
The default type = "all"
equals
type = avail_sph_tests
, which might be too much for
standard analysis:
unif_test(data = sph_data, type = "all", p_value = "MC", M = 1e3)
#> $Ajne
#>
#> Ajne test of spherical uniformity
#>
#> data: sph_data
#> statistic = 0.057937, p-value = 0.996
#> alternative hypothesis: any non-axial alternative to spherical uniformity
#>
#>
#> $Bakshaev
#>
#> Bakshaev (2010) test of spherical uniformity
#>
#> data: sph_data
#> statistic = 1.0215, p-value = 0.623
#> alternative hypothesis: any alternative to spherical uniformity
#>
#>
#> $Bingham
#>
#> Bingham test of spherical uniformity
#>
#> data: sph_data
#> statistic = 17.573, p-value = 0.003
#> alternative hypothesis: scatter matrix different from constant
#>
#>
#> $CCF09
#>
#> Cuesta-Albertos et al. (2009) test of spherical uniformity with k = 50
#>
#> data: sph_data
#> statistic = 1.1355, p-value = 0.755
#> alternative hypothesis: any alternative to spherical uniformity
#>
#>
#> $CJ12
#>
#> Cai and Jiang (2012) test of spherical uniformity
#>
#> data: sph_data
#> statistic = 19.442, p-value = 0.784
#> alternative hypothesis: unclear consistency
#>
#>
#> $Gine_Fn
#>
#> Gine's Fn test of spherical uniformity
#>
#> data: sph_data
#> statistic = 1.5391, p-value = 0.387
#> alternative hypothesis: any alternative to spherical uniformity
#>
#>
#> $Gine_Gn
#>
#> Gine's Gn test of spherical uniformity
#>
#> data: sph_data
#> statistic = 1.3074, p-value = 0.002
#> alternative hypothesis: any axial alternative to spherical uniformity
#>
#>
#> $PAD
#>
#> Projected Anderson-Darling test of spherical uniformity
#>
#> data: sph_data
#> statistic = 0.91653, p-value = 0.493
#> alternative hypothesis: any alternative to spherical uniformity
#>
#>
#> $PCvM
#>
#> Projected Cramer-von Mises test of spherical uniformity
#>
#> data: sph_data
#> statistic = 0.12769, p-value = 0.623
#> alternative hypothesis: any alternative to spherical uniformity
#>
#>
#> $Poisson
#>
#> Poisson-kernel test of spherical uniformity with rho = 0.5
#>
#> data: sph_data
#> statistic = 1.4961, p-value = 0.155
#> alternative hypothesis: any alternative to spherical uniformity for rho > 0
#>
#>
#> $PRt
#>
#> Projected Rothman test of spherical uniformity with t = 0.333
#>
#> data: sph_data
#> statistic = 0.15523, p-value = 0.656
#> alternative hypothesis: any alternative to spherical uniformity if t is irrational
#>
#>
#> $Pycke
#>
#> Pycke test of spherical uniformity
#>
#> data: sph_data
#> statistic = 0.042839, p-value = 0.304
#> alternative hypothesis: any alternative to spherical uniformity
#>
#>
#> $Sobolev
#>
#> Finite Sobolev test of spherical uniformity with vk2 = c(0, 0, 1)
#>
#> data: sph_data
#> statistic = 4.3066, p-value = 0.757
#> alternative hypothesis: alternatives in the spherical harmonics subspace with vk2 ≠ 0
#>
#>
#> $Softmax
#>
#> Softmax test of spherical uniformit with kappa = 1
#>
#> data: sph_data
#> statistic = -0.065236, p-value = 0.485
#> alternative hypothesis: any alternative to spherical uniformity for kappa > 0
#>
#>
#> $Stereo
#>
#> Stereographic projection test of spherical uniformity with a = 0
#>
#> data: sph_data
#> statistic = 3.2993, p-value = 0.157
#> alternative hypothesis: any alternative to spherical uniformity for |a| < 1
#>
#>
#> $Rayleigh
#>
#> Rayleigh test of spherical uniformity
#>
#> data: sph_data
#> statistic = 0.13345, p-value = 0.99
#> alternative hypothesis: mean direction different from zero
#>
#>
#> $Rayleigh_HD
#>
#> HD-standardized Rayleigh test of spherical uniformity
#>
#> data: sph_data
#> statistic = -1.1703, p-value = 0.99
#> alternative hypothesis: mean direction different from zero
#>
#>
#> $Riesz
#>
#> Warning! This is an experimental test not meant to be used with s = 1
#>
#> data: sph_data
#> statistic = 1.0215, p-value = 0.623
#> alternative hypothesis: unclear, experimental test
unif_test(data = sph_data, type = "Rayleigh", p_value = "asymp")
#>
#> Rayleigh test of spherical uniformity
#>
#> data: sph_data
#> statistic = 0.13345, p-value = 0.9875
#> alternative hypothesis: mean direction different from zero
The hyperspherical setting is treated analogously to the
spherical setting, and the available tests are exactly the same
(avail_sph_tests
). Here is an example of testing uniformity
with a sample of size 100
on the \(9\)-sphere:
The following snippet partially reproduces the data application in
García-Portugués, Navarro-Esteban, and
Cuesta-Albertos (2021) on testing the uniformity of known
Venusian craters. Incidentally, it also illustrates how to monitor the
progress of a Monte Carlo simulation when p_value = "MC"
using progressr.
# Load spherical data
data(venus)
head(venus)
#> name diameter theta phi
#> 1 Janice 10.0 4.5724136 1.523672
#> 2 HuaMulan 24.0 5.8939769 1.514946
#> 3 Tatyana 19.0 3.7070793 1.490511
#> 4 Landowska 33.0 1.2967796 1.476025
#> 5 Ruslanova 44.3 0.2897247 1.465029
#> 6 Sveta 21.0 4.7684140 1.439181
nrow(venus)
#> [1] 967
# Compute Cartesian coordinates from polar coordinates
venus$X <- cbind(cos(venus$theta) * cos(venus$phi),
sin(venus$theta) * cos(venus$phi),
sin(venus$phi))
# Test uniformity using the Projected Cramér-von Mises and Projected
# Anderson-Darling tests on S^2, both using asymptotic distributions
unif_test(data = venus$X, type = c("PCvM", "PAD"), p_value = "asymp")
#> Series truncated from 10000 to 10000 terms (difference <= 0 with the HBE tail probability; last weight = 6.249e-14).
#> Series truncated from 10000 to 10000 terms (difference <= 0 with the HBE tail probability; last weight = 4.999e-13).
#> $PCvM
#>
#> Projected Cramer-von Mises test of spherical uniformity
#>
#> data: venus$X
#> statistic = 0.25844, p-value = 0.1272
#> alternative hypothesis: any alternative to spherical uniformity
#>
#>
#> $PAD
#>
#> Projected Anderson-Darling test of spherical uniformity
#>
#> data: venus$X
#> statistic = 1.5077, p-value = 0.1149
#> alternative hypothesis: any alternative to spherical uniformity
# Define a handler for progressr
require(progress)
#> Loading required package: progress
require(progressr)
#> Loading required package: progressr
handlers(handler_progress(
format = ":spin [:bar] :percent Total: :elapsedfull End \u2248 :eta",
clear = FALSE))
# Test uniformity using Monte-Carlo approximated null distributions
with_progress(
unif_test(data = venus$X, type = c("PCvM", "PAD"),
p_value = "MC", chunks = 1e2, M = 5e2, cores = 2)
)
#> $PCvM
#>
#> Projected Cramer-von Mises test of spherical uniformity
#>
#> data: venus$X
#> statistic = 0.25844, p-value = 0.116
#> alternative hypothesis: any alternative to spherical uniformity
#>
#>
#> $PAD
#>
#> Projected Anderson-Darling test of spherical uniformity
#>
#> data: venus$X
#> statistic = 1.5077, p-value = 0.104
#> alternative hypothesis: any alternative to spherical uniformity
unif_stat
allows to compute several statistics to
different samples within a single call, thus facilitating Monte Carlo
experiments:
# M samples of size n on S^2
samps_sph <- r_unif_sph(n = 30, p = 3, M = 10)
# Apply all statistics to the M samples
unif_stat(data = samps_sph, type = "all")
#> Ajne Bakshaev Bingham CCF09 CJ12 Gine_Fn Gine_Gn
#> 1 0.25642629 1.2740243 3.0921699 1.3219666 18.36313 1.3878146 0.3621095
#> 2 0.07117203 0.6772576 6.9893732 0.9149047 24.98233 0.9265608 0.6418727
#> 3 0.22589028 1.1723079 3.4333269 1.1726659 22.55802 1.3162019 0.4126408
#> 4 0.19451359 0.9883296 2.3956280 1.2898928 26.17551 1.0994251 0.3213708
#> 5 0.34557032 1.7121433 3.4357302 1.4104864 23.14723 1.8216276 0.4393463
#> 6 0.52106149 2.5014190 6.5814801 1.5830810 24.59617 2.6339081 0.5496621
#> 7 0.37727819 1.8175025 3.7137802 1.4601007 21.03363 1.9357469 0.4266342
#> 8 0.17723191 0.8598836 0.1801498 1.2168187 22.37936 0.9713328 0.2624051
#> 9 0.26937836 1.3273645 3.5939142 1.5021719 21.16927 1.4685129 0.3909994
#> 10 0.20070040 1.2037837 6.2135243 1.3720805 24.17573 1.4248372 0.6220356
#> PAD PCvM Poisson PRt Pycke Sobolev
#> 1 0.9395839 0.15925304 -0.57558355 0.2167113 -0.022374854 4.055000
#> 2 0.5945587 0.08465719 -1.21057040 0.1096982 -0.106597617 2.585482
#> 3 0.8936470 0.14653849 -0.41084583 0.1945375 -0.017132971 7.166466
#> 4 0.7531209 0.12354120 -1.29307400 0.1638546 -0.082808626 5.534846
#> 5 1.2281411 0.21401791 0.45232579 0.3046020 0.072236131 3.816622
#> 6 1.7442098 0.31267737 2.48260823 0.4279620 0.177738562 10.600319
#> 7 1.3082009 0.22718781 1.00822132 0.3062035 0.104767601 9.554070
#> 8 0.7010165 0.10748545 -0.62564687 0.1337058 -0.056249730 13.999276
#> 9 0.9970355 0.16592056 0.06934756 0.2155874 0.001478758 11.847208
#> 10 0.9429167 0.15047296 0.17326907 0.2031502 0.025172789 4.733793
#> Softmax Stereo Rayleigh Rayleigh_HD Riesz
#> 1 -0.04452033 0.97559309 3.1198646 0.048934523 1.2740243
#> 2 -0.31080002 -3.43368909 0.4292885 -1.049488573 0.6772576
#> 3 -0.10493392 -0.01949183 2.5215007 -0.195346508 1.1723079
#> 4 -0.18903811 -3.10566918 2.1572087 -0.344068123 0.9883296
#> 5 0.16466927 2.06393929 4.5929816 0.650331995 1.7121433
#> 6 0.61467268 1.59056142 7.1476194 1.693258549 2.5014190
#> 7 0.21878763 4.31581041 4.7983718 0.734182229 1.8175025
#> 8 -0.31301945 -1.07841571 1.4134453 -0.647708254 0.8598836
#> 9 -0.01942868 -0.85602632 3.0045142 0.001842922 1.3273645
#> 10 -0.08055493 1.98899355 2.2218008 -0.317698486 1.2037837
Additionally, unif_stat_MC
is an utility for performing
the previous simulation through a convenient parallel wrapper:
# Break the simulation in 10 chunks of tasks to be divided between 2 cores
sim <- unif_stat_MC(n = 30, type = "all", p = 3, M = 1e4, cores = 2,
chunks = 10)
# Critical values for test statistics
sim$crit_val_MC
#> Ajne Bakshaev Bingham CCF09 CJ12 Gine_Fn Gine_Gn PAD
#> 0.1 0.4536825 2.206532 9.128344 1.659555 25.98066 2.349693 0.7576948 1.557836
#> 0.05 0.5391082 2.600789 10.930555 1.774889 26.73748 2.705124 0.8794850 1.794267
#> 0.01 0.7332756 3.431860 14.964214 1.992331 27.95629 3.499274 1.1386400 2.320828
#> PCvM Poisson PRt Pycke Sobolev Softmax Stereo
#> 0.1 0.2758165 2.026086 0.3831610 0.1488173 11.94914 0.4475735 4.789295
#> 0.05 0.3250987 2.835183 0.4533819 0.2139743 13.95774 0.6426233 7.277562
#> 0.01 0.4289825 4.639202 0.6043427 0.3548239 18.42054 1.0694617 14.476883
#> Rayleigh Rayleigh_HD Riesz
#> 0.1 6.254501 1.328645 2.206532
#> 0.05 7.659546 1.902252 2.600789
#> 0.01 10.709374 3.147339 3.431860
# Test statistics
head(sim$stats_MC)
#> Ajne Bakshaev Bingham CCF09 CJ12 Gine_Fn Gine_Gn
#> 1 0.21694040 1.0775492 2.453843 1.161116 25.66338 1.1994472 0.3316856
#> 2 0.12807204 0.7713214 4.378542 1.002687 22.82851 0.9488295 0.4365413
#> 3 0.29297019 1.3318434 1.809964 1.475809 23.17901 1.4254091 0.2535283
#> 4 0.08683535 0.7481899 6.545450 1.050452 16.60092 1.0029567 0.6556153
#> 5 0.23053105 1.5359203 12.536621 1.425933 15.47708 1.9035017 0.9813775
#> 6 0.22807235 1.1421569 3.120667 1.088024 25.84960 1.2559271 0.3436377
#> PAD PCvM Poisson PRt Pycke Sobolev Softmax
#> 1 0.8240941 0.13469365 -0.7001860 0.1761758 -0.05307337 9.007942 -0.15359967
#> 2 0.6362508 0.09641517 -1.3265455 0.1188851 -0.10675328 7.122146 -0.28051417
#> 3 0.9845993 0.16648043 -0.1365134 0.2125231 -0.01783641 15.989324 -0.02079736
#> 4 0.6549005 0.09352374 -0.5870966 0.1193201 -0.07731601 6.343550 -0.29460718
#> 5 1.2113886 0.19199003 1.8253072 0.2476503 0.14422646 6.006292 0.13450569
#> 6 0.8495502 0.14276961 -0.9669085 0.1881344 -0.07465451 6.871270 -0.09297427
#> Stereo Rayleigh Rayleigh_HD Riesz
#> 1 -2.187542 2.3063524 -0.28318046 1.0775492
#> 2 -3.742812 1.0530639 -0.79483333 0.7713214
#> 3 -1.257477 3.2368961 0.09671245 1.3318434
#> 4 -2.709981 0.5177747 -1.01336422 0.7481899
#> 5 9.773694 2.5545518 -0.18185347 1.5359203
#> 6 -4.153019 2.6935451 -0.12510970 1.1421569
# Power computation using a pre-built sampler for the alternative
pow <- unif_stat_MC(n = 30, type = "all", p = 3, M = 1e4, cores = 2,
chunks = 10, r_H1 = r_alt, crit_val = sim$crit_val_MC,
alt = "vMF", kappa = 1)
pow$power_MC
#> Ajne Bakshaev Bingham CCF09 CJ12 Gine_Fn Gine_Gn PAD PCvM
#> 0.1 0.8180 0.8181 0.1379 0.7657 0.0945 0.8074 0.1405 0.8114 0.8181
#> 0.05 0.7305 0.7201 0.0786 0.6414 0.0467 0.7143 0.0791 0.7186 0.7201
#> 0.01 0.5083 0.4991 0.0186 0.4061 0.0099 0.4909 0.0182 0.4963 0.4991
#> Poisson PRt Pycke Sobolev Softmax Stereo Rayleigh Rayleigh_HD Riesz
#> 0.1 0.7545 0.8197 0.7903 0.1036 0.8153 0.5764 0.8181 0.8181 0.8181
#> 0.05 0.6495 0.7248 0.6891 0.0501 0.7197 0.4029 0.7275 0.7275 0.7201
#> 0.01 0.4222 0.5061 0.4555 0.0084 0.4954 0.1135 0.5099 0.5099 0.4991
# How to use a custom sampler according to ?unif_stat_MC
r_H1 <- function(n, p, M, l = 1) {
samp <- array(dim = c(n, p, M))
for (j in 1:M) {
samp[, , j] <- mvtnorm::rmvnorm(n = n, mean = c(l, rep(0, p - 1)),
sigma = diag(rep(1, p)))
samp[, , j] <- samp[, , j] / sqrt(rowSums(samp[, , j]^2))
}
return(samp)
}
pow <- unif_stat_MC(n = 30, type = "all", p = 3, M = 1e2, cores = 2,
chunks = 5, r_H1 = r_H1, crit_val = sim$crit_val_MC)
pow$power_MC
#> Ajne Bakshaev Bingham CCF09 CJ12 Gine_Fn Gine_Gn PAD PCvM Poisson PRt
#> 0.1 0.99 0.99 0.47 0.99 0.08 0.99 0.47 0.99 0.99 0.99 0.99
#> 0.05 0.99 0.99 0.35 0.97 0.06 0.99 0.31 0.99 0.99 0.98 0.99
#> 0.01 0.97 0.96 0.17 0.93 0.01 0.96 0.12 0.96 0.96 0.93 0.97
#> Pycke Sobolev Softmax Stereo Rayleigh Rayleigh_HD Riesz
#> 0.1 0.99 0.11 0.99 0.97 0.99 0.99 0.99
#> 0.05 0.99 0.03 0.99 0.90 0.99 0.99 0.99
#> 0.01 0.94 0.01 0.96 0.65 0.97 0.97 0.96
unif_stat_MC
can be used for constructing the Monte
Carlo calibration necessary for unif_test
, either for
providing a rejection rule based on $crit_val_MC
or for
approximating the \(p\)-value by
$stats_MC
.
# Using precomputed critical values
ht1 <- unif_test(data = samps_sph[, , 1], type = "Rayleigh",
p_value = "crit_val", crit_val = sim$crit_val_MC)
ht1
#>
#> Rayleigh test of spherical uniformity
#>
#> data: samps_sph[, , 1]
#> statistic = 3.1199, p-value = NA
#> alternative hypothesis: mean direction different from zero
ht1$reject
#> 0.1 0.05 0.01
#> TRUE TRUE TRUE
# Using precomputed Monte Carlo statistics
ht2 <- unif_test(data = samps_sph[, , 1], type = "Rayleigh",
p_value = "MC", stats_MC = sim$stats_MC)
ht2
#>
#> Rayleigh test of spherical uniformity
#>
#> data: samps_sph[, , 1]
#> statistic = 3.1199, p-value = 0.9998
#> alternative hypothesis: mean direction different from zero
ht2$reject
#> 0.1 0.05 0.01
#> TRUE TRUE TRUE
# Faster than
unif_test(data = samps_sph[, , 1], type = "Rayleigh", p_value = "MC")
#>
#> Rayleigh test of spherical uniformity
#>
#> data: samps_sph[, , 1]
#> statistic = 3.1199, p-value = 0.3739
#> alternative hypothesis: mean direction different from zero
Uniformly-distributed polar coordinates do not translate into uniformly-distributed Cartesian coordinates!↩︎