The taxodist distance matrix is a standard
dist object and integrates directly with the R ecosystem
for multivariate analysis. This vignette shows how to use it alongside
vegan and ape for ecological and phylogenetic
analyses. All examples use a set of threatened mammals from Brazil
spanning five broad taxonomic groups.
taxa_brazil <- c(
"Priodontes", "Myrmecophaga", "Chrysocyon", "Tapirus", "Didelphis",
"Leontopithecus", "Brachyteles",
"Panthera", "Pteronura", "Puma",
"Sotalia", "Pontoporia",
"Trichechus", "Mazama", "Blastocerus"
)The dataset covers xenarthrans (Priodontes,
Myrmecophaga), a marsupial (Didelphis),
ungulates (Tapirus, Mazama,
Blastocerus), primates (Leontopithecus,
Brachyteles), carnivores (Chrysocyon,
Panthera, Pteronura, Puma),
cetaceans (Sotalia, Pontoporia), and a
sirenian (Trichechus).
The pairwise distance matrix is computed with a single call. Lineages are cached after the first retrieval, so subsequent analyses on the same taxa are instantaneous.
library(taxodist)
mat <- distance_matrix(taxa_brazil)
print(mat)
#> Priodontes Myrmecophaga Chrysocyon Tapirus Didelphis
#> Myrmecophaga 0.01639344
#> Chrysocyon 0.01694915 0.01694915
#> Tapirus 0.01694915 0.01694915 0.01587302
#> Didelphis 0.01754386 0.01754386 0.01754386 0.01754386
#> Leontopithecus 0.01694915 0.01694915 0.01666667 0.01666667 0.01754386
#> Brachyteles 0.01694915 0.01694915 0.01666667 0.01666667 0.01754386
#> Panthera 0.01694915 0.01694915 0.01492537 0.01587302 0.01754386
#> Pteronura 0.01694915 0.01694915 0.01470588 0.01587302 0.01754386
#> Puma 0.01694915 0.01694915 0.01492537 0.01587302 0.01754386
#> Sotalia 0.01694915 0.01694915 0.01587302 0.01562500 0.01754386
#> Pontoporia 0.01694915 0.01694915 0.01587302 0.01562500 0.01754386
#> Trichechus 0.01666667 0.01666667 0.01694915 0.01694915 0.01754386
#> Mazama 0.01694915 0.01694915 0.01587302 0.01562500 0.01754386
#> Blastocerus 0.01694915 0.01694915 0.01587302 0.01562500 0.01754386Didelphis shows the largest distances to all other taxa
(0.01754), consistent with the early divergence of marsupials from
placental mammals. Carnivores (Panthera,
Pteronura, Puma) show smaller distances among
themselves, reflecting a more recent common ancestor within
Carnivora.
apetaxo_cluster() performs hierarchical clustering on the
distance matrix and returns an object that ape can convert
directly to a phylo tree for publication-quality
visualisation.
library(ape)
cl <- taxo_cluster(taxa_brazil)
tree <- ape::as.phylo(cl$hclust)
plot(tree,
main = "Threatened Mammals of Brazil",
cex = 0.85,
tip.color = "gray20")The phylo object is fully compatible with all
ape tree manipulation and plotting functions, including
nodelabels(), edgelabels(), and export to
Newick format via write.tree().
vegantaxondive)vegan::taxondive() computes a family of taxonomic
diversity indices (Clarke and Warwick 1998, 2001) from a community
matrix and a taxonomic distance matrix. These indices capture not just
species richness but how distantly related the species in a community
are: a community of closely related species scores lower than one with
the same richness but spanning multiple orders.
The function expects a community matrix (sites x species) and the
dist object from distance_matrix(). Here we
define three hypothetical communities, each dominated by one broad
taxonomic group.
set.seed(123)
library(vegan)
comm <- matrix(c(
1, 1, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, # xenarthrans + marsupial
0, 0, 0, 0, 0, 1, 1, 1, 1, 1, 0, 0, 0, 0, 0, # primates + carnivores
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 1, 1, 1, 1 # cetaceans + sirenian + ungulates
), nrow = 3, byrow = TRUE)
colnames(comm) <- taxa_brazil
rownames(comm) <- c("community_A", "community_B", "community_C")
vegan::taxondive(comm, mat)
#> Species Delta Delta* Lambda+ Delta+ S Delta+
#> community_A 3.0000e+00 1.7160e-02 1.7160e-02 2.9410e-07 1.7160e-02 0.0515
#> community_B 5.0000e+00 1.5905e-02 1.5905e-02 8.8325e-07 1.5905e-02 0.0795
#> community_C 5.0000e+00 1.5750e-02 1.5750e-02 1.1834e-06 1.5750e-02 0.0788
#> Expected -9.9265e-02 1.6544e-02 1.6457e-02 The key indices are:
Delta (\(\Delta\)): mean taxonomic distance between all pairs of species in the community.
Delta* (\(\Delta^*\)): mean pairwise distance excluding self-comparisons; comparable across communities of different sizes.
Delta+ (\(\Delta^+\)): mean pairwise distance weighted by species abundances. For presence-absence data this equals \(\Delta^*\).
Lambda+ (\(\Lambda^+\)): variance in taxonomic distances, measuring how evenly a community spans the taxonomic tree.
community_A, dominated by xenarthrans and a marsupial,
scores the highest \(\Delta\)
(0.01673), reflecting deeper divergences among its members.
community_B (primates and carnivores) scores the lowest
(0.01591), as its members share more recent common ancestors within
Placentalia.
veganThe Mantel test assesses the correlation between two distance matrices by permutation. A typical application is testing whether taxonomic distance correlates with geographic distance, a signature of phylogeographic structure.
Here we use simulated coordinates for illustration. In practice,
replace coords with observed latitude and longitude
values.
set.seed(42)
coords <- matrix(rnorm(30), ncol = 2)
rownames(coords) <- taxa_brazil
geo_dist <- dist(coords)
vegan::mantel(mat, geo_dist)
#> Mantel statistic based on Pearson's product-moment correlation
#>
#> Call:
#> vegan::mantel(xdis = mat, ydis = geo_dist)
#>
#> Mantel statistic r: -0.0569
#> Significance: 0.653
#>
#> Upper quantiles of permutations (null model):
#> 90% 95% 97.5% 99%
#> 0.188 0.237 0.272 0.336
#> Permutation: free
#> Number of permutations: 999The non-significant result (r = -0.27, p = 0.972) is expected with
random coordinates. For non-normal distance distributions,
"spearman" is more robust than the default
"pearson".
vegan::mantel(mat, geo_dist, method = "spearman", permutations = 9999)
#> Mantel statistic based on Spearman's rank correlation rho
#>
#> Call:
#> vegan::mantel(xdis = mat, ydis = geo_dist, method = "spearman",
#> permutations = 9999)
#>
#> Mantel statistic r: -0.07405
#> Significance: 0.672
#>
#> Upper quantiles of permutations (null model):
#> 90% 95% 97.5% 99%
#> 0.189 0.244 0.293 0.353
#> Permutation: free
#> Number of permutations: 9999veganPERMANOVA (Anderson 2001) tests whether groups of taxa differ significantly in taxonomic distance space. It partitions total variance in the distance matrix into within-group and between-group components and assesses significance by permutation, with no assumption of multivariate normality.
groups <- c(
"xenarthra", "xenarthra", "carnivore", "ungulate", "marsupial",
"primate", "primate",
"carnivore", "carnivore", "carnivore",
"cetacean", "cetacean",
"sirenian", "ungulate", "ungulate"
)
vegan::adonis2(mat ~ groups)
#> Permutation test for adonis under reduced model
#> Permutation: free
#> Number of permutations: 999
#>
#> vegan::adonis2(formula = mat ~ groups)
#> Df SumOfSqs R2 F Pr(>F)
#> Model 6 0.00100049 0.52646 1.4823 0.001 ***
#> Residual 8 0.00089992 0.47354
#> Total 14 0.00190041 1.00000
#> ---
#> Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1The taxonomic grouping explains \(49.6\%\) of the total variance in the distance matrix (\(\text{R}^2 = 0.496\), \(\text{p} = 0.001\)). This confirms that the broad mammalian orders occupy distinct regions of taxonomic distance space, consistent with their deep evolutionary divergences.
For pairwise comparisons between groups after a significant
PERMANOVA, see pairwise.adonis2() in the
pairwiseAdonis package.
The following script chains all steps from taxon names to multivariate analysis.
library(taxodist)
library(vegan)
library(ape)
# 1. define taxa
taxa_brazil <- c(
"Priodontes", "Myrmecophaga", "Chrysocyon", "Tapirus", "Didelphis",
"Leontopithecus", "Brachyteles",
"Panthera", "Pteronura", "Puma",
"Sotalia", "Pontoporia",
"Trichechus", "Mazama", "Blastocerus"
)
# 2. compute distance matrix
mat <- distance_matrix(taxa_brazil)
# 3. hierarchical clustering and phylo plot
cl <- taxo_cluster(taxa_brazil)
tree <- ape::as.phylo(cl$hclust)
plot(tree, main = "Threatened Mammals of Brazil", cex = 0.85)
# 4. PCoA ordination
ord <- taxo_ordinate(mat)
plot(ord, main = "PCoA: Taxonomic Distance Space")
# 5. PERMANOVA
groups <- c(
"xenarthra", "xenarthra", "carnivore", "ungulate", "marsupial",
"primate", "primate",
"carnivore", "carnivore", "carnivore",
"cetacean", "cetacean",
"sirenian", "ungulate", "ungulate"
)
vegan::adonis2(mat ~ groups)Anderson, M.J. (2001). A new method for non-parametric multivariate analysis of variance. Austral Ecology, 26, 32–46.
Clarke, K.R. and Warwick, R.M. (1998). A taxonomic distinctness index and its statistical properties. Journal of Applied Ecology, 35, 523–531.
Clarke, K.R. and Warwick, R.M. (2001). A further biodiversity index applicable to species lists: variation in taxonomic distinctness. Marine Ecology Progress Series, 216, 265–278.
Brands, S.J. (1989 onwards). Systema Naturae 2000. Amsterdam, The Netherlands. Retrieved from The Taxonomicon, http://taxonomicon.taxonomy.nl.