Title: | Spatial Phylogenetic Analysis |
Version: | 1.1.1 |
Description: | Analyze spatial phylogenetic diversity patterns. Use your data on an evolutionary tree and geographic distributions of the terminal taxa to compute diversity and endemism metrics, test significance with null model randomization, analyze community turnover and biotic regionalization, and perform spatial conservation prioritizations. All functions support quantitative community data in addition to binary data. |
License: | MIT + file LICENSE |
Encoding: | UTF-8 |
RoxygenNote: | 7.3.2 |
Imports: | ape, sf, stats, terra, vegan |
Suggests: | canaper, furrr, future, testthat (≥ 3.0.0), betapart, knitr, rmarkdown, tmap, magrittr, hillR, picante, phytools |
URL: | https://matthewkling.github.io/phylospatial/, https://github.com/matthewkling/phylospatial |
Config/testthat/edition: | 3 |
Depends: | R (≥ 3.5) |
VignetteBuilder: | knitr |
BugReports: | https://github.com/matthewkling/phylospatial/issues |
NeedsCompilation: | no |
Packaged: | 2025-05-02 13:34:53 UTC; matthewkling |
Author: | Matthew Kling |
Maintainer: | Matthew Kling <mattkling@berkeley.edu> |
Repository: | CRAN |
Date/Publication: | 2025-05-02 14:00:02 UTC |
phylospatial: Spatial Phylogenetic Analysis
Description
Analyze spatial phylogenetic diversity patterns. Use your data on an evolutionary tree and geographic distributions of the terminal taxa to compute diversity and endemism metrics, test significance with null model randomization, analyze community turnover and biotic regionalization, and perform spatial conservation prioritizations. All functions support quantitative community data in addition to binary data.
Author(s)
Maintainer: Matthew Kling mattkling@berkeley.edu (ORCID) [copyright holder]
See Also
Useful links:
Report bugs at https://github.com/matthewkling/phylospatial/issues
Calculate taxon conservation benefit
Description
Nonlinear function that converts proportion of range conserved into conservation "benefit."
Usage
benefit(x, lambda = 1)
Arguments
x |
Fraction of taxon range protected (value between 0 and 1). |
lambda |
Shape parameter. |
Value
Value between 0 and 1.
Pairwise distances among clades or nodes
Description
This function runs ape::dist.nodes()
with some additional filtering and sorting. By default,
it returns distances between every pair of non-nested clades, i.e. every pair of collateral (non-lineal) nodes
including terminals and internal nodes.
Usage
clade_dist(tree, lineal = FALSE, edges = TRUE)
Arguments
tree |
A phylogeny of class |
lineal |
Logical indicating whether to retain distances for pairs of nodes that are lineal ancestors/descendants.
If |
edges |
Logical indicating whether to return a distance matrix with a row for every edge in |
Value
A matrix of pairwise distances between nodes.
Examples
clade_dist(ape::rtree(10))
Load California moss spatial phylogenetic data
Description
Get example phylospatial
data set based on a phylogeny and modeled distributions of 443 moss species
across California. This data set is a coarser version of data from Kling et al. (2024). It contains
occurrence probabilities, and is available in raster or polygon spatial formats.
Usage
moss(format = "raster")
Arguments
format |
Either "raster" (default) or "polygon" |
Value
a phylospatial
object
Source
Kling, Gonzalez-Ramirez, Carter, Borokini, and Mishler (2024) bioRxiv, https://doi.org/10.1101/2024.12.16.628580.
Examples
moss()
Create a spatial phylogenetic object
Description
This function creates a phylospatial
object. This is the core data type in the phylospatial library, and
is a required input to most other functions in the package. The two essential components of a spatial phylogenetic object
are a phylogenetic tree and an community data set.
Usage
phylospatial(
comm,
tree = NULL,
spatial = NULL,
data_type = c("auto", "probability", "binary", "abundance", "other"),
clade_fun = NULL,
build = TRUE,
check = TRUE,
area_tol = 0.01
)
Arguments
comm |
Community data representing the distribution of terminal taxa across sites. Can be a matrix with a column per terminal and
a row per site, a SpatRaster with one layer per terminal, or a |
tree |
Phylogeny of class phylo. Terminals whose names do not match |
spatial |
An optional |
data_type |
Character giving the data type of |
clade_fun |
Function to calculate the local community weight for a clade based on community weights for tips found in a given location.
Must be either NULL (the default, in which case the default function for the selected |
build |
Logical indicating whether |
check |
Logical indicating whether community data should be validated. Default is TRUE. |
area_tol |
Numeric value giving tolerance for variation in the area of sites. Default is |
Details
This function formats the input data as a phylospatial
object. Beyond validating, cleaning, and restructing the data, the main operation
it performs is to compute community occurrence data for every internal clade on the tree. For a given clade and site, community data for
all the terminals in the clade are used to calculate the clade's occurrence value in the site. As described above, this calculation can
happen in various ways, depending on what type of community data you have (e.g. binary, probability, or abundance) and how you want to
summarize them. By default, the function tries to detect your data_type
and use it to automatically select an appropriate summary
function as described above, but you can override this by providing your own function to clade_fun
.
You can also disable construction of the clade community matrix columns altogether by setting build = FALSE
). This is atypical, but you
might want to use this option if you have your own distribution data data on all clades (e.g. from modeling occurrence probabilities for
clades in addition to terminal species), or if your community data comes from a previously-constructed phylospatial
object.
Value
A phylospatial
object, which is a list containing the following elements:
- "data_type":
Character indicating the community data type
- "tree":
Phylogeny of class
phylo
- "comm":
Community matrix, including a column for every terminal taxon and every larger clade. Column order corresponds to tree edge order.
- "spatial":
A
SpatRaster
orsf
providing spatial coordinates for the rows incomm
. May be missing if no spatial data was supplied.- "dissim":
A community dissimilary matrix of class
dist
indicating pairwise phylogenetic dissimilarity between sites. Missing unlessps_dissim(..., add = TRUE)
is called.
Examples
# load species distribution data and phylogeny
comm <- terra::rast(system.file("extdata", "moss_comm.tif", package = "phylospatial"))
tree <- ape::read.tree(system.file("extdata", "moss_tree.nex", package = "phylospatial"))
# construct `phylospatial` object
ps <- phylospatial(comm, tree)
ps
Plot a phylospatial
object
Description
Plot a phylospatial
object
Usage
## S3 method for class 'phylospatial'
plot(x, y = c("tree", "comm"), max_taxa = 12, ...)
Arguments
x |
|
y |
Either |
max_taxa |
Integer giving the maximum number of taxon ranges to plot if |
... |
Additional arguments passed to plotting methods, depending on |
Value
A plot of the tree or community data.
Examples
ps <- ps_simulate()
plot(ps, "tree")
plot(ps, "comm")
Plot alternative lambda values
Description
Show a plot illustrating alternative values for the lambda
parameter in ps_prioritize. Lambda determines the shape of
the "benefit" function that determines the conservation value of protecting a given proportion of the geographic range of a
species or clade. Positive values place a higher priority on protecting additional populations of largely unprotected taxa,
whereas negative values place a higher priority on protecting additional populations of relatively well-protected taxa. The
default value used by ps_prioritize is 1.
Usage
plot_lambda(lambda = c(-1, -0.5, 0, 0.5, 2, 1))
Arguments
lambda |
A vector of lambda values to plot |
Value
Plots a figure
Examples
plot_lambda()
plot_lambda(seq(0, 3, .1))
Add community dissimilarity data to a phylospatial
object
Description
This function calculates pairwise phylogenetic dissimilarity between communities and returns the phylospatial
object with the dissimilarity data added as an element called dissim
. See ps_dissim for details.
Usage
ps_add_dissim(ps, method = "sorensen", ...)
Arguments
ps |
|
method |
Dissimilarity metric; see ps_dissim for details. |
... |
Additional arguments passed to ps_dissim, such as |
Value
ps
with a new dissim
element added.
Examples
ps <- ps_simulate(data_type = "prob")
ps_add_dissim(ps)
ps_add_dissim(ps, fun = "vegdist", method = "jaccard", endemism = TRUE)
Categorical Analysis of Neo- and Paleo-Endemism (CANAPE)
Description
This function classifies sites into areas of significant endemism according to the scheme of Mishler et al. (2014). Categorization is based on randomization quantile values for PE, RPE, and CE (which Mishler et al. call "PE on the comparison tree").
Usage
ps_canape(rand, alpha = 0.05)
Arguments
rand |
An object returned by running |
alpha |
Numeric value between 0 and 1 giving the one-tailed p-value threshold to use when determining significance. |
Details
Endemism significance categories are defined as follows:
Endemism not significant: neither PE nor CE are significantly high at
alpha
.Significant neoendemism: PE or CE are significantly high at
alpha
; RPE significantly low atalpha / 2
(two-tailed test).Significant paleoendemism: PE or CE are significantly high at
alpha
; RPE significantly high atalpha / 2
(two-tailed test)..Significant mixed-endemism: PE or CE are significantly high at
alpha
; RPE not significant.Significant super-endemism: PE or CE are significantly high at
alpha / 5
; RPE not significant.
Value
An object of the same class as rand
containing a variable called "canape"
, with values 0-4
corresponding to not-significant, mixed-, super-, neo-, and paleo-endemism, respectively.
References
Mishler, B. D., Knerr, N., González-Orozco, C. E., Thornhill, A. H., Laffan, S. W., & Miller, J. T. (2014). Phylogenetic measures of biodiversity and neo-and paleo-endemism in Australian Acacia. Nature Communications, 5(1), 4473.
Examples
# classic CANAPE using binary data and the curveball algorithm
# (note that a real analysis would require a much higher `n_rand`)
set.seed(123456)
ps <- ps_simulate(data_type = "binary")
rand <- ps_rand(ps, metric = c("PE", "RPE", "CE"),
fun = "nullmodel", method = "curveball",
n_rand = 25, burnin = 10000, progress = FALSE)
canape <- ps_canape(rand)
terra::plot(canape)
Binary randomization tests including CANAPE
Description
This function is a wrapper around canaper::cpr_rand_test()
. It only works with binary community data.
It is largely redundant with ps_rand()
and ps_canape()
, which are more flexible in supporting data sets
with non-binary community data. However, this function runs faster, and supports custom null models via
make.commsim.
Usage
ps_canaper(ps, null_model = "curveball", spatial = TRUE, ...)
Arguments
ps |
phylospatial object |
null_model |
see |
spatial |
Logical: should the function return a spatial object (TRUE, default) or a vector (FALSE). |
... |
further arguments passed to |
Details
This function runs canaper::cpr_rand_test()
; see the help for that function for details.
It also runs canaper::cpr_classify_endem()
on the result, and includes the resulting classification as an
additional variable, 'endem_type', in the output. 'endem_type' values 0-4 correspond to not-significant, neo,
paleo, mixed, and super endemesim, respectively.
Value
A matrix
or SpatRaster
, or sf
with a column or layer for each metric.
References
Mishler, B. D., Knerr, N., González-Orozco, C. E., Thornhill, A. H., Laffan, S. W., & Miller, J. T. (2014). Phylogenetic measures of biodiversity and neo-and paleo-endemism in Australian Acacia. Nature Communications, 5(1), 4473.
Nitta, J. H., Laffan, S. W., Mishler, B. D., & Iwasaki, W. (2023). canaper: categorical analysis of neo‐and paleo‐endemism in R. Ecography, 2023(9), e06638.
See Also
Examples
if(requireNamespace("canaper")){
ps <- ps_simulate(data_type = "binary")
terra::plot(ps_canaper(ps)$pd_obs_p_upper)
}
Quantitative phylogenetic dissimilarity
Description
This function calculates pairwise phylogenetic dissimilarity between communities. It works with both binary and
quantitative community data sets. A wide range of phylogentic community dissimilarity metrics are supported,
including phylogenetic Sorensen's and Jaccard's distances, turnover and nestedness components of Sorensen's distance
(Baselga & Orme, 2012), and phylogenetic versions of all community distance indices provided through the vegan
library.
The function also includes options to scale the community matrix in order to focus the analysis on endemism and/or
on proportional differences in community composition. The results from this function can be visualized using
ps_rgb or ps_regions, or used in a variety of statistical analyses.
Usage
ps_dissim(
ps,
method = "sorensen",
fun = c("vegdist", "designdist", "chaodist"),
endemism = FALSE,
normalize = FALSE,
...
)
Arguments
ps |
phylospatial object. |
method |
Character indicating the dissimilarity index to use:
|
fun |
Character indicating which general distance function from the |
endemism |
Logical indicating whether community values should be divided by column totals (taxon range sizes) to derive endemism before computing distances. |
normalize |
Logical indicating whether community values should be divided by row totals (community sums) before
computing distances. If |
... |
Additional arguments passed to |
Value
A pairwise phylogenetic dissimilarity matrix of class dist
.
References
Graham, C. H., & Fine, P. V. (2008). Phylogenetic beta diversity: linking ecological and evolutionary processes across space in time. Ecology Letters, 11(12), 1265-1277.
Baselga, A., & Orme, C. D. L. (2012). betapart: an R package for the study of beta diversity. Methods in Ecology and Evolution, 3(5), 808-812.
Pavoine, S. (2016). A guide through a family of phylogenetic dissimilarity measures among sites. Oikos, 125(12), 1719-1732.
See Also
Examples
# example data set:
ps <- ps_simulate(n_tips = 50)
# The default arguments give Sorensen's quantitative dissimilarity index
# (a.k.a. Bray-Curtis distance):
d <- ps_dissim(ps)
# Specifying a custom formula explicitly via `designdist`;
# (this is the Bray-Curtis formula, so it's equivalent to the prior example)
d <- ps_dissim(ps, method = "(b+c)/(2*a+b+c)",
fun = "designdist", terms = "minimum", abcd = TRUE)
# Alternative arguments can specify a wide range of dissimilarity measures;
# here's endemism-weighted Jaccard's dissimilarity:
d <- ps_dissim(ps, method = "jaccard", endemism = TRUE)
Calculate spatial phylogenetic diversity and endemism metrics
Description
This function calculates a variety of alpha phylogenetic diversity metrics, including measures of richness, regularity, and divergence. If continuous community data (probabilities or abundances) are provided, they are used in calculations, giving quantitative versions of the classic binary metrics.
Usage
ps_diversity(ps, metric = c("PD", "PE", "CE", "RPE"), spatial = TRUE)
Arguments
ps |
phylospatial object (created by |
metric |
Character vector containing the abbreviation for one or more diversity metrics listed in
the details below. Can also specify |
spatial |
Logical: should the function return a spatial object (TRUE, default) or a matrix (FALSE)? |
Details
The function calculates the following metrics. Endemism-weighted versions of most metrics are available. All metrics are weighted by occurrence probability or abundance, if applicable.
Richness measures:
-
TD—Terminal Diversity, i.e. richness of terminal taxa (in many cases these are species):
\sum_{t}{p_t}
-
TE—Terminal Endemism, i.e. total endemism-weighted diversity of terminal taxa, a.k.a. "weighted endemism":
\sum_{t}{p_t r_t^{-1}}
-
CD—Clade Diversity, i.e. richness of taxa at all levels (equivalent to PD on a cladogram):
\sum_{b}{p_b}
-
CE—Clade Endemism, i.e. total endemism-weighted diversity of taxa at all levels (equivalent to PE on a cladrogram):
\sum_{b}{p_b r_b^{-1}}
-
PD—Phylogenetic Diversity, i.e. total branch length occurring in a site:
\sum_{b}{L_b p_b}
-
PE—Phylogenetic Endemism, i.e. endemism-weighted PD:
\sum_{b}{L_b p_b r_b^{-1}}
-
ShPD—Shannon Phylogenetic Diversity, a.k.a. "phylogenetic entropy" (this version is the log of the "effective diversity" version based on Hill numbers):
-\sum_{b}{L_b n_b log(n_b)}
-
ShPE—Shannon phylogenetic Endemism, an endemism-weighted version of ShPD:
-\sum_{b}{L_b n_b log(e_b) r_b^{-1}}
-
SiPD—Simpson Phylogenetic Diversity:
1 / \sum_{b}{L_b n_b^2}
-
SiPE—Simpson Phylogenetic Endemism, an endemism-weighted version of SiPD:
1 / \sum_{b}{L_b r_b^{-1} e_b^2}
Divergence measures:
-
RPD—Relative Phylogenetic Diversity, i.e. mean branch segment length (equivalent to PD / CR):
\sum_{b}{L_b p_b} / \sum_{b}{p_b}
-
RPE—Relative Phylogenetic Endemism, i.e. mean endemism-weighted branch segment length (equivalent to PE / CE):
\sum_{b}{L_b p_b r_b^{-1}} / \sum_{b}{p_b r_b^{-1}}
-
MPDT—Mean Pairwise Distance between Terminals, i.e. the classic MPD metric. This is the average of cophenetic distances, weighted by
p_t
. -
MPDN—Mean Pairwise Distance between Nodes, an experimental version of MPD that considers distances between every pair of non-nested clades, putting more weight on deeper branches than does MPDT. This is the mean of distances between all collateral (non-lineal) node pairs including terminal and internal nodes, weighted by
p_b
. Note that divergence can also be assessed by using
ps_rand()
to run null model analyses of richness measures like PD.
Regularity measures:
-
VPDT—Variance in Pairwise Distances between Terminals, i.e. the classic VPD metric, weighted by
p_t
. -
VPDN—Variance in Pairwise Distances between Nodes, i.e. MPDN but variance.
In the above equations, b
indexes all taxa including terminals and larger clades; t
indexes terminals only; p_i
is the occurrence value
(binary, probability, or abundance) of clade/terminal i
in a given community; L_b
is the
length of the phylogenetic branch segment unique to clade b
; and r_i
is the sum of p_i
across all sites. For Shannon
and Simpson indices, only nonzero elements of p_b
are used, n_b = p_b / \sum_{b}{p_b L_b}
, and e_b = p_b / \sum_{b}{p_b L_b r_b^{-1}}
.
Value
A matrix, sf
data frame, or SpatRaster
with a column or layer for each requested diversity metric.
References
Faith, D. P. (1992). Conservation evaluation and phylogenetic diversity. Biological Conservation, 61(1), 1-10.
Laffan, S. W., & Crisp, M. D. (2003). Assessing endemism at multiple spatial scales, with an example from the Australian vascular flora. Journal of Biogeography, 30(4), 511-520.
Rosauer, D. A. N., Laffan, S. W., Crisp, M. D., Donnellan, S. C., & Cook, L. G. (2009). Phylogenetic endemism: a new approach for identifying geographical concentrations of evolutionary history. Molecular Ecology, 18(19), 4061-4072.
Allen, B., Kon, M., & Bar-Yam, Y. (2009). A new phylogenetic diversity measure generalizing the Shannon index and its application to phyllostomid bats. The American Naturalist, 174(2), 236-243.
Chao, A., Chiu, C. H., & Jost, L. (2010). Phylogenetic diversity measures based on Hill numbers. Philosophical Transactions of the Royal Society B: Biological Sciences, 365(1558), 3599-3609.
Mishler, B. D., Knerr, N., González-Orozco, C. E., Thornhill, A. H., Laffan, S. W., & Miller, J. T. (2014). Phylogenetic measures of biodiversity and neo-and paleo-endemism in Australian Acacia. Nature Communications, 5(1), 4473.
Tucker, C. M., Cadotte, M. W., Davies, T. J., et al. (2016) A guide to phylogenetic metrics for conservation, community ecology and macroecology. Biological Reviews, 92(2), 698-715.
Kling, M. M., Mishler, B. D., Thornhill, A. H., Baldwin, B. G., & Ackerly, D. D. (2019). Facets of phylodiversity: evolutionary diversification, divergence and survival as conservation targets. Philosophical Transactions of the Royal Society B, 374(1763), 20170397.
Examples
ps <- ps_simulate()
div <- ps_diversity(ps)
terra::plot(div)
Get phylospatial
community data
Description
Get phylospatial
community data
Usage
ps_get_comm(ps, tips_only = TRUE, spatial = TRUE)
Arguments
ps |
|
tips_only |
Logical indicating whether only the terminal taxa (TRUE, the default) or all taxa (FALSE) should be returned. |
spatial |
Logical indicating whether a spatial ( |
Value
Either a SpatRaster
with a layer for every taxon, or an sf
data frame with a variable for every taxon,
depending on which data type was used to create ps
.
Examples
ps <- ps_simulate()
# the defaults return a spatial object of terminal taxa distributions:
ps_get_comm(ps)
# get distributions for all taxa, as a matrix
pcomm <- ps_get_comm(ps, tips_only = FALSE, spatial = FALSE)
Community phylogenetic ordination
Description
Perform an ordination that reduces a spatial phylogenetic data set into k
dimensions, using one of
several alternative ordination algorithms.
Usage
ps_ordinate(ps, method = c("nmds", "cmds", "pca"), k = 3, spatial = TRUE)
Arguments
ps |
A |
method |
Ordination method, either "pca" (principal component analysis implemented via |
k |
Positive integer giving the desired number of output dimensions; default is |
spatial |
Logical indicating whether a spatial object (inherited from |
Value
A matrix or spatial object with k
variables.
See Also
For visualization using ordination onto RGB color space, see ps_rgb()
.
Examples
ps <- ps_add_dissim(ps_simulate(50, 5, 5))
ord <- ps_ordinate(ps, method = "cmds", k = 4)
terra::plot(ord)
Phylogenetic conservation prioritization
Description
Create a ranking of conservation priorities using optimal or probabilistic forward stepwise selection. Prioritization accounts for the occurrence quantities for all lineages present in the site, including terminal taxa and larger clades; the evolutionary branch lengths of these lineages on the phylogeny, which represent their unique evolutionary heritage; the impact that protecting the site would have on these lineages' range-wide protection levels; the compositional complementarity between the site, other high-priority sites, and existing protected areas; the site's initial protection level; the relative cost of protecting the site; and a free parameter "lambda" determining the shape of the conservation benefit function.
Usage
ps_prioritize(
ps,
init = NULL,
cost = NULL,
lambda = 1,
protection = 1,
max_iter = NULL,
method = c("optimal", "probable"),
trans = function(x) replace(x, which(rank(-x) > 25), 0),
n_reps = 100,
n_cores = 1,
summarize = TRUE,
spatial = TRUE,
progress = interactive()
)
Arguments
ps |
phylospatial object. |
init |
Optional numeric vector or spatial object giving the starting protection status of each site across the study area. Values should be between 0 and 1 and represent the existing level of conservation effectiveness in each site. If this argument is not specified, it is assumed that no existing reserves are present. |
cost |
Optional numeric vector or spatial object giving the relative cost of protecting each site. Values should be positive, with greater values indicating higher cost of conserving a site. If this argument is not specified, cost is assumed to be uniform across sites. |
lambda |
Shape parameter for taxon conservation benefit function. This can be any real number. Positive values, such as the default
value |
protection |
Degree of protection of proposed new reserves (number between 0 and 1, with same meaning as |
max_iter |
Integer giving max number of iterations to perform before stopping, i.e. max number of sites to rank. |
method |
Procedure for selecting which site to add to the reserve network at each iteration:
|
trans |
A function that transforms marginal values into relative selection probabilities; only used if |
n_reps |
Number of random repetitions to do; only used if |
n_cores |
Number of compute cores to use for parallel processing; only used if |
summarize |
Logical: should summary statistics across reps (TRUE, default) or the reps themselves (FALSE) be returned? Only relevant
if |
spatial |
Logical: should the function return a spatial object (TRUE, default) or a matrix (FALSE)? |
progress |
Logical: should a progress bar be displayed? |
Details
This function uses the forward stepwise selection algorithm of Kling et al. (2019) to generate a ranked conservation prioritization.
Prioritization begins with the starting protected lands network identified in init
, if provided. At each iteration, the marginal
conservation value of fully protecting each site is calculated, and a site is selected to be added to the reserve network. Selection can
happen either in an "optimal" or "probable" fashion as described under the method
argument. This process is repeated until all sites
are fully protected or until max_iter
has been reached, with sites selected early in the process considered higher conservation
priorities.
The benefit of the probabilistic approach is that it relaxes the potentially unrealistic assumption that protected land will actually be added in the optimal order. Since the algorithm avoids compositional redundancy between high-priority sites, the optimal approach will never place high priority on a site that has high marginal value but is redundant with a slightly higher-value site, whereas the probabilistic approach will select them at similar frequencies (though never in the same randomized run).
Every time a new site is protected as the algorithm progresses, it changes the marginal conservation value of the other sites. Marginal
value is the increase in conservation benefit that would arise from fully protecting a given site, divided by the cost of protecting the
site. This is calculated as a function of the site's current protection level, the quantitative presence probability or abundance of all
terminal taxa and larger clades present in the site, their evolutionary branch lengths on the phylogeny, the impact that protecting the
site would have on their range-wide protection levels, and the free parameter lambda
. lambda
determines the relative importance of
protecting a small portion of every taxon's range, versus fully protecting the ranges of more valuable taxa (those with longer
evolutionary branches and smaller geographic ranges).
Value
Matrix or spatial object containing a ranking of conservation priorities. Lower rank values represent higher
conservation priorities. All sites with a lower priority than max_iter
have a rank value equal to the number
of sites in the input data set (i.e. the lowest possible priority).
- If
method = "optimal"
. the result contains a single variable "priority" containing the ranking.
- If
method = "probable"
andsummarize = TRUE
, the "priority" variable gives the average rank across reps, variables labeled "pctX" give the Xth percentile of the rank distribution for each site, variables labeled "topX" give the proportion of reps in which a site was in the top X highest-priority sites, and variables labeled "treX" give a ratio representing "topX" relative to the null expectation of how often "topX" should occur by chance alone.
- If
method = "probable"
andsummarize = FALSE
, the result contains the full set of
n_rep
solutions, each representing the the ranking, with low values representing higher priorities..
References
Kling, M. M., Mishler, B. D., Thornhill, A. H., Baldwin, B. G., & Ackerly, D. D. (2019). Facets of phylodiversity: evolutionary diversification, divergence and survival as conservation targets. Philosophical Transactions of the Royal Society B, 374(1763), 20170397.
See Also
Examples
# simulate a toy `phylospatial` data set
set.seed(123)
ps <- ps_simulate()
# basic prioritization
p <- ps_prioritize(ps)
# specifying locations of initial protected areas
# (can be binary, or can be continuous values between 0 and 1)
# here we'll create an `init` raster with arbitrary values ranging from 0-1,
# using the reference raster layer that's part of our `phylospatial` object
protected <- terra::setValues(ps$spatial, seq(0, 1, length.out = 400))
cost <- terra::setValues(ps$spatial, rep(seq(100, 20, length.out = 20), 20))
p <- ps_prioritize(ps, init = protected, cost = cost)
# using probabilistic prioritization
p <- ps_prioritize(ps, init = protected, cost = cost,
method = "prob", n_reps = 1000, max_iter = 10)
terra::plot(p$top10)
Null model randomization analysis of alpha diversity metrics
Description
This function compares phylodiversity metrics calculated in ps_diversity to their null distributions computed by randomizing the community matrix or shuffling the tips of the phylogeny, indicating statistical significance under the assumptions of the null model. Various null model algorithms are available for binary, probability, and count data.
Usage
ps_rand(
ps,
metric = c("PD", "PE", "RPE", "CE"),
fun = "quantize",
method = "curveball",
n_rand = 100,
summary = "quantile",
spatial = TRUE,
n_cores = 1,
progress = interactive(),
...
)
Arguments
ps |
|
metric |
Character vector giving one or more diversity metrics to calculate; see ps_diversity
for options. Can also specify |
fun |
Null model function to use. Must be either "tip_shuffle", "nullmodel", "quantize", or an actual function:
|
method |
One of the method options listed under commsim. If |
n_rand |
Integer giving the number of random communities to generate. |
summary |
Character indicating which summary statistic to return. If |
spatial |
Logical: should the function return a spatial object (TRUE, default) or a matrix (FALSE). |
n_cores |
Integer giving the number of compute cores to use for parallel processing. |
progress |
Logical: should a progress bar be displayed? |
... |
Additional arguments passed to quantize, simulate.nullmodel, or custom function
|
Value
A matrix with a row for every row of x
, a column for every metric specified in metric
, and
values for the summary
statistic. Or if spatial = TRUE
, a sf
or SpatRaster
object containing these data.
See Also
Examples
# simulate a `phylospatial` data set and run randomization with default settings
ps <- ps_simulate(data_type = "prob")
rand <- ps_rand(ps)
# using the default `quantize` function, but with alternative arguments
rand <- ps_rand(ps, transform = sqrt, n_strata = 4, priority = "rows")
# using binary data
ps2 <- ps_simulate(data_type = "binary")
rand <- ps_rand(ps2, fun = "nullmodel", method = "r2")
# using abundance data, and demonstrating alternative metric choices
ps3 <- ps_simulate(data_type = "abund")
rand <- ps_rand(ps3, metric = c("ShPD", "SiPD"), fun = "nullmodel", method = "abuswap_c")
rand
Cluster analysis to identify phylogenetic regions
Description
Perform a clustering analysis that categorizes sites into biogeographic regions based on phylogenetic community compositional similarity.
Usage
ps_regions(ps, k = 5, method = "average", endemism = FALSE, normalize = TRUE)
Arguments
ps |
A |
k |
Number of spatial clusters to divide the region into (positive integer). See ps_regions_eval to help choose a value of k by comparing the variance explained by different numbers of regions. |
method |
Clustering method. Options include all methods listed under hclust, and |
endemism |
Logical indicating whether community values should be divided by column totals (taxon range sizes)
to derive endemism. Only used if |
normalize |
Logical indicating whether community values should be divided by row totals (community sums). If |
Value
A raster or matrix with an integer indicating which of the k
regions each site belongs to.
References
Daru, B. H., Elliott, T. L., Park, D. S., & Davies, T. J. (2017). Understanding the processes underpinning patterns of phylogenetic regionalization. Trends in Ecology & Evolution, 32(11), 845-860.
Examples
ps <- ps_simulate()
# using kmeans clustering algorithm
terra::plot(ps_regions(ps, method = "kmeans"))
# to use a hierarchical clustering method, first we have to `ps_add_dissim()`
terra::plot(ps_regions(ps_add_dissim(ps), k = 7, method = "average"))
Evaluate region numbers
Description
This function compares multiple potential values for k
, the number of clusters in to use in ps_regions()
,
to help you decide how well different numbers of regions fit your data set. For each value of k
, it performs
a cluster analysis and calculates the proportion of total variance explained (SSE, the sum of squared pairwise
distances explained). It also calculates second-order metrics of the relationship between k and SSE. While
many data sets have no optimal value of k
and the choice is often highly subjective, these evaluation metrics
can help you identify potential points where the variance explained stops increasing quickly as k
increases.
Usage
ps_regions_eval(ps, k = 1:20, plot = TRUE, ...)
Arguments
ps |
A |
k |
Vector of positive integers giving possible values for |
plot |
Logical indicating whether to print a plot of the results ( |
... |
Further arguments passed to ps_regions. |
Value
The function generates a data frame with the following columns. If plot = FALSE
the data frame is
returned, otherwise the function prints a plot of the latter variables as a function of k
:
"k": The number of clusters.
"sse": The proportion of total variance explained, with variance defined as squared pairwise community phylogenetic dissimilarity between sites.
"curvature": The local second derivative. Lower (more negative) values indicate more attractive break-point values of k.
"dist11": The distance from the point to the 1:1 line on a plot of
k
vssse
in which k values over the interval from 1 to the number of sites are rescaled to the unit interval. Higher values indicate more attractive values for k.
Examples
ps <- ps_add_dissim(ps_simulate())
ps_regions_eval(ps, k = 1:15, plot = TRUE)
Map phylospatial data onto RGB color bands
Description
Perform an ordination that reduces a spatial phylogenetic data set into three dimensions that can be
plotted as the RGB bands of color space to visualize spatial patterns of community phylogenetic composition.
This function is a wrapper around ps_ordinate()
.
Usage
ps_rgb(ps, method = c("nmds", "cmds", "pca"), trans = identity, spatial = TRUE)
Arguments
ps |
A |
method |
Ordination method, either "pca" (principal component analysis implemented via |
trans |
A function giving a transformation to apply to each dimension of the ordinated data.
The default is the identity function. Specifying |
spatial |
Logical indicating whether a spatial object (inherited from |
Value
A matrix or spatial object with three variables containing RGB color values in the range 0-1.
Examples
ps <- ps_add_dissim(moss())
RGB <- ps_rgb(ps, method = "cmds")
terra::plotRGB(RGB * 255, smooth = FALSE)
Simulate a toy spatial phylogenetic data set
Description
This function generates a simple phylospatial
object that can be used for testing other functions
in the package. It is not intended to be realistic.
Usage
ps_simulate(
n_tips = 10,
n_x = 20,
n_y = 20,
data_type = c("probability", "binary", "abundance"),
spatial_type = c("raster", "none"),
seed = NULL
)
Arguments
n_tips |
Number of terminals on phylogeny. |
n_x |
Number of raster cells in x dimension of landscape. |
n_y |
Number of raster cells in y dimension of landscape. |
data_type |
Community data type for simulated ranges: either "probability" (default), "binary", or "abundance". |
spatial_type |
Either "raster" or "none". |
seed |
Optional integer to seed random number generator. |
Value
phylospatial
object, comprising a random phylogeny and community matrix in which each terminal has a
circular geographic range with a random radius and location. The spatial reference data is a SpatRaster.
Examples
# using all the defaults
ps_simulate()
# specifying some arguments
plot(ps_simulate(n_tips = 50, n_x = 30, n_y = 40, data_type = "abundance"), "comm")
Stratified randomization of community matrix
Description
This is a community null model method for quantitative community data (e.g. abundance or occurrence probability).
It is designed to adapt binary null model algorithms for use with quantitative data, which can be useful if
there is not a quantitative-specific algorithm available that has the desired properties. For example, use with
the binary "curveball" algorithm preserves row and column totals, and also approximately preserves the marginal
distributions of rows and columns. For each randomization, the data set is split into strata representing numerical ranges
of the input quantities, a separate binary randomization is done for each stratum, and the results are
combined to produce a randomized, quantitative community matrix. See vegan::commsim()
for details about
other binary and quantitative null models.
Usage
quantize(x, method = "curveball", ...)
Arguments
x |
Community matrix with species in rows, sites in columns, and nonnegative quantities in cells. |
method |
Null model algorithm, passed to |
... |
Additional arguments, including:
|
Value
A randomized version of x
.
Examples
# example quantitative community matrix
comm <- ps_get_comm(moss("polygon"), tips_only = TRUE, spatial = FALSE)[1:50, 1:50]
# examples of different quantize usage
rand <- quantize(comm)
rand <- quantize(comm, n_strata = 4, transform = sqrt, priority = "rows")
rand <- quantize(comm, method = "swap", burnin = 10)
# (note: this `burnin` value is far too small for a real analysis)
Convert a site-by-variable matrix into a SpatRaster or sf object
Description
Convert a site-by-variable matrix into a SpatRaster or sf object
Usage
to_spatial(m, template)
Arguments
m |
Matrix or vector. |
template |
|
Value
SpatRaster
with a layer for every column in m
, or sf
data frame with
a variable for every column in m
, depending on the data type of template
.
Examples
ps <- moss()
to_spatial(ps$comm[, 1:5], ps$spatial)