---
title: "ecoTolerance: A Comprehensive Tutorial"
author:
  - name: "Diego F. Miranda"
    email: "dfernandes115@gmail.com"
    affiliation: "Universidade Federal da Bahia"
date: "`r Sys.Date()`"
output: rmarkdown::html_vignette
vignette: |
  %\VignetteEngine{knitr::rmarkdown}
  %\VignetteIndexEntry{ecoTolerance: A Comprehensive Tutorial}
  %\usepackage{ae}
editor_options:
  markdown:
    wrap: 72
---

```{r setup, include=FALSE}
knitr::opts_chunk$set(
  collapse = TRUE,
  comment = "#>"
)
```

# Introduction

The **ecoTolerance** package is designed to calculate two key indices
for ecological studies involving species occurrences:

1.  **Road Tolerance Index (RTI)**
2.  **Human Footprint Tolerance Index (HFTI)**

These indices enable researchers and conservationists to assess how
tolerant different species are to roads and human footprint,
respectively. The package is particularly useful in biodiversity and
conservation biology studies, and is part of an ongoing PhD thesis
focused on amphibian species in Brazil.

> **Why use ecoTolerance?**
>
> -   Automates data cleaning (removal of duplicates, filtering of
>     spatially nearby points).
>
> -   Integrates spatial data (road networks, human footprint rasters).
>
> -   Provides reproducible and consistent calculations of RTI and HFTI.
>
> -   Offers built-in functions to produce maps and density plots for
>     all species.
>
> -   Supports both internal example data (Brazilian roads + human
>     footprint rasters) and external user-supplied layers (shapefiles &
>     rasters from any region).

# Features of ecoTolerance

- **Data Preprocessing**\
  The function `process_occurrences()` converts a data frame of raw
  occurrences (columns: `species`, `longitude`, `latitude`) into an
  `sf` object, removes exact duplicates, and optionally filters out
  occurrences within a user-defined buffer to avoid pseudo-replication
  in ecological analyses.

- **Road Tolerance Index (RTI)**\
  The function `calculate_RTI()` computes RTI for each record based on
  the distance to the nearest road and a reference distance, both in
  km. By default:

  ![](images/hti.png)

  The function returns an `sf` object of occurrences with a new column
  `RTI_value` and a data frame of **median RTI per species**.

- **Human Footprint Tolerance Index (HFTI)**\
  The function `calculate_HFTI()` calculates HFTI based on the value of
  human footprint in a buffer around each point. By default, HFTI is:

  ![](images/hfti.png)

  where **Human Footprint Value~*i*~** is the average human footprint
  raster value within a buffer (e.g. 1 km) of occurrence *i*, and
  `divisor` (default = 50) normalizes those values to the range 0 to 1.
  `calculate_HFTI()` returns an `sf` object of occurrences with a new
  column `HFTI_value` and a data frame of **median HFTI per species**.

- **Single-Function Workflow**\
  `compute_indices()` orchestrates all steps in one call:

  1. Cleans occurrences via `process_occurrences()`.
  2. Loads the roads layer and the human footprint raster (either the
     toy data or the full data for Brazil).
  3. Calculates RTI with `calculate_RTI()`.
  4. Calculates HFTI with `calculate_HFTI()`.
  5. Merges median RTI and HFTI per species into a single data frame.

  It returns a list containing:

  - `RTI`: data frame of RTI per species.
  - `HFTI`: data frame of HFTI per species.
  - `indices`: combined data frame (`species` \| `RTI` \| `HFTI`).
  - `processed_data`: the final `sf` object with columns `RTI_value`
    and `HFTI_value` for each occurrence.

- **Automatic Reports**\
  The function `generate_all_reports()` takes the output of
  `compute_indices()`, a shapefile of the study area, and an output
  directory. It automatically:

  1. Creates a file `indices_por_especie.csv` (with columns `species`,
     `RTI` and `HFTI`).
  2. For each unique species in `processed_data`, generates four PNG
     figures:
     - Map of RTI (occurrence points colored by `RTI_value`).
     - Map of HFTI (occurrence points colored by `HFTI_value`).
     - Density plot of RTI (with a vertical line at the median).
     - Density plot of HFTI (with a vertical line at the median).

  All files are saved in the user-specified `out_dir`.

- **Use Internal or External Spatial Layers**\
  By default, `compute_indices()` calls the internal helper functions
  `load_roads()` and `load_human_footprint()` to read example layers
  (Brazil roads + Brazil human footprint, or toy data). You can also
  supply your own shapefiles and rasters by "monkey-patching" those two
  functions in your script, so that no internal code of the package
  needs to be changed.

# Installation

To install the development version from your local repository, you can
use:

``` r
devtools::install("path/to/ecoTolerance")
```

Make sure you have the required packages installed (`sf`, `raster`,
`dplyr`, `stats`).

# 1. Quick Start Example (Toy Data)

By default, **ecoTolerance** runs in "Toy Mode". It uses a small,
lightweight dataset included in the package.

### 1.1 Load the Package and Data

``` r
library(ecoTolerance)

# Sample occurrence data (in Bahia, where the Toy dataset is located)
sample_data <- data.frame(
  species   = c("SpA", "SpA", "SpB", "SpB"),
  longitude = c(-38.40, -38.41, -38.42, -38.43),
  latitude  = c(-12.60, -12.61, -12.62, -12.63)
)
```

### 1.2 Compute Indices (Toy Mode)

The `compute_indices()` function uses `data_type = "toy"` by default.

``` r
result <- compute_indices(
  data = sample_data,
  remove_duplicates = TRUE,
  buffer_km = 1,
  ref_dist = 3.5,
  divisor = 50,
  data_type = "toy" # Default
)

# Inspect results
head(result$indices)
```

### 1.3 Automatic Reports

To automatically save maps, plots, and CSVs for every species:

``` r
# 1) Shapefile of the study area (Optional, for context in the map)
# For this example, we create a dummy box, but you would load your shapefile.
library(sf)
study_area_sf <- st_as_sf(
  data.frame(id = 1),
  geometry = st_sfc(st_polygon(list(rbind(
    c(-39, -13), c(-38, -13), c(-38, -12), c(-39, -12), c(-39, -13)
  )))),
  crs = 4326
)

# 2) Output directory
output_dir <- tempdir() # Or "C:/MyResults"

# 3) Generate all maps & plots
generate_all_reports(
  result = result,
  area_shapefile = study_area_sf, # Can also be a path to a .shp file
  out_dir = output_dir
)
```

### 2. Using Full Internal Data (Brazil)

To run a real analysis covering the entire Brazilian territory, change
the `data_type` argument to `"full"`.

The first time you run this, the package will automatically:

1.  Download the full Brazilian roads shapefile and Human Footprint
    raster from the Zenodo repository (\~60 MB).
2.  Save these files in a persistent cache on your computer
    (`tools::R_user_dir`).
3.  Load them for analysis.

Later runs load directly from the cache, so the download happens only
once.

``` r
# Real occurrence data anywhere in Brazil
real_data <- data.frame(
  species   = c("Frog X", "Frog Y"),
  longitude = c(-45.32, -40.10),
  latitude  = c(-15.20, -18.50)
)

# Run with full dataset
# Note: The first time you run this, it will take a few minutes to download.
result_full <- compute_indices(
  data = real_data,
  data_type = "full", # <--- This triggers the download/cache load
  ref_dist = 3.5
)
```

### 3. Using External Shapefiles & Rasters

If your study area is outside Brazil, or you have custom
high-resolution data, you can supply your own layers using the
"monkey-patch" approach.

``` r
library(ecoTolerance)

# 1) Read your own data
occ_data <- read.csv("path/to/my_occurrences.csv")

# 2) Overwrite load_roads() and load_human_footprint() in your global environment.
# IMPORTANT: They must accept arguments (like 'type'), even if you don't use them.
load_roads <- function(type = "toy") {
  # Argument 'type' is ignored here; we just return our custom file
  roads_sf <- sf::st_read("path/to/my_custom_roads.shp", quiet = TRUE)
  # Compare CRS objects directly, so a layer whose CRS has no EPSG code
  # (epsg is NA) does not break the condition
  if (sf::st_crs(roads_sf) != sf::st_crs(4326)) {
    roads_sf <- sf::st_transform(roads_sf, crs = 4326)
  }
  return(roads_sf)
}

load_human_footprint <- function(type = "toy") {
  footprint_raster <- raster::raster("path/to/my_custom_footprint.tif")
  return(footprint_raster)
}

# 3) Now call compute_indices()
# It will use your custom functions instead of the package internals
result_custom <- compute_indices(
  data = occ_data
)
```

### 4. Manual Workflow

If you prefer to run the analysis step by step instead of using
`compute_indices()`, here is how the package works internally.

#### 4.1 Process Occurrences

Run `process_occurrences()` to remove exact duplicates and optionally
filter out records closer than 1 km:

``` r
processed_data <- process_occurrences(
  data = sample_data,
  remove_duplicates = TRUE,
  buffer_km = 1
)
```

-   `remove_duplicates = TRUE` removes identical coordinates for the
    same species.

-   `buffer_km = 1` ensures only one record per species within a 1 km
    radius.

`processed_data` is now an `sf` object in EPSG:4326.

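To make the two cleaning steps above concrete, here is a minimal sketch of duplicate removal followed by distance-based thinning. It is an illustration only and not the code inside `process_occurrences()`: the helpers `haversine_km()` and `thin_occurrences()` are hypothetical names, the greedy "keep the first record, drop later ones that fall inside the buffer" rule is an assumption, and distances are computed with the haversine formula in base R rather than with `sf`.

``` r
# Great-circle (haversine) distance in km from one point to a set of points.
haversine_km <- function(lon1, lat1, lon2, lat2, radius_km = 6371) {
  to_rad <- pi / 180
  dlat <- (lat2 - lat1) * to_rad
  dlon <- (lon2 - lon1) * to_rad
  a <- sin(dlat / 2)^2 +
    cos(lat1 * to_rad) * cos(lat2 * to_rad) * sin(dlon / 2)^2
  2 * radius_km * asin(pmin(1, sqrt(a)))
}

# Hypothetical helper (not part of ecoTolerance): drop exact duplicates,
# then keep a record only if no already-kept record of the same species
# lies closer than buffer_km.
thin_occurrences <- function(data, buffer_km = 1) {
  data <- unique(data[, c("species", "longitude", "latitude")])
  keep <- logical(nrow(data))
  for (i in seq_len(nrow(data))) {
    same <- keep & data$species == data$species[i]
    if (!any(same)) {
      keep[i] <- TRUE
      next
    }
    d <- haversine_km(
      data$longitude[i], data$latitude[i],
      data$longitude[same], data$latitude[same]
    )
    keep[i] <- all(d >= buffer_km)
  }
  data[keep, , drop = FALSE]
}

# Made-up demo records: one exact duplicate and one point ~0.1 km away
demo <- data.frame(
  species   = c("SpA", "SpA", "SpA", "SpB"),
  longitude = c(-38.400, -38.400, -38.401, -38.400),
  latitude  = c(-12.600, -12.600, -12.600, -12.600)
)
thinned <- thin_occurrences(demo, buffer_km = 1)
```

In this sketch the result depends on row order, because the first record in each cluster is the one that survives. Thinning is applied per species, so the `SpB` record is kept even though it shares coordinates with an `SpA` record.
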
#### 4.2 Calculating RTI and HFTI Directly

If you want to calculate RTI and HFTI in separate steps, you can do:

``` r
# 1) Load internal example layers (or full)
roads <- load_roads(type = "toy")
footprint <- load_human_footprint(type = "toy")

# 2) Calculate RTI
rti_result <- calculate_RTI(
  occurrences = processed_data,
  roads = roads,
  ref_dist = 3.5
)

# 3) Calculate HFTI
# Note: we use rti_result$data because it now contains the RTI values
hfti_result <- calculate_HFTI(
  occurrences = rti_result$data,
  footprint = footprint,
  divisor = 50,
  buffer_m = 1000
)

# 4) Inspect the outputs
head(rti_result$RTI)   # median RTI per species
head(hfti_result$HFTI) # median HFTI per species
```

#### 4.3 Plotting Individual Maps

If you want to visualize the computed indices spatially, you can do:

``` r
library(ggplot2)

# For RTI:
ggplot(hfti_result$data) +
  geom_sf(aes(color = RTI_value)) +
  scale_color_viridis_c(name = "RTI") +
  theme_minimal() +
  ggtitle("Road Tolerance Index (RTI) per Occurrence")

# For HFTI:
ggplot(hfti_result$data) +
  geom_sf(aes(color = HFTI_value)) +
  scale_color_viridis_c(name = "HFTI") +
  theme_minimal() +
  ggtitle("Human Footprint Tolerance Index (HFTI) per Occurrence")
```

## Advanced Configuration

- **Buffer distance** (`buffer_km`): adjust according to how
  fine-scaled your occurrence data is. A larger buffer removes more of
  the points that lie close to each other within that radius.

- **Reference Distance** (`ref_dist`) for **RTI**: choose a value (in
  km) that makes sense biologically. For instance, amphibians might
  need a larger reference distance (e.g. 3.5 km) than plants (e.g.
  1.5 km).

- **Divisor for HFTI**: set `divisor` to the maximum possible raster
  value. If your raster ranges from 0 to 100, use `divisor = 100`. The
  default in the package is 50.

- **CRS Consistency**: `calculate_RTI()` expects both occurrences and
  roads in EPSG:4326 (latitude/longitude). Internally, `st_distance()`
  does a geodetic calculation in meters.
  `calculate_HFTI()` assumes the human footprint raster is in a
  projected CRS (e.g. UTM), or at least in a CRS whose units are
  meters. If the raster is in degrees, the buffer will be interpreted
  in degrees (not recommended). The vignette examples always reproject
  to EPSG:4326 if necessary, but ideally your raster should already be
  in a metric projection.

# Conclusion

The **ecoTolerance** package streamlines the process of computing road
and human footprint tolerance indices for ecological and conservation
studies. By combining robust data processing with reproducible spatial
calculations, it helps researchers understand species' responses to
anthropogenic disturbances.

> **Citation / Reference**
>
> -   Please cite this package or the relevant paper if you use it in
>     your work.
>
> -   For academic citation, you can use:

``` r
citation("ecoTolerance")
```

# Acknowledgments

-   We thank all the collaborators and advisors involved in this
    research.

-   This package is part of an ongoing PhD thesis at the *Programa de
    Pós-Graduação em Ecologia: Teoria, Aplicação e Valores* of the
    Universidade Federal da Bahia.

# References

-   MapBiomas. *Coleção de Dados do Módulo de Infraestrutura.* 2022.

-   Miranda, D. F., Forti, L. R. (in preparation). *ecoTolerance: An R
    Package for Assessing Road and Human Footprint Tolerance in
    Wildlife Species*.

-   Mu, H. et al. *A Global Record of Annual Terrestrial Human
    Footprint Dataset From 2000 to 2018.* Scientific Data, 9(1), 176.