---
title: "ecoTolerance: A Comprehensive Tutorial"
author:
  - name: "Diego F. Miranda"
    email: "dfernandes115@gmail.com"
    affiliation: "Universidade Federal da Bahia"
date: "`r Sys.Date()`"
output: rmarkdown::html_vignette
vignette: |
  %\VignetteEngine{knitr::rmarkdown}
  %\VignetteIndexEntry{ecoTolerance: A Comprehensive Tutorial}
  %\usepackage{ae}
editor_options:
  markdown:
    wrap: 72
---

```{r setup, include=FALSE}
knitr::opts_chunk$set(
  collapse = TRUE,
  comment = "#>"
)

```

# Introduction

The **ecoTolerance** package is designed to calculate two key indices
for ecological studies involving species occurrences:

1.  **Road Tolerance Index (RTI)**

2.  **Human Footprint Tolerance Index (HFTI)**

These indices enable researchers and conservationists to assess how
tolerant different species are to roads and human footprint,
respectively. The package is particularly useful in biodiversity and
conservation biology studies, and is part of an ongoing PhD thesis
focused on amphibian species in Brazil.

> **Why use ecoTolerance?**
>
> -   Automates data cleaning (removal of duplicates, filtering of
>     spatially nearby points).
>
> -   Integrates spatial data (road networks, human footprint rasters).
>
> -   Provides reproducible and consistent calculations of RTI and HFTI.
>
> -   Offers built-in functions to produce maps and density plots for
>     all species.
>
> -   Supports both internal example data (Brazilian roads + human
>     footprint rasters) and external user-supplied layers (shapefiles &
>     rasters from any region).

# Features of ecoTolerance

-   **Data Preprocessing**\
    The function `process_occurrences()` converts a data frame of raw
    occurrences (columns: `species`, `longitude`, `latitude`) into a
    `sf` object, removes exact duplicates, and optionally filters out
    occurrences within a user-defined buffer to avoid pseudo-replication
    in ecological analyses.

-   **Road Tolerance Index (RTI)**\
    The function `calculate_RTI()` computes RTI for each record based on
    the distance to the nearest road and a reference distance, both in
    km. By default:

    ![](images/hti.png)

    The function returns an `sf` object of occurrences with a new column
    `RTI_value` and a data frame of **median RTI per species**.

-   **Human Footprint Tolerance Index (HFTI)**\
    The function `calculate_HFTI()` calculates HFTI based on the value
    of human footprint in a buffer around each point. By default, HFTI
    is:

    ![](images/hfti.png)

    where **Human Footprint Value~*i*~** is the average human footprint
    raster value within a buffer (e.g. 1 km) of occurrence *i*, and
    `divisor` (default = 50) normalizes those values to the 0-1 range.
    `calculate_HFTI()` returns an `sf` object of occurrences with a new
    column `HFTI_value` and a data frame of **median HFTI per
    species**.

-   **Single-Function Workflow**\
    `compute_indices()` orchestrates all steps in one call:

    1.  Cleans occurrences via `process_occurrences()`.
    2.  Loads roads and human footprint raster (either the toy data or
        the full data for Brazil).
    3.  Calculates RTI with `calculate_RTI()`.
    4.  Calculates HFTI with `calculate_HFTI()`.
    5.  Merges median RTI and HFTI per species into a single data frame.

    It returns a list containing:

    -   `RTI`: data frame of RTI per species.

    -   `HFTI`: data frame of HFTI per species.

    -   `indices`: combined data frame (`species` \| `RTI` \| `HFTI`).

    -   `processed_data`: the final `sf` with columns `RTI_value` and
        `HFTI_value` for each occurrence.

-   **Automatic Reports**\
    The function `generate_all_reports()` takes the output of
    `compute_indices()`, a shapefile of the study area, and an output
    directory. It automatically:

    1.  Creates a file `indices_por_especie.csv` (with columns species,
        RTI and HFTI).

    2.  For each unique species in `processed_data`, generates four PNG
        figures:

        -   Map of RTI (occurrence points colored by `RTI_value`).

        -   Map of HFTI (occurrence points colored by `HFTI_value`).

        -   Density plot of RTI (with a vertical line at the median).

        -   Density plot of HFTI (with a vertical line at the median).

        All files are saved in the user-specified `out_dir`.

-   **Use Internal or External Spatial Layers**\
    By default, `compute_indices()` calls internal helper functions
    `load_roads()` and `load_human_footprint()` to read example layers
    (Brazil roads + Brazil human footprint, or the toy data). You can also
    supply your own shapefiles and rasters by "monkey-patching" those
    two functions in your script, so that no internal code of the
    package needs to be changed.

# Installation

To install the development version from your local repository, you can
use:

``` r
devtools::install("path/to/ecoTolerance")
```

Make sure you have the required packages installed (`sf`, `raster`,
`dplyr`, `stats`).

# 1. Quick Start Example (Toy Data)

By default, **ecoTolerance** runs in "Toy Mode". It uses a small,
lightweight dataset included in the package.

### 1.1 Load the package and Data

``` r
library(ecoTolerance)

# Sample occurrence data (in Bahia, where the Toy dataset is located)
sample_data <- data.frame(
  species   = c("SpA", "SpA", "SpB", "SpB"),
  longitude = c(-38.40, -38.41, -38.42, -38.43), 
  latitude  = c(-12.60, -12.61, -12.62, -12.63)
)
```

### 1.2 Compute Indices (Toy Mode)

The `compute_indices()` function uses `data_type = "toy"` by default.

``` r
result <- compute_indices(
  data = sample_data,
  remove_duplicates = TRUE,
  buffer_km = 1,
  ref_dist = 3.5,
  divisor = 50,
  data_type = "toy"  # Default
)

# Inspect results
head(result$indices)
```

### 1.3 Automatic Reports

To automatically save maps, plots, and CSVs for every species:

``` r
# 1) Shapefile of the study area (Optional, for context in the map)
# For this example, we create a dummy box, but you would load your shapefile.
library(sf)
study_area_sf <- st_as_sf(
    data.frame(id = 1),
    geometry = st_sfc(st_polygon(list(rbind(c(-39, -13), c(-38, -13), c(-38, -12), c(-39, -12), c(-39, -13))))),
    crs = 4326
)

# 2) Output directory
output_dir <- tempdir() # Or "C:/MyResults"

# 3) Generate all maps & plots
generate_all_reports(
  result         = result,
  area_shapefile = study_area_sf, # Can also be a path to a .shp file
  out_dir        = output_dir
)
```

# 2. Using Full Internal Data (Brazil)

To perform a real analysis covering the entire Brazilian territory, you
simply need to change the `data_type` argument to `"full"`.

When you run this for the first time, the package will automatically:

1.  Download the full Brazilian roads shapefile and Human Footprint
    raster from the Zenodo repository (\~60 MB).

2.  Save these files in a persistent cache on your computer
    (the directory given by `tools::R_user_dir()`).

3.  Load them for analysis.

Future runs will load directly from the cache, so you only download
once.

``` r
# Real occurrence data anywhere in Brazil
real_data <- data.frame(
  species   = c("Frog X", "Frog Y"),
  longitude = c(-45.32, -40.10),
  latitude  = c(-15.20, -18.50)
)

# Run with full dataset
# Note: The first time you run this, it will take a few minutes to download.
result_full <- compute_indices(
  data = real_data,
  data_type = "full",  # <--- This triggers the download/cache load
  ref_dist = 3.5
)
```
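
The download-once behaviour described above can be pictured with a small
sketch. This is not the package's internal code: `cached_download()`,
its arguments, and the placeholder URL are hypothetical names used only
for illustration, and the use of `"ecoTolerance"` as the cache name is
an assumption.

``` r
# Hypothetical helper: return the path to a cached copy of a remote file,
# downloading it only when it is not yet present in the cache directory.
cached_download <- function(url, filename,
                            cache_dir = tools::R_user_dir("ecoTolerance",
                                                          which = "cache")) {
  dir.create(cache_dir, recursive = TRUE, showWarnings = FALSE)
  dest <- file.path(cache_dir, filename)
  if (!file.exists(dest)) {
    # First run only: fetch the file (mode = "wb" keeps binary files intact)
    utils::download.file(url, destfile = dest, mode = "wb")
  }
  dest
}

# Demonstration without network access: a file already in the cache is reused
demo_cache <- file.path(tempdir(), "ecoTolerance-cache-demo")
dir.create(demo_cache, showWarnings = FALSE)
writeLines("placeholder", file.path(demo_cache, "roads.txt"))

path <- cached_download(
  url       = "https://example.org/roads.txt",  # placeholder, never contacted here
  filename  = "roads.txt",
  cache_dir = demo_cache
)
file.exists(path)
```

Because the file already exists in `demo_cache`, the download branch is
skipped, which mirrors what happens on every run after the first one.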

# 3. Using External Shapefiles & Rasters

If your study area is outside Brazil, or you have custom high-resolution
data, you can supply your own layers using the "monkey-patch" approach.

``` r
library(ecoTolerance)

# 1) Read your own data
occ_data <- read.csv("path/to/my_occurrences.csv")

# 2) Overwrite load_roads() and load_human_footprint() in your global environment.
# IMPORTANT: They must accept arguments (like 'type'), even if you don't use them.

load_roads <- function(type="toy") {
  # Argument 'type' is ignored here, we just return our custom file
  roads_sf <- sf::st_read("path/to/my_custom_roads.shp", quiet = TRUE)
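  # Reproject unless the layer is already EPSG:4326
  # (isTRUE() also handles a CRS that carries no EPSG code)
  if (!isTRUE(sf::st_crs(roads_sf)$epsg == 4326)) {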
    roads_sf <- sf::st_transform(roads_sf, crs = 4326)
  }
  return(roads_sf)
}

load_human_footprint <- function(type="toy") {
  footprint_raster <- raster::raster("path/to/my_custom_footprint.tif")
  return(footprint_raster)
}

# 3) Now call compute_indices()
# It will use your custom functions instead of the package internals
result_custom <- compute_indices(
  data = occ_data
)
```

# 4. Manual Workflow

If you prefer to run step-by-step instead of using `compute_indices()`,
here is how the package works internally.

#### 4.1 Process Occurrences

Run `process_occurrences()` to remove exact duplicates and optionally
filter out records that lie closer than 1 km to one another:

``` r
processed_data <- process_occurrences(
  data              = sample_data,
  remove_duplicates = TRUE,
  buffer_km         = 1
)
```

-   `remove_duplicates = TRUE` removes identical coordinates for the
    same species.

-   `buffer_km = 1` ensures only one record per species within a 1 km
    radius.

`processed_data` is now an `sf` object in EPSG:4326.
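
To make the two cleaning steps concrete, here is a minimal sketch of
duplicate removal followed by a greedy distance-based thinning, written
with `sf` only. It is an illustration under stated assumptions, not the
implementation inside `process_occurrences()`: the helper
`thin_species()` and the toy coordinates are hypothetical.

``` r
library(sf)

# Toy occurrences: SpA has one exact duplicate and one near neighbour
occ <- data.frame(
  species   = c("SpA", "SpA", "SpA", "SpA", "SpB"),
  longitude = c(-38.400, -38.400, -38.402, -38.500, -38.400),
  latitude  = c(-12.600, -12.600, -12.601, -12.700, -12.600)
)

# 1) Drop exact duplicates (same species and same coordinates)
occ <- unique(occ)

# 2) Convert to an sf object in EPSG:4326
occ_sf <- st_as_sf(occ, coords = c("longitude", "latitude"), crs = 4326)

# 3) Greedy thinning: keep a point only if it is at least `buffer_km`
#    away from every point already kept (hypothetical helper)
thin_species <- function(pts, buffer_km = 1) {
  d_km <- matrix(as.numeric(st_distance(pts)) / 1000, nrow = nrow(pts))
  keep <- integer(0)
  for (i in seq_len(nrow(pts))) {
    if (all(d_km[i, keep] >= buffer_km)) keep <- c(keep, i)
  }
  pts[keep, ]
}

# Apply the thinning separately to each species and recombine
thinned <- do.call(rbind, lapply(unique(occ_sf$species), function(sp) {
  thin_species(occ_sf[occ_sf$species == sp, ], buffer_km = 1)
}))

table(thinned$species)
```

In this toy set, the duplicate SpA record and the SpA record roughly
250 m from the first one are dropped, while the distant SpA record and
the single SpB record are kept. The result of a greedy filter depends on
the order of the rows, so other thinning strategies can keep a different
subset.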

#### 4.2 Calculating RTI and HFTI Directly

If you want to calculate RTI and HFTI in separate steps, you can do:

``` r
# 1) Load internal example layers (or full)
roads <- load_roads(type = "toy")
footprint <- load_human_footprint(type = "toy")

# 2) Calculate RTI
rti_result <- calculate_RTI(
    occurrences = processed_data, 
    roads = roads, 
    ref_dist = 3.5
)

# 3) Calculate HFTI
# Note: we use rti_result$data because it now contains the RTI values
hfti_result <- calculate_HFTI(
    occurrences = rti_result$data, 
    footprint = footprint, 
    divisor = 50, 
    buffer_m = 1000
)

# 4) Inspect the outputs
head(rti_result$RTI)   # median RTI per species
head(hfti_result$HFTI) # median HFTI per species
```
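
The per-species summary that `compute_indices()` returns as `indices`
can be approximated by hand from per-occurrence values. The sketch below
uses a toy data frame in place of the real `sf` output, so the numbers
are illustrative only; the column names `species`, `RTI_value`,
`HFTI_value`, `RTI` and `HFTI` follow the ones described in this
vignette.

``` r
# Toy per-occurrence values, standing in for the data returned by calculate_HFTI()
occ_values <- data.frame(
  species    = c("SpA", "SpA", "SpA", "SpB", "SpB"),
  RTI_value  = c(0.10, 0.30, 0.80, 0.50, 0.70),
  HFTI_value = c(0.20, 0.40, 0.60, 0.10, 0.30)
)

# Median of each index per species
rti_median  <- aggregate(RTI_value ~ species, data = occ_values, FUN = median)
hfti_median <- aggregate(HFTI_value ~ species, data = occ_values, FUN = median)

# Rename to the column names used in the combined table
names(rti_median)[2]  <- "RTI"
names(hfti_median)[2] <- "HFTI"

# One row per species with both indices
indices <- merge(rti_median, hfti_median, by = "species")
indices
```

With real output, `sf::st_drop_geometry()` can be applied first so that
the geometry column does not interfere with the aggregation.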

#### 4.3 Plotting Individual Maps

If you want to visualize the computed indices spatially, you can do:

``` r
library(ggplot2)

# For RTI (hfti_result$data carries both RTI_value and HFTI_value):
ggplot(hfti_result$data) +
  geom_sf(aes(color = RTI_value)) +
  scale_color_viridis_c(name = "RTI") +
  theme_minimal() +
  ggtitle("Road Tolerance Index (RTI) per Occurrence")

# For HFTI:
ggplot(hfti_result$data) +
  geom_sf(aes(color = HFTI_value)) +
  scale_color_viridis_c(name = "HFTI") +
  theme_minimal() +
  ggtitle("Human Footprint Tolerance Index (HFTI) per Occurrence")
```
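
A density plot with a vertical line at the median, similar in spirit to
the ones described for `generate_all_reports()`, can be sketched with
`ggplot2` as well. The values below are made up for illustration, and
the styling is an assumption rather than the package's own plotting
code.

``` r
library(ggplot2)

# Toy per-occurrence RTI values for one species (illustrative numbers only)
rti_df <- data.frame(
  species   = "SpA",
  RTI_value = c(0.12, 0.25, 0.31, 0.40, 0.44, 0.58, 0.73)
)
rti_med <- median(rti_df$RTI_value)

# Density curve with a dashed vertical line at the median
p <- ggplot(rti_df, aes(x = RTI_value)) +
  geom_density(fill = "grey80") +
  geom_vline(xintercept = rti_med, linetype = "dashed") +
  labs(x = "RTI", y = "Density", title = "RTI density for SpA") +
  theme_minimal()

# Save to a PNG, as one might for a per-species report
ggsave(file.path(tempdir(), "SpA_RTI_density.png"), plot = p,
       width = 5, height = 4, dpi = 150)
```

Replacing `rti_df` with the rows of `hfti_result$data` for one species
(and `RTI_value` with `HFTI_value`) gives the equivalent HFTI plot.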

# Advanced Configuration

-   **Buffer distance** (`buffer_km`): adjust according to how
    fine-scaled your occurrence data is. A larger buffer removes points
    that lie too close to each other within that radius.

-   **Reference Distance** (`ref_dist`) for **RTI**: choose a value (in
    km) that makes sense biologically. For instance, amphibians might
    need a larger reference distance (e.g. 3.5 km) than plants (e.g.
    1.5 km).

-   **Divisor for HFTI**: set `divisor` to the maximum possible raster
    value. If your raster ranges from 0 to 100, use `divisor = 100`. The
    default in the package is 50.

-   **CRS Consistency**: `calculate_RTI()` expects both occurrences and
    roads in EPSG:4326 (latitude/longitude). Internally, `st_distance()`
    does a geodetic calculation in meters. `calculate_HFTI()` assumes
    the human footprint raster is in a CRS whose units are meters, such
    as a projected CRS (e.g. UTM). If the raster is in degrees, the
    buffer will be interpreted in degrees (not recommended). The
    vignette examples reproject the roads layer to EPSG:4326 when
    necessary, but ideally your raster should already be in a metric
    projection.
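
Two of the settings above can be checked quickly before running an
analysis. The sketch below first derives a `divisor` from the maximum of
a toy raster, assuming (as the text describes) that HFTI is the buffer
mean divided by `divisor`. It then confirms that `sf::st_distance()`
reports meters for EPSG:4326 input. The raster, the buffer means, and
the geometries are all invented for illustration.

``` r
library(sf)
library(raster)

# --- Divisor: match it to the raster's maximum value ---
# Toy footprint raster with values from 0 to 100 (illustrative only)
hf <- raster(matrix(seq(0, 100, length.out = 25), nrow = 5))
divisor <- maxValue(hf)

hf_mean <- c(10, 50, 100)   # hypothetical buffer means
hf_mean / divisor           # scaled to the 0-1 range

# --- Distances: sf reports meters for EPSG:4326 input ---
pt   <- st_sfc(st_point(c(-38.40, -12.60)), crs = 4326)
road <- st_sfc(
  st_linestring(rbind(c(-38.45, -12.70), c(-38.45, -12.50))),
  crs = 4326
)

d_m  <- st_distance(pt, road)    # units object, in meters
d_km <- as.numeric(d_m) / 1000   # plain numeric, in km
d_km
```

For this toy geometry the point sits about 5.4 km east of the road, so
with `ref_dist = 3.5` it would lie beyond the reference distance.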

# Conclusion

The **ecoTolerance** package streamlines the process of computing road
and human footprint tolerance indices for ecological and conservation
studies. By combining robust data processing with reproducible spatial
calculations, it aids researchers in understanding species' responses to
anthropogenic disturbances.

> **Citation / Reference**
>
> -   Please cite this package or the relevant paper if you use it in
>     your work.
>
> -   For academic citation, you can use:

``` r
citation("ecoTolerance")
```

# Acknowledgments

-   We thank all the collaborators and advisors involved in this
    research.

-   This package is part of an ongoing PhD thesis at the *Programa de
    Pós-Graduação em Ecologia: Teoria, Aplicação e Valores* of the
    Universidade Federal da Bahia.

# References

-   MapBiomas. *Coleção de Dados do Módulo de Infraestrutura.* 2022.
-   Miranda, D. F., Forti, L. R. (in preparation). *ecoTolerance: An R
    Package for Assessing Road and Human Footprint Tolerance in Wildlife
    Species*.
-   Mu, H. et al. *A Global Record of Annual Terrestrial Human Footprint
    Dataset From 2000 to 2018.* Scientific Data, 9(1), 176. 2022.