---
title: "Functional stability metrics"
output: 
  rmarkdown::html_vignette:
    toc: TRUE
    toc_depth: 2 
    fig_caption: TRUE
vignette: >
  %\VignetteIndexEntry{temporal_metrics}
  %\VignetteEncoding{UTF-8}
  %\VignetteEngine{knitr::rmarkdown}
bibliography: references.bib
editor_options: 
  markdown: 
    wrap: 72
---

```{r, echo=FALSE}
knitr::opts_chunk$set(collapse = TRUE, comment = "#>")
options(rmarkdown.html_vignette.check_title = FALSE)
set.seed(123)
```

```{r, setup, include=FALSE}
library(estar)
library(tidyr)
library(dplyr)
library(ggplot2)
library(cowplot)
library(viridis)
library(forcats)
library(kableExtra)
library(vegan)
source("custom_aesthetics.R")

label_intensity <- function(intensity, mu) {
  paste0("Conc = ", intensity, " micro g/L")
}
```

# Overview of the data

We illustrate the use of `estar` functions on a dataset from an
ecotoxicological study about the effects of the chlorpyrifos insecticide
on a community of aquatic macroinvertebrates [@van_den_brink_effects_1996; @wijngaarden_effects_1996]. The insecticide negatively affects freshwater
macroinvertebrates, and, to a lesser degree, zooplankton species. This insecticide was applied at four
different concentrations (0.1, 0.9, 6, and 44 µg/L), with two
replicates per concentration level. Additionally, four replicates were
used as control, undisturbed systems ('baseline' in `estar`
terminology). In response to a pulse disturbance (Glossary in main
text), @van_den_brink_effects_1996 reports decreased and
destabilized community diversity, but no effects on gross primary
production nor respiration. The community is composed of 128 species,
classified into 5 functional groups: herbivores, detriti-herbivores,
carnivores, omnivores, and detritivores.

```{r echo = FALSE, message = FALSE, fig.height = 6, fig.width = 8, fig.cap = "Mean organism count (log-transformed) of the five functional groups over the course of the ecotoxicological experiment. The vertical dashed line marks the time at which the insecticide was applied (week 0), and facets represent the different concentrations (µg/L)."}
(
  aquacomm_resps.logplot <- aquacomm_fgps |>
    tidyr::pivot_longer(
      cols = c(herb, detr_herb, carn, omni, detr),
      names_to = "gp",
      values_to = "abund"
    ) |>
    ## summarize to plot
    dplyr::group_by(time, treat, gp) |>
    dplyr::summarize(abund_logmean = log(mean(abund))) |>
    dplyr::ungroup() |>
    dplyr::mutate(
      gp = forcats::fct_recode(
        factor(gp),
        Herbivore = "herb",
        Detr_Herbivore = "detr_herb",
        Carnivore = "carn",
        Omnivore = "omni",
        Detritivore = "detr"
      )
    ) |>
    ggplot(aes(
      x = time,
      y = abund_logmean,
      group = gp,
      colour = gp
    )) +
    geom_line() +
    geom_point(size = 1) +
    geom_vline(
      aes(xintercept = 0),
      linetype = 2,
      colour = "black"
    ) +
    scale_colour_manual(values = gp_colours_full) +
    facet_wrap(
      ~ treat,
      nrow = 5,
      labeller = labeller(treat = label_intensity)
    ) +
    labs(x = "Time (week)", y = "Mean abundance (log)", colour = "Functional\ngroup") +
    estar:::theme_estar()
)
```

```{r echo=FALSE, message = FALSE, fig.height = 6, fig.width = 8, fig.cap="Organism count (mean ± sd, shaded in grey) of the five functional groups over the course of the ecotoxicological experiment. The vertical dashed line marks the time at which the insecticide was applied (week 0), and facets represent the different concentrations (µg/L)."}
(
  aquacomm_resps.rawplot <- aquacomm_fgps |>
    tidyr::pivot_longer(
      cols = c(herb, detr_herb, carn, omni, detr),
      names_to = "gp",
      values_to = "abund"
    ) |>
    ## summarize to plot
    dplyr::group_by(time, treat, gp) |>
    dplyr::summarize(abund.mean = mean(abund), abund.sd = sd(abund)) |>
    dplyr::ungroup() |>
    dplyr::mutate(
      gp = forcats::fct_recode(
        factor(gp),
        Herbivore = "herb",
        Detr_Herbivore = "detr_herb",
        Carnivore = "carn",
        Omnivore = "omni",
        Detritivore = "detr"
      )
    ) |>
    ggplot(aes(
      x = time,
      y = abund.mean,
      group = gp,
      colour = gp
    )) +
    geom_ribbon(
      aes(ymin = abund.mean - abund.sd, ymax = abund.mean + abund.sd),
      col = "gainsboro",
      fill = "gainsboro"
    ) +
    geom_line() +
    geom_point(size = 1) +
    geom_vline(
      aes(xintercept = 0),
      linetype = 2,
      colour = "black"
    ) +
    scale_colour_manual(values = gp_colours_full) +
    facet_wrap(
      ~ treat,
      nrow = 5,
      labeller = labeller(treat = label_intensity)
    ) +
    labs(x = "Time (week)", y = "Mean abundance", colour = "Functional\ngroup") +
    estar:::theme_estar()
)
```

# Main functions

Below are examples of each function applied to the example dataset. For
the functions that require a user-defined time step at which the system
is considered to have recovered, we use week 28, when all groups seem to
have stabilized their growth curves (Fig. 1).

## Functional stability

### Invariability

Invariability calculated as the inverse of the standard deviation of
the residuals of the linear model that predicts the log response ratio
of the state variable in the disturbed and baseline systems by time. The
baseline system in our example dataset is represented by the control
ditches, to which no pesticide was applied. The time frame is set by
`metric_tf`; the time columns are named by `td_i` and `tb_i`, here both
in the same data frame (`aquacomm_resps`, Fig. 2-a).

```{r}
invariability(
  type = "functional",
  mode = "lm_res",
  response = "lrr",
  metric_tf = c(1, max(aquacomm_resps$time)),
  vd_i = "statvar_db",
  td_i = "time",
  d_data = aquacomm_resps,
  vb_i = "statvar_bl",
  tb_i = "time",
  b_data = aquacomm_resps
)
```
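
To make the calculation concrete, the following base-R sketch reproduces
the idea behind `mode = "lm_res"` with `response = "lrr"` on a small
made-up time series. The data frame `toy_inv` and its values are
hypothetical, the log response ratio is assumed to be
`log(disturbed / baseline)`, and the code illustrates the description
above rather than the internal implementation of `estar`.

```r
# Toy time series (hypothetical values, not taken from aquacomm_resps)
toy_inv <- data.frame(
  time       = c(1, 2, 4, 8, 12, 15, 19, 24),
  statvar_db = c(12, 15, 22, 30, 41, 44, 52, 55), # disturbed system
  statvar_bl = c(50, 52, 49, 55, 53, 51, 56, 54)  # baseline system
)

# Log response ratio of the disturbed relative to the baseline system
# (assumed here to be log(disturbed / baseline))
toy_inv$lrr <- log(toy_inv$statvar_db / toy_inv$statvar_bl)

# Linear model predicting the log response ratio by time
fit_lrr <- lm(lrr ~ time, data = toy_inv)

# Invariability: inverse of the standard deviation of the residuals
inv_lm_res <- 1 / sd(residuals(fit_lrr))
inv_lm_res
```

Larger values indicate smaller fluctuations around the fitted trend.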

Invariability calculated as the inverse of standard deviation of
residuals of the linear model fitted to predict the state variable in
the disturbed system by time (Fig. 2-b).

```{r}
invariability(
  type = "functional",
  mode = "lm_res",
  metric_tf = c(1, max(aquacomm_resps$time)),
  response = "v",
  vd_i = "statvar_db",
  td_i = "time",
  d_data = aquacomm_resps,
  vb_i = "statvar_bl",
  tb_i = "time",
  b_data = aquacomm_resps
)
```

Invariability calculated as the inverse of the coefficient of variation
of the log response ratio of the state variable in the disturbed and
baseline systems (Fig. 2-c).

```{r}
invariability(
  type = "functional",
  response = "lrr",
  mode = "cv",
  metric_tf = c(1, max(aquacomm_resps$time)),
  vd_i = "statvar_db",
  td_i = "time",
  b_data = aquacomm_resps,
  vb_i = "statvar_bl",
  tb_i = "time",
  d_data = aquacomm_resps
)
```

Invariability calculated as the inverse of the coefficient of variation
of a state variable in the disturbed system (Fig. 2-d).

```{r}
invariability(
  type = "functional",
  response = "v",
  mode = "cv",
  metric_tf = c(1, max(aquacomm_resps$time)),
  vd_i = "statvar_db",
  td_i = "time",
  d_data = aquacomm_resps,
  vb_i = "statvar_bl",
  tb_i = "time",
  b_data = aquacomm_resps
)
```
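
The coefficient-of-variation variant can be sketched in a few lines of
base R. The vectors below are hypothetical, and treating the bounds of
the time frame as inclusive is an assumption made for this illustration,
not a statement about how `estar` handles `metric_tf`.

```r
# Toy disturbed time series (hypothetical values)
toy_time <- c(-4, -1, 0.14, 1, 2, 4, 8, 12)
toy_db   <- c(50, 52, 20, 12, 8, 15, 30, 45)

# Keep only the time steps inside the time frame (bounds assumed inclusive)
toy_tf <- c(1, 12)
in_tf  <- toy_time >= toy_tf[1] & toy_time <= toy_tf[2]
v_tf   <- toy_db[in_tf]

# Coefficient of variation and its inverse (the invariability)
cv_v   <- sd(v_tf) / mean(v_tf)
inv_cv <- 1 / cv_v
inv_cv
```

With `response = "lrr"`, the same two steps would be applied to the log
response ratio instead of the raw state variable.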

```{r invar_schem, echo=FALSE, fig.height = 4, fig.cap="Figure 2: Schematic representation of the possible metrics of invariability available in the estar package. 'lrr' refers to the log response ratio of the state variable in the disturbed system compared to the baseline time-series, and 'v' to 'state variable' in the disturbed system. a) and b) demonstrate calculation of invariability based on the standard deviation of the residuals of the fitted linear model, whereby the linear model fit is shown by a blue solid line. c) and d) demonstrate calculation of invariability based on the coefficient of variation", out.width = '100%'}
knitr::include_graphics("figures/invariability.png")
```

### Resistance

Resistance calculated in relation to a baseline time series
(`b = "input"`), as the log response ratio (`res_mode = "lrr"`) of the
state variable in the disturbed and baseline systems at the first time
step following disturbance (`res_time = "defined"`, `res_t = 1`, Fig.
3-a).

```{r}
resistance(
  type = "functional",
  b = "input",
  res_mode = "lrr",
  res_time = "defined",
  res_t = 1,
  vd_i = "statvar_db",
  td_i = "time",
  d_data = aquacomm_resps,
  vb_i = "statvar_bl",
  tb_i = "time",
  b_data = aquacomm_resps
)
```

Resistance calculated in relation to a baseline time series
(`b = "input"`), as the highest (`res_time = "max"`) log response ratio
(`res_mode = "lrr"`) of the state variable in the disturbed and baseline
systems during a given time frame (`res_tf = c(1, 20)`, Fig. 3-b).

```{r}
resistance(
  type = "functional",
  b = "input",
  res_mode = "lrr",
  res_time = "max",
  res_tf = c(1, 20),
  vd_i = "statvar_db",
  td_i = "time",
  d_data = aquacomm_resps,
  vb_i = "statvar_bl",
  tb_i = "time",
  b_data = aquacomm_resps
)
```

Resistance calculated in relation to a baseline time series
(`b = "input"`), as the difference (`res_mode = "diff"`) at the first
time step following disturbance (`res_time = "defined"`, `res_t = 1`,
Fig. 3-c).

```{r}
resistance(
  type = "functional",
  b = "input",
  res_mode = "diff",
  res_time = "defined",
  res_t = 1,
  vd_i = "statvar_db",
  td_i = "time",
  d_data = aquacomm_resps,
  vb_i = "statvar_bl",
  tb_i = "time",
  b_data = aquacomm_resps
)
```

Resistance calculated in relation to a baseline time series
(`b = "input"`), as the highest (`res_time = "max"`) absolute difference
(`res_mode = "diff"`) during a given time frame (`res_tf = c(1, 20)`,
Fig. 3-d).

```{r}
resistance(
  type = "functional",
  b = "input",
  res_mode = "diff",
  res_time = "max",
  res_tf = c(1, 20),
  vd_i = "statvar_db",
  td_i = "time",
  d_data = aquacomm_resps,
  vb_i = "statvar_bl",
  tb_i = "time",
  b_data = aquacomm_resps
)
```
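
The four combinations of `res_mode` and `res_time` shown so far can be
summarized with a short base-R sketch. All values are hypothetical, and
two choices are assumptions of this illustration: the log response ratio
is taken as `log(disturbed / baseline)`, and the "highest" response is
taken as the value with the largest absolute deviation, reported with
its sign.

```r
# Toy disturbed (v_d) and baseline (v_b) series (hypothetical values)
res_time_toy <- c(-4, -1, 0.14, 1, 2, 4, 8, 12)
v_d <- c(50, 52, 20, 12, 8, 15, 30, 45)
v_b <- c(51, 50, 49, 52, 50, 53, 51, 50)

lrr_toy  <- log(v_d / v_b) # log response ratio per time step
diff_toy <- v_d - v_b      # difference per time step

# res_time = "defined": response at one chosen time step (here t = 1)
res_lrr_defined  <- lrr_toy[res_time_toy == 1]
res_diff_defined <- diff_toy[res_time_toy == 1]

# res_time = "max": largest absolute response within the time frame
in_res_tf    <- res_time_toy >= 1 & res_time_toy <= 20
res_lrr_max  <- lrr_toy[in_res_tf][which.max(abs(lrr_toy[in_res_tf]))]
res_diff_max <- diff_toy[in_res_tf][which.max(abs(diff_toy[in_res_tf]))]

c(res_lrr_defined, res_lrr_max, res_diff_defined, res_diff_max)
```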

Resistance calculated in relation to a pre-disturbance baseline
(`b = "d"`), composed of the state variable values in the three time
steps before disturbance (`b_tf = c(-4, 0.14)`), as the highest
(`res_time = "max"`) absolute difference (`res_mode = "diff"`) during a
given time frame (`res_tf = c(1, 20)`, Fig. 3-d).

```{r}
resistance(
  type = "functional",
  b = "d",
  b_tf = c(-4, 0.14),
  res_time = "max",
  res_mode = "diff",
  res_tf = c(1, 20),
  vd_i = "statvar_db",
  td_i = "time",
  d_data = aquacomm_resps
)
```

Resistance calculated in relation to a pre-disturbance baseline
(`b = "d"`), composed of the state variable values in the three time
steps before disturbance (`b_tf = c(-4, 0.14)`), as the log response
ratio (`res_mode = "lrr"`) at a user-defined time step
(`res_time = "defined"`, `res_t = 1`, Fig. 3-a).

```{r}
resistance(
  type = "functional",
  b = "d",
  b_tf = c(-4, 0.14),
  res_mode = "lrr",
  res_time = "defined",
  res_t = 1,
  vd_i = "statvar_db",
  td_i = "time",
  d_data = aquacomm_resps
)
```
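
When the baseline comes from the disturbed time series itself
(`b = "d"`), the reference value has to be built from the observations
inside `b_tf`. The sketch below uses hypothetical values, summarizes the
baseline window by its mean, and treats the window bounds as inclusive;
these are assumptions of the illustration rather than documented
behaviour of `estar`.

```r
# Toy disturbed series with three early observations (hypothetical values)
pre_time <- c(-4, -1, 0.14, 1, 2, 4)
pre_v    <- c(50, 52, 48, 12, 8, 15)

# Baseline: mean of the state variable inside the baseline time frame
pre_b_tf     <- c(-4, 0.14)
in_b_tf      <- pre_time >= pre_b_tf[1] & pre_time <= pre_b_tf[2]
baseline_val <- mean(pre_v[in_b_tf])

# Log response ratio at a defined time step (t = 1)
pre_res_lrr <- log(pre_v[pre_time == 1] / baseline_val)

# Largest absolute difference within the time frame c(1, 20)
in_pre_tf    <- pre_time >= 1 & pre_time <= 20
pre_diff     <- pre_v[in_pre_tf] - baseline_val
pre_res_diff <- pre_diff[which.max(abs(pre_diff))]

c(baseline_val, pre_res_lrr, pre_res_diff)
```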

```{r resistance_schem, fig.height = 4, echo=FALSE, fig.cap="Figure 3: Schematic representation of the possible metrics of resistance available in the estar package. 'lrr' and 'diff' refer, respectively, to the log response ratio and to the difference between the state variable in the disturbed system in relation to the baseline value(s), 'v' to 'state variable', 'td+1', to the first time step after disturbance, and 'tlow', to the time step of the highest response in relation to the baseline.", out.width = '100%'}
knitr::include_graphics("figures/resistance.png")
```

### Extent of recovery

Extent of recovery calculated as the log response ratio
(`response = "lrr"`) of the state variable in the disturbed and baseline
systems (`b = "input"`), at a user-defined time step (`t_rec = 28`, Fig.
4-a).

```{r}
recovery_extent(
  type = "functional",
  response = "lrr",
  b = "input",
  t_rec = 28,
  vd_i = "statvar_db",
  td_i = "time",
  d_data = aquacomm_resps,
  vb_i = "statvar_bl",
  tb_i = "time",
  b_data = aquacomm_resps
) 
```

Extent of recovery calculated as the difference (`response = "diff"`)
between the state variables in a disturbed time-series and the baseline
(`b = "input"`), at the predefined time step (`t_rec = 28`, Fig. 4-b).

```{r}
recovery_extent(
  type = "functional",
  response = "diff",
  b = "input",
  t_rec = 28,
  vd_i = "statvar_db",
  td_i = "time",
  d_data = aquacomm_resps,
  vb_i = "statvar_bl",
  tb_i = "time",
  b_data = aquacomm_resps
)
```
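
Written out, the two responses described above amount to the following,
where $v_d$ and $v_b$ denote the state variable in the disturbed and
baseline systems and $t_{rec}$ the chosen recovery time step. The
symbols are introduced here for illustration only, and the direction of
the ratio and of the difference (disturbed relative to baseline) is an
assumption.

$$
E_{\mathrm{lrr}} = \ln\!\left(\frac{v_d(t_{rec})}{v_b(t_{rec})}\right),
\qquad
E_{\mathrm{diff}} = v_d(t_{rec}) - v_b(t_{rec})
$$

Under these definitions, both quantities equal zero when the disturbed
system matches the baseline at $t_{rec}$.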

Extent of recovery calculated as the log response ratio
(`response = "lrr"`) of the state variable in the disturbed time-series
in relation to a baseline taken from the same time-series (`b = "d"`),
summarized by its mean value (`summ_mode = "mean"`, the default) over
the period `b_tf = c(5, 10)`. The extent of recovery is calculated at
the time step by which we consider all groups to have stabilized
(`t_rec = 28`).

```{r}
recovery_extent(
  type = "functional",
  response = "lrr",
  b = "d",
  summ_mode = "mean",
  b_tf = c(5, 10),
  t_rec = 28,
  vd_i = "statvar_db",
  td_i = "time",
  d_data = aquacomm_resps
)
```

```{r echo=FALSE, fig.height = 4, fig.cap="Figure 4: Schematic representation of the possible metrics of extent of recovery available in the estar package. 'lrr' refers to the log response ratio of the state variable in the disturbed system compared to the baseline time-series, 'diff' to the difference between them, and 'tpost', to a post-disturbance time step, specified by the user.", out.width = '100%'}
knitr::include_graphics("figures/extent_recovery.png")
```

### Rate of recovery

Rate of recovery calculated as the slope of the linear model which
predicts the log response ratio of the state variable in the disturbed
system compared to the baseline (`b = "input"`) by time, over a time
frame (`metric_tf = c(1, 28)`, Fig. 5-a).

```{r}
recovery_rate(
  type = "functional",
  response = "lrr",
  b = "input",
  metric_tf = c(1, 28),
  vd_i = "statvar_db",
  td_i = "time",
  d_data = aquacomm_resps,
  vb_i = "statvar_bl",
  tb_i = "time",
  b_data = aquacomm_resps
)
```

Rate of recovery calculated as the slope of the linear model which
predicts the values of the state variable in the disturbed system
(`response = "v"`) by time, over a time frame (`metric_tf = c(1, 28)`,
Fig. 5-b).

```{r}
recovery_rate(
  type = "functional",
  response = "v",
  b = "input",
  metric_tf = c(1, 28),
  vd_i = "statvar_db",
  td_i = "time",
  d_data = aquacomm_resps,
  vb_i = "statvar_bl",
  tb_i = "time",
  b_data = aquacomm_resps
)
```
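
Both variants reduce to extracting the slope of a linear model fitted
within the time frame. The base-R sketch below uses a hypothetical data
frame (`toy_rate`) in which the disturbed state variable grows by
exactly 2 units per week, so the slope for `response = "v"` is known in
advance. It illustrates the description above and is not taken from the
`estar` source.

```r
# Toy data: disturbed series grows linearly, baseline is constant
toy_rate <- data.frame(
  time       = c(0.14, 1, 2, 4, 8, 12, 19, 28, 40),
  statvar_bl = rep(70, 9)
)
toy_rate$statvar_db <- 10 + 2 * toy_rate$time
toy_rate$lrr <- log(toy_rate$statvar_db / toy_rate$statvar_bl)

# Restrict to the time frame (bounds assumed inclusive)
rate_tf <- c(1, 28)
toy_rate_tf <- toy_rate[
  toy_rate$time >= rate_tf[1] & toy_rate$time <= rate_tf[2],
]

# Slope of the state variable by time (response = "v")
rate_v <- coef(lm(statvar_db ~ time, data = toy_rate_tf))[["time"]]

# Slope of the log response ratio by time (response = "lrr")
rate_lrr <- coef(lm(lrr ~ time, data = toy_rate_tf))[["time"]]

c(rate_v = rate_v, rate_lrr = rate_lrr)
```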

```{r echo=FALSE, fig.height = 4, fig.cap="Figure 5: Schematic representation of the possible metrics of rate of recovery available in the estar package. 'lrr' refers to the log response ratio of the state variable in the disturbed system compared to the baseline time-series, 'v' to 'state variable'. The blue solid line identifies the linear model fitted to predict the response from time (along with its equation) and the double-headed arrow identifies the slope, which is the measure of rate of recovery", out.width = '100%'}
knitr::include_graphics("figures/rate_recovery.png")
```

### Persistence

Proportion of time over which the system persisted as defined by it
being within 1 standard deviation from the mean of the state variable
values in an independent baseline (`b = "input"`) during the same time
period for which persistence is measured
(`metric_tf = c(28, max(aquacomm_resps$time))`, Fig. 6).

```{r}
persistence(
  type = "functional",
  b = "input",
  metric_tf = c(28, max(aquacomm_resps$time)),
  vd_i = "statvar_db",
  td_i = "time",
  d_data = aquacomm_resps,
  vb_i = "statvar_bl",
  tb_i = "time",
  b_data = aquacomm_resps
)
```

Proportion of time over which the system persisted as defined by it
being within 1 standard deviation from the mean of the state variable
values during a pre-disturbance (`b = "d"`) time period
(`b_tf = c(-4, 0.14)`). Persistence is measured for the post-disturbance
time period (`metric_tf = c(28, max(aquacomm_resps$time))`, Fig. 6).

```{r}
persistence(
  type = "functional",
  b = "d",
  b_tf = c(-4, 0.14),
  metric_tf = c(28, max(aquacomm_resps$time)),
  vd_i = "statvar_db",
  td_i = "time",
  d_data = aquacomm_resps
)
```
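
The persistence calculation can be illustrated with a small base-R
sketch. The values are hypothetical, and the proportion is computed here
as the share of sampled time steps inside the limits, which assumes
evenly weighted time steps; `estar` may weight time differently.

```r
# Toy disturbed and baseline series over the same period (hypothetical)
per_v_d <- c(40, 47, 52, 49, 58, 51, 50, 53)
per_v_b <- c(50, 52, 49, 51, 53, 48, 50, 51)

# Limits: baseline mean plus/minus one baseline standard deviation
lim_l <- mean(per_v_b) - sd(per_v_b)
lim_u <- mean(per_v_b) + sd(per_v_b)

# Time steps at which the disturbed system stays inside the limits
inside <- per_v_d >= lim_l & per_v_d <= lim_u

# Persistence: proportion of time steps inside the limits
persist <- mean(inside)
persist
```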

```{r echo=FALSE, out.width = '50%', fig.cap="Figure 6: Schematic representation of the possible metrics of persistence available in the estar package. 'v' refers to 'state variable', 'lim_u' and 'lim_l' are the upper and lower limits defined as +1 and -1 standard deviation of the baseline state variable around the mean of the baseline state variable. 'ta' is the period for which persistence is calculated, and 'tP', the period during which the state variable remained inside the limits."}

knitr::include_graphics("figures/persistence.png")
```

### Overall Ecological Vulnerability

Area under the curve of the log response ratio (`response = "lrr"`) between the state variable in the disturbed and baseline scenarios, from shortly after the disturbance (we have no data for t = 0, the moment the insecticide was applied) until the end of the observation period (`metric_tf = c(0.14, 56)`).
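
As an illustration of the area-under-the-curve idea, the sketch below integrates a toy log response ratio with the trapezoidal rule in base R. The helper `trapezoid_auc()` and the data are hypothetical, and both the integration rule and the use of the signed (rather than absolute) log response ratio are assumptions; `estar` may compute the area differently.

```r
# Hypothetical helper: trapezoidal area under y as a function of x
trapezoid_auc <- function(x, y) {
  sum(diff(x) * (head(y, -1) + tail(y, -1)) / 2)
}

# Toy disturbed and baseline series (hypothetical values)
oev_time <- c(0.14, 1, 2, 4, 8)
oev_v_d  <- c(20, 12, 8, 15, 30)
oev_v_b  <- c(49, 52, 50, 53, 51)

# Log response ratio, assumed to be log(disturbed / baseline)
oev_lrr <- log(oev_v_d / oev_v_b)

# Area under the log response ratio curve over the time frame
oev_toy <- trapezoid_auc(oev_time, oev_lrr)
oev_toy
```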

```{r}
oev(
  type = "functional",
  response = "lrr",
  metric_tf = c(0.14, 56),
  vd_i = "statvar_db",
  td_i = "time",
  d_data = aquacomm_resps,
  vb_i = "statvar_bl",
  tb_i = "time",
  b_data = aquacomm_resps
)
```

```{r echo=FALSE, out.width = '50%', fig.cap="Figure 7: Schematic representation of the calculation of Overall Ecological Vulnerability in estar. 'lrr' refers to the log response ratio of the state variable in the disturbed system compared to the baseline time-series, 'td' is the time step where disturbance happened. The purple shaded area is the area under the curve calculated by the metric."}

knitr::include_graphics("figures/oev.png")
```

## Compositional stability

The functions for functional stability can be used to calculate the stability of community composition from community compositional data. To do so, the functions call the `vegdist` function (from the `vegan` package), which can be parameterized with the `method` and `binary` arguments.

```{r}
# Reformatting the macroinvertebrate community long-format data into 
# community composition data
comm_data <- aquacomm_fgps |>
  dplyr::filter(treat %in% c(0, 44)) |>
  dplyr::group_by(time, treat) |>
  dplyr::summarize(
    dplyr::across(c(herb, detr_herb, carn, omni, detr), mean),
    .groups = "drop"
  )

control_comm <- comm_data |>
  dplyr::filter(treat == 0) |>
  dplyr::select(-treat)
dist_comm <- comm_data |>
  dplyr::filter(treat == 44) |>
  dplyr::select(-treat)
```
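
To show what a dissimilarity between the disturbed and baseline communities looks like at each time step, the base-R sketch below computes the Bray-Curtis dissimilarity (the default `method = "bray"` of `vegan::vegdist`) by hand on toy data. The helper `bray_curtis()` and the toy tables are hypothetical, and pairing rows by time step is an assumption about how the comparison is organized.

```r
# Hypothetical helper: Bray-Curtis dissimilarity between two abundance vectors
bray_curtis <- function(a, b) sum(abs(a - b)) / sum(a + b)

# Toy community tables: one row per time step, one column per group
toy_dist_comm <- data.frame(
  time = c(0.14, 1, 2),
  herb = c(10, 2, 5), detr = c(20, 5, 12), carn = c(5, 1, 3)
)
toy_ctrl_comm <- data.frame(
  time = c(0.14, 1, 2),
  herb = c(10, 11, 9), detr = c(20, 22, 19), carn = c(5, 6, 5)
)

# Dissimilarity between disturbed and control communities per time step
grp_cols <- setdiff(names(toy_dist_comm), "time")
dissim <- vapply(
  seq_len(nrow(toy_dist_comm)),
  function(i) {
    bray_curtis(
      unlist(toy_dist_comm[i, grp_cols]),
      unlist(toy_ctrl_comm[i, grp_cols])
    )
  },
  numeric(1)
)
data.frame(time = toy_dist_comm$time, dissim = dissim)
```

The resulting single-variable time series of dissimilarities is the kind of input on which the stability metrics can then operate.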

It is worth noting that, if the user wants to calculate compositional stability from other metrics, they can simply input it as a single variable time-series, as demonstrated in the section "Functional stability".

### Invariability

```{r eval=FALSE}
invariability(
  type = "compositional",
  metric_tf = c(0.14, 56),
  comm_d = dist_comm,
  comm_b = control_comm,
  comm_t = "time")
```

### Resistance

Maximal resistance (`res_time = "max"`) over a time period defined by the user (`res_tf = c(0.14, 28)`).
```{r}
resistance(type = "compositional",
          res_tf = c(0.14, 28),
          res_time = "max",
          comm_d = dist_comm,
          comm_b = control_comm,
          comm_t = "time")
```

Resistance at a time step chosen by the user (`res_time = "defined"`, `res_t = 28`).
```{r eval=FALSE}
resistance(type = "compositional",
          res_time = "defined",
          res_t = 28,
          comm_d = dist_comm,
          comm_b = control_comm,
          comm_t = "time")
```

### Rate of recovery

```{r}
recovery_rate(type = "compositional",
              metric_tf = c(0.14, 28),
              comm_d = dist_comm,
              comm_b = control_comm,
              comm_t = "time")
```
 
### Extent of recovery

```{r}
recovery_extent(type = "compositional",
          t_rec = 28,
          comm_d = dist_comm,
          comm_b = control_comm,
          comm_t = "time")
```

### Persistence

```{r}
persistence(type = "compositional",
            b = "input",
            metric_tf = c(28, 56),
            comm_d = dist_comm,
            comm_b = control_comm,
            comm_t = "time",
            low_lim = 0.5,
            high_lim = 0.9)
```

### Overall Ecological Vulnerability

```{r}
oev(type = "compositional",
    metric_tf = c(0.14, 56),
    comm_d = dist_comm,
    comm_b = control_comm,
    comm_t = "time")
```

# Performance

We benchmarked the execution time of the different ways of calculating the temporal metrics. The number attached to the function name in "Function calls" identifies the order in which the function call appears in the vignette. The code used for the analysis is available in `performance_analysis.R`.

Although execution times differ considerably between calls, none exceeded 0.01 seconds for our example.
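
For readers who want to time a call themselves, a minimal base-R sketch is given below. It is not the code in `performance_analysis.R`, which may rely on a dedicated benchmarking package; `toy_metric()` is a hypothetical stand-in for any of the function calls shown in this vignette.

```r
# Hypothetical stand-in for a metric call (e.g. a call to invariability())
toy_metric <- function() {
  x <- seq_len(50)
  y <- log(x + 1)
  1 / sd(residuals(lm(y ~ x)))
}

# Time repeated evaluations with base R's system.time()
n_runs <- 100
elapsed <- vapply(
  seq_len(n_runs),
  function(i) system.time(toy_metric())[["elapsed"]],
  numeric(1)
)

# Distribution of elapsed times in seconds
summary(elapsed)
```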

```{r echo=FALSE}
load("functional_performance.rda")
```

## Invariability

```{r}
df_list$inv_benchmark %>%
  dplyr::select(-neval) %>% 
  kableExtra::kbl() %>%
  kable_styling(bootstrap_options = c("striped", "hover", "condensed", "responsive"))
```

![Density distribution of execution times of 100 runs of each of the different options of the `invariability()` function demonstrated in the vignette.](figures/inv_benchmark.png){width=100%}

## Resistance

```{r}
df_list$resis_benchmark %>%
  dplyr::select(-neval) %>% 
  kableExtra::kbl() %>%
  kable_styling(bootstrap_options = c("striped", "hover", "condensed", "responsive"))
```

![Density distribution of execution times of 100 runs of each of the different options of the `resistance()` function demonstrated in the vignette.](figures/resis_benchmark.png){width=100%}

## Recovery rate

```{r}
df_list$rate_benchmark %>%
  dplyr::select(-neval) %>% 
  kableExtra::kbl() %>%
  kable_styling(bootstrap_options = c("striped", "hover", "condensed", "responsive"))
```

![Density distribution of execution times of 100 runs of each of the different options of the `recovery_rate()` function demonstrated in the vignette.](figures/rate_benchmark.png){width=100%}

## Recovery extent

```{r}
df_list$extent %>%
  dplyr::select(-neval) %>% 
  kableExtra::kbl() %>%
  kable_styling(bootstrap_options = c("striped", "hover", "condensed", "responsive"))
```

![Density distribution of execution times of 100 runs of each of the different options of the `recovery_extent()` function demonstrated in the vignette.](figures/extent_benchmark.png){width=100%}

## Persistence

```{r}
df_list$persist_benchmark %>%
  dplyr::select(-neval) %>% 
  kableExtra::kbl() %>%
  kable_styling(bootstrap_options = c("striped", "hover", "condensed", "responsive"))
```

![Density distribution of execution times of 100 runs of each of the different options of the `persistence()` function demonstrated in the vignette.](figures/persist_benchmark.png){width=100%}

## Overall Ecological Vulnerability

```{r}
df_list$oev_benchmark %>%
  dplyr::select(-neval) %>% 
  kableExtra::kbl() %>%
  kable_styling(bootstrap_options = c("striped", "hover", "condensed", "responsive"))
```

![Density distribution of execution times of 100 runs of each of the different options of the `oev()` function demonstrated in the vignette.](figures/oev_benchmark.png){width=100%}

------------------------------------------------------------------------

# References