--- title: "Stationary FFA" output: rmarkdown::html_vignette vignette: > %\VignetteIndexEntry{Stationary FFA} %\VignetteEngine{knitr::rmarkdown} %\VignetteEncoding{UTF-8} --- ```{r, include = FALSE} knitr::opts_chunk$set( collapse = TRUE, comment = "#>" ) ``` This vignette demonstrates how to use the `ffaframework` package to perform flood frequency analysis (model selection, parameter estimation, uncertainty quantification, and model performance assessment) under the assumption of stationarity. The results are expressed in terms of *return periods* and *return levels*. The *return period* of a flood is the expected number of years between annual maximum events of equal or greater magnitude while the *return level* is the magnitude of the annual maximum event, measured in $\text{m}^3/\text{s}$. ## Case Study This vignette will explore data from the Athabasca River at Athabasca (07BE001), an unregulated hydrological monitoring station. A statistical analysis of the data from this station does not reveal any evidence of trends in the mean or variability of the data. Data for this station is provided as `CAN-07BE001.csv` in the `ffaframework` package. ```{r, fig.width = 10, fig.height = 8, fig.align = "center", out.width = "100%"} library(ffaframework) df <- data_local("CAN-07BE001.csv") head(df) plot_ams_data(df$max, df$year, title = "Athabasca River at Athabasca (07BE001)") ``` ## Distribution Selection First, a suitable probability distribution for the data is selected using the L-moments. 1. `select_ldistance` chooses the distribution with theoretical L-skewness ($\tau_3$) and L-kurtosis ($\tau_4$) closest to the sample, based on Euclidean distance. 2. `select_lkurtosis` selects the distribution with the smallest difference between sample and theoretical L-kurtosis (for three-parameter distributions only). 3. `select_zstatistic` uses a fitted 4-parameter Kappa distribution to estimate the sampling distribution of the L-kurtosis and selects the distribution with the smallest z-statistic. ```{r, fig.width = 10, fig.height = 8, fig.align = "center", out.width = "100%"} selection <- select_ldistance(df$max) print(selection$recommendation) plot_lmom_diagram(selection) ``` **Recommendation**: Use the generalized extreme value (GEV) distribution. **Note**: For information about the other distributions, see the `selection$metrics` item. You can find more information about the probability distributions supported by the framework [here](https://rileywheadon.github.io/ffa-framework/articles/probability-distributions.html). ## Parameter Estimation The `ffaframework` package provides three methods for parameter estimation. See [here](https://rileywheadon.github.io/ffa-framework/articles/parameter-estimation.html) for more information. - `fit_lmoments`: L-moments parameter estimation. - `fit_mle`: Maximum likelihood parameter estimation. - `fit_gmle`: Generalized maximum likelihood parameter estimation. ```{r, fig.width = 10, fig.height = 8, fig.align = "center", out.width = "100%"} fit <- fit_lmoments(df$max, "GEV") print(fit$params) plot_sffa_fit(fit) ``` **Conclusion**: The GEV distribution with `location = 1600`, `scale = 616`, and `shape = 0.12` will be used. ## Uncertainty Quantification Once a probability distribution is fitted, return levels (design events) can be readily estimated. However, point estimates alone can be misleading --it is more informative to report confidence intervals to reflect estimation uncertainty. The `uncertainty_bootstrap` function performs uncertainty quantification using the parametric bootstrap method. It requires three arguments: - `data`: A vector of annual maximum series data. - `model`: A three-letter code for a probability distribution (ex. `"GEV"`). - `method`: A parameter estimation method. Must be `"L-moments"`, `"MLE"`, or `"GMLE"`. By default, return levels are computed 2-, 5-, 10-, 20-, 50-, and 100- year return periods. ```{r, fig.width = 10, fig.height = 8, fig.align = "center", out.width = "100%"} uncertainty <- uncertainty_bootstrap(df$max, "GEV", "L-moments") plot_sffa_estimates(fit, ci = uncertainty$ci) ``` **Example Conclusion**: Every 10 years, we can expect a flood of roughly $3{\small,}200\text{m}^3/\text{s}$ or greater. ## Model Assessment Model performance is assessed using `model_assessment`, which reports a collection of assessment statistics about the flood frequency analysis. `plot_sffa_assessment` compares the data (the "empirical quantiles") and the predictions of the parametric model at the plotting positions (the "theoretical quantiles"). The black line represents a perfect 1:1 correspondence between the model and the data. ```{r, fig.width = 10, fig.height = 8, fig.align = "center", out.width = "100%"} assessment <- model_assessment(df$max, "GEV", "L-moments") plot_sffa_assessment(assessment) ``` **Conclusion**: The parametric model generally matches the plotting positions. The model slightly underestimates the data from $3{\small,}500\text{m}^3/\text{s}$ to $5{\small,}000\text{m}^3/\text{s}$.