Type: | Package |
Title: | Handle the FORCIS Foraminifera Database |
Version: | 1.0.1 |
Description: | Provides an interface to the 'FORCIS' database (Chaabane et al. (2024) <doi:10.5281/zenodo.7390791>) on global foraminifera distribution. This package allows users to download and handle 'FORCIS' data. It is part of the FRB-CESAB working group FORCIS. https://www.fondationbiodiversite.fr/en/the-frb-in-action/programs-and-projects/le-cesab/forcis/. |
URL: | https://docs.ropensci.org/forcis/, https://github.com/ropensci/forcis |
BugReports: | https://github.com/ropensci/forcis/issues |
License: | GPL-2 | GPL-3 [expanded from: GPL (≥ 2)] |
Encoding: | UTF-8 |
RoxygenNote: | 7.3.2 |
VignetteBuilder: | knitr |
Depends: | R (≥ 4.1.0) |
Imports: | ggplot2, httr2, rlang, sf, tibble, tidyr, utils, vroom |
Suggests: | fs, dplyr, httptest2, knitr, rmarkdown, testthat (≥ 3.0.0), vdiffr, withr |
Config/testthat/edition: | 3 |
NeedsCompilation: | no |
Packaged: | 2025-05-20 09:26:48 UTC; nicolas |
Author: | Nicolas Casajus |
Maintainer: | Nicolas Casajus <nicolas.casajus@fondationbiodiversite.fr> |
Repository: | CRAN |
Date/Publication: | 2025-05-23 12:02:02 UTC |
forcis: Handle the FORCIS Foraminifera Database
Description
Provides an interface to the 'FORCIS' database (Chaabane et al. (2024) doi:10.5281/zenodo.7390791) on global foraminifera distribution. This package allows users to download and handle 'FORCIS' data. It is part of the FRB-CESAB working group FORCIS. https://www.fondationbiodiversite.fr/en/the-frb-in-action/programs-and-projects/le-cesab/forcis/.
Author(s)
Maintainer: Nicolas Casajus nicolas.casajus@fondationbiodiversite.fr (ORCID) [copyright holder]
Authors:
Mattia Greco mattia_greco@outlook.com (ORCID)
Sonia Chaabane sonia.chaabane@gmail.com (ORCID)
Xavier Giraud giraud@cerege.fr (ORCID)
Thibault de Garidel-Thoron garidel@cerege.fr (ORCID)
Other contributors:
Khalil Hammami khalil.hammami@enetcom.usf.tn [contributor]
Air Forbes amforbes@ualberta.ca (ORCID) [reviewer]
FRB-CESAB [funder]
See Also
Useful links:
Report bugs at https://github.com/ropensci/forcis/issues
Compute count conversions
Description
Functions to convert species counts between different formats: raw abundance, relative abundance, and number concentration, using counts metadata.
Usage
compute_abundances(data, aggregate = TRUE)
compute_concentrations(data, aggregate = TRUE)
compute_frequencies(data, aggregate = TRUE)
Arguments
data |
a |
aggregate |
a |
Details
- compute_concentrations() converts all counts to number concentrations (n specimens/m³).
- compute_frequencies() converts all counts to relative abundances (% specimens per sampling unit).
- compute_abundances() converts all counts to raw abundances (n specimens/sampling unit).
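The relationship between the three count formats can be illustrated with a toy calculation in base R. This is only a sketch: the species names, the counts and the filtered volume `volume_m3` are made-up values, and the package's own functions derive the conversions from the counts metadata rather than from a single hand-supplied volume.

```r
# Toy raw abundances for one hypothetical sample (not FORCIS data)
raw_ab <- c(sp_a = 30, sp_b = 50, sp_c = 20)  # n specimens per sampling unit

# Assumed volume of water filtered for this sample (m^3)
volume_m3 <- 4

# Relative abundance: share of each taxon in the sample, in percent
rel_ab <- 100 * raw_ab / sum(raw_ab)

# Number concentration: specimens per cubic metre
n_conc <- raw_ab / volume_m3

rel_ab
n_conc
```

Going the other way (for instance from relative abundance back to raw abundance) requires the total number of specimens counted, which is why the conversions depend on the counts metadata.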
Value
A tibble in long format with two additional columns: taxa, the taxon name, and counts_*, the number concentration (counts_n_conc), the relative abundance (counts_rel_ab), or the raw abundance (counts_raw_ab).
Examples
# Import example dataset ----
file_name <- system.file(file.path("extdata", "FORCIS_net_sample.csv"),
package = "forcis")
net_data <- read.csv(file_name)
# Select a taxonomy ----
net_data <- select_taxonomy(net_data, taxonomy = "VT")
# Dimensions of the data.frame ----
dim(net_data)
# Compute concentration ----
net_data_conc <- compute_concentrations(net_data)
# Dimensions of the data.frame ----
dim(net_data_conc)
Reshape and simplify FORCIS data
Description
Reshapes FORCIS data by pivoting species columns into two columns: taxa (taxon names) and counts (taxon abundances). It converts a wide data.frame to a long format.
Usage
convert_to_long_format(data)
Arguments
data |
a |
Value
A tibble reshaped in a long format.
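The wide-to-long reshaping described above can be sketched in base R on a toy table. The column names `sample_id`, `sp_a` and `sp_b` are hypothetical; only the output column names `taxa` and `counts` follow the description of this function, and the package's own implementation may differ.

```r
# Toy wide table: one row per sample, one column per taxon
wide <- data.frame(
  sample_id = c("s1", "s2"),
  sp_a      = c(3, 0),
  sp_b      = c(1, 5)
)

# Columns holding taxon counts (hypothetical names)
taxa_cols <- c("sp_a", "sp_b")

# Long table: one row per sample x taxon combination
long <- data.frame(
  sample_id = rep(wide$sample_id, times = length(taxa_cols)),
  taxa      = rep(taxa_cols, each = nrow(wide)),
  counts    = unlist(wide[taxa_cols], use.names = FALSE)
)

long
```

The long table has as many rows as samples multiplied by taxa, which explains the change in dimensions reported by dim() before and after reshaping.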
Examples
# Import example dataset ----
file_name <- system.file(file.path("extdata", "FORCIS_net_sample.csv"),
package = "forcis")
net_data <- read.csv(file_name)
# Dimensions of the data.frame ----
dim(net_data)
# Reshape data ----
net_data <- convert_to_long_format(net_data)
# Dimensions of the data.frame ----
dim(net_data)
# Column names ----
colnames(net_data)
Convert a data frame into an sf object
Description
This function can be used to convert a data.frame into an sf object.
Note that coordinates (columns site_lon_start_decimal and site_lat_start_decimal) are projected in the Robinson coordinate system.
Usage
data_to_sf(data)
Arguments
data |
a |
Value
An sf POINTS object.
Examples
# Attach package ----
library("ggplot2")
# Import example dataset ----
file_name <- system.file(file.path("extdata", "FORCIS_net_sample.csv"),
package = "forcis")
net_data <- read.csv(file_name)
# Dimensions of the data.frame ----
dim(net_data)
# Filter by years ----
net_data_sub <- filter_by_year(net_data, years = 1992)
# Convert to an sf object ----
net_data_sub_sf <- data_to_sf(net_data_sub)
# World basemap ----
ggplot() +
geom_basemap() +
geom_sf(data = net_data_sub_sf)
Download the FORCIS database
Description
Downloads the entire FORCIS database as a collection of five csv files from Zenodo (https://zenodo.org/doi/10.5281/zenodo.7390791). Additional files will also be downloaded.
Usage
download_forcis_db(
path,
version = options()$forcis_version,
check_for_update = options()$forcis_check_for_update,
overwrite = FALSE,
timeout = 60
)
Arguments
path |
a |
version |
a |
check_for_update |
a |
overwrite |
a |
timeout |
an |
Details
The FORCIS database is regularly updated. The global structure of the tables doesn’t change between versions, but bugs can be fixed and new records can be added. It is therefore recommended to use the latest version of the database. The package is designed to handle the versioning of the database on Zenodo and will inform users if a new version is available each time they use one of the read_*_data() functions.
For more information, please read the vignette available at https://docs.ropensci.org/forcis/articles/database-versions.html.
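The Usage section shows defaults of the form options()$forcis_version. In R, such a default is evaluated when the function is called, so changing the option changes the default for later calls. The sketch below illustrates this mechanism with a hypothetical function `show_version()` and a hypothetical option `demo_forcis_version`; it does not use the package's real options, whose accepted values are not listed here.

```r
# Hypothetical function mimicking the default-argument pattern shown in Usage
show_version <- function(version = options()$demo_forcis_version) {
  if (is.null(version)) "no version set" else version
}

# Start from an unset option and keep the previous state for restoration
old <- options(demo_forcis_version = NULL)
res_unset <- show_version()

# Setting the option changes the default used by later calls
options(demo_forcis_version = "example-label")
res_set <- show_version()

# Restore the previous state
options(old)

res_unset
res_set
```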
Value
No return value. The FORCIS files will be saved in the path folder.
References
Chaabane S, De Garidel-Thoron T, Giraud X, et al. (2023) The FORCIS database: A global census of planktonic Foraminifera from ocean waters. Scientific Data, 10, 354. doi:10.1038/s41597-023-02264-2.
See Also
read_plankton_nets_data() to import the FORCIS database.
Examples
# Folder in which the database will be saved ----
# N.B. In this example we use a temporary folder but you should select an
# existing folder (for instance "data/").
path <- tempdir()
# Download the database ----
download_forcis_db(path, timeout = 300)
# Check the content of the folder ----
list.files(path, recursive = TRUE)
Filter FORCIS data by a spatial bounding box
Description
Filters FORCIS data by a spatial bounding box.
Usage
filter_by_bbox(data, bbox)
Arguments
data |
a |
bbox |
an object of class |
Value
A tibble containing a subset of data for the desired bounding box.
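The idea of a bounding-box filter can be sketched in base R on toy coordinates. The column names site_lon_start_decimal and site_lat_start_decimal are those mentioned in the data_to_sf() documentation; the coordinates are made up, and the order of the bounding-box values (xmin, ymin, xmax, ymax) is an assumption. The package's own implementation may differ, for instance by relying on spatial objects.

```r
# Toy sampling sites (made-up coordinates, in decimal degrees)
pts <- data.frame(
  site_lon_start_decimal = c(50, 10, 80),
  site_lat_start_decimal = c(-30, -40, 10)
)

# Bounding box, assumed to be ordered as xmin, ymin, xmax, ymax
bbox <- c(xmin = 45, ymin = -61, xmax = 82, ymax = -24)

# Keep sites whose longitude and latitude both fall inside the box
inside <- pts$site_lon_start_decimal >= bbox[["xmin"]] &
  pts$site_lon_start_decimal <= bbox[["xmax"]] &
  pts$site_lat_start_decimal >= bbox[["ymin"]] &
  pts$site_lat_start_decimal <= bbox[["ymax"]]

pts_sub <- pts[inside, , drop = FALSE]
pts_sub
```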
Examples
# Import example dataset ----
file_name <- system.file(file.path("extdata", "FORCIS_net_sample.csv"),
package = "forcis")
net_data <- read.csv(file_name)
# Dimensions of the data.frame ----
dim(net_data)
# Filter by bounding box ----
net_data_sub <- filter_by_bbox(net_data, bbox = c(45, -61, 82, -24))
# Dimensions of the data.frame ----
dim(net_data_sub)
Filter FORCIS data by month of sampling
Description
Filters FORCIS data by month of sampling.
Usage
filter_by_month(data, months)
Arguments
data |
a |
months |
a |
Value
A tibble containing a subset of data for the desired months.
Examples
# Import example dataset ----
file_name <- system.file(file.path("extdata", "FORCIS_net_sample.csv"),
package = "forcis")
net_data <- read.csv(file_name)
# Dimensions of the data.frame ----
dim(net_data)
# Filter by months ----
net_data_sub <- filter_by_month(net_data, months = 1:2)
# Dimensions of the data.frame ----
dim(net_data_sub)
Filter FORCIS data by ocean
Description
Filters FORCIS data by one or several oceans.
Usage
filter_by_ocean(data, ocean)
Arguments
data |
a |
ocean |
a |
Value
A tibble containing a subset of data for the desired oceans.
Examples
# Import example dataset ----
file_name <- system.file(file.path("extdata", "FORCIS_net_sample.csv"),
package = "forcis")
net_data <- read.csv(file_name)
# Dimensions of the data.frame ----
dim(net_data)
# Get ocean names ----
get_ocean_names()
# Filter by oceans ----
net_data_sub <- filter_by_ocean(net_data, ocean = "Indian Ocean")
# Dimensions of the data.frame ----
dim(net_data_sub)
Filter FORCIS data by a spatial polygon
Description
Filters FORCIS data by a spatial polygon.
Usage
filter_by_polygon(data, polygon)
Arguments
data |
a |
polygon |
an |
Value
A tibble containing a subset of data for the desired spatial polygon.
Examples
# Import example dataset ----
file_name <- system.file(file.path("extdata", "FORCIS_net_sample.csv"),
package = "forcis")
net_data <- read.csv(file_name)
# Dimensions of the data.frame ----
dim(net_data)
# Import Indian Ocean spatial polygons ----
file_name <- system.file(file.path("extdata",
"IHO_Indian_ocean_polygon.gpkg"),
package = "forcis")
indian_ocean <- sf::st_read(file_name)
# Filter by polygon ----
net_data_sub <- filter_by_polygon(net_data, polygon = indian_ocean)
# Dimensions of the data.frame ----
dim(net_data_sub)
Filter FORCIS data by species
Description
Filters FORCIS data by a species list.
Usage
filter_by_species(data, species)
Arguments
data |
a |
species |
a |
Value
A tibble containing a subset of data.
Examples
# Import example dataset ----
file_name <- system.file(file.path("extdata", "FORCIS_net_sample.csv"),
package = "forcis")
net_data <- read.csv(file_name)
# Select a taxonomy ----
net_data <- select_taxonomy(net_data, taxonomy = "VT")
# Select only required columns (and taxa) ----
net_data <- select_forcis_columns(net_data)
# Dimensions of the data.frame ----
dim(net_data)
# Get species names ----
get_species_names(net_data)
# Select records for three species ----
net_data_sub <- filter_by_species(data = net_data,
species = c("g_inflata_VT",
"g_elongatus_VT",
"g_glutinata_VT"))
# Dimensions of the data.frame ----
dim(net_data_sub)
# Get species names ----
get_species_names(net_data_sub)
Filter FORCIS data by year of sampling
Description
Filters FORCIS data by year of sampling.
Usage
filter_by_year(data, years)
Arguments
data |
a |
years |
a |
Value
A tibble containing a subset of data for the desired years.
Examples
# Import example dataset ----
file_name <- system.file(file.path("extdata", "FORCIS_net_sample.csv"),
package = "forcis")
net_data <- read.csv(file_name)
# Dimensions of the data.frame ----
dim(net_data)
# Filter by years ----
net_data_sub <- filter_by_year(net_data, years = 1992)
# Dimensions of the data.frame ----
dim(net_data_sub)
Add a World basemap to a ggplot object
Description
Creates a World base map that can be added to a ggplot object.
Spatial layers come from the Natural Earth project (https://www.naturalearthdata.com/) and are defined in the Robinson coordinate system.
Usage
geom_basemap()
Value
A ggplot object.
Examples
# Attach package ----
library("ggplot2")
# World basemap ----
ggplot() +
geom_basemap()
Get available versions of the FORCIS database
Description
Gets all available versions of the FORCIS database by querying the Zenodo API (https://developers.zenodo.org).
Usage
get_available_versions()
Value
A tibble with three columns:
- publication_date: the date of the release of the version
- version: the label of the version
- access_right: is the version open or restricted?
Examples
# Versions of the FORCIS database ----
get_available_versions()
Get World ocean names
Description
This function returns the name of World oceans according to the IHO Sea Areas dataset version 3 (Flanders Marine Institute, 2018).
Usage
get_ocean_names()
Value
A character vector with World ocean names.
References
Flanders Marine Institute (2018). IHO Sea Areas, version 3. Available online at: https://www.marineregions.org/. doi:10.14284/323.
Examples
# Print the name of World oceans ----
get_ocean_names()
Get required column names
Description
Gets required column names (except taxa names) for the package. This function is designed to help users add columns in select_forcis_columns() (argument cols) if they are missing from this list.
These columns are required by some functions (compute_*(), plot_*(), etc.) of the package and shouldn't be deleted.
Usage
get_required_columns()
Value
A character vector.
Examples
# Get required column names (except taxa names) ----
get_required_columns()
Get species names from column names
Description
Gets species names from column names. This function is a utility to easily retrieve taxon names.
Usage
get_species_names(data)
Arguments
data |
a |
Value
A character vector of species names.
Examples
# Import example dataset ----
file_name <- system.file(file.path("extdata", "FORCIS_net_sample.csv"),
package = "forcis")
net_data <- read.csv(file_name)
# Select a taxonomy ----
net_data <- select_taxonomy(net_data, taxonomy = "VT")
# Retrieve taxon names ----
get_species_names(net_data)
Print information of a specific version of the FORCIS database
Description
Prints information of a specific version of the FORCIS database by querying the Zenodo API (https://developers.zenodo.org).
Usage
get_version_metadata(version = NULL)
Arguments
version |
a |
Value
A list with all information about the version, including: title, doi, publication_date, description, access_right, creators, keywords, version, resource_type, license, and files.
Examples
# Get information for the latest version of the FORCIS database ----
get_version_metadata()
Map the spatial distribution of FORCIS data
Description
Maps the spatial distribution of FORCIS data.
Usage
ggmap_data(data, col = "red", ...)
Arguments
data |
a |
col |
a |
... |
other graphical parameters passed on to |
Value
A ggplot object.
Examples
# Import example dataset ----
file_name <- system.file(file.path("extdata", "FORCIS_net_sample.csv"),
package = "forcis")
net_data <- read.csv(file_name)
# Map data (default) ----
ggmap_data(net_data)
# Map data ----
ggmap_data(net_data, col = "black", fill = "red", shape = 21, size = 2)
Plot sample records by depth of collection
Description
This function produces a barplot of FORCIS sample records by depth.
Usage
plot_record_by_depth(data)
Arguments
data |
a |
Value
A ggplot object.
Examples
# Import example dataset ----
file_name <- system.file(file.path("extdata", "FORCIS_net_sample.csv"),
package = "forcis")
net_data <- read.csv(file_name)
# Plot data by depth (example dataset) ----
plot_record_by_depth(net_data)
Plot sample records by month
Description
This function produces a barplot of FORCIS sample records by month.
Usage
plot_record_by_month(data)
Arguments
data |
a |
Value
A ggplot object.
Examples
# Import example dataset ----
file_name <- system.file(file.path("extdata", "FORCIS_net_sample.csv"),
package = "forcis")
net_data <- read.csv(file_name)
# Plot data by month (example dataset) ----
plot_record_by_month(net_data)
Plot sample records by season
Description
This function produces a barplot of FORCIS sample records by season.
Usage
plot_record_by_season(data)
Arguments
data |
a |
Value
A ggplot object.
Examples
# Import example dataset ----
file_name <- system.file(file.path("extdata", "FORCIS_net_sample.csv"),
package = "forcis")
net_data <- read.csv(file_name)
# Plot data by season (example dataset) ----
plot_record_by_season(net_data)
Plot sample records by year
Description
This function produces a barplot of FORCIS sample records by year.
Usage
plot_record_by_year(data)
Arguments
data |
a |
Value
A ggplot object.
Examples
# Import example dataset ----
file_name <- system.file(file.path("extdata", "FORCIS_net_sample.csv"),
package = "forcis")
net_data <- read.csv(file_name)
# Plot data by year (example dataset) ----
plot_record_by_year(net_data)
Read FORCIS data
Description
These functions read one specific csv file of the FORCIS database (see below) stored in the folder path. The function download_forcis_db() must be used first to store the database locally.
Usage
read_cpr_north_data(
path,
version = options()$forcis_version,
check_for_update = options()$forcis_check_for_update
)
read_cpr_south_data(
path,
version = options()$forcis_version,
check_for_update = options()$forcis_check_for_update
)
read_plankton_nets_data(
path,
version = options()$forcis_version,
check_for_update = options()$forcis_check_for_update
)
read_pump_data(
path,
version = options()$forcis_version,
check_for_update = options()$forcis_check_for_update
)
read_sediment_trap_data(
path,
version = options()$forcis_version,
check_for_update = options()$forcis_check_for_update
)
Arguments
path |
a |
version |
a |
check_for_update |
a |
Details
- read_plankton_nets_data() reads the FORCIS plankton nets data
- read_pump_data() reads the FORCIS pump data
- read_cpr_north_data() reads the FORCIS CPR North data
- read_cpr_south_data() reads the FORCIS CPR South data
- read_sediment_trap_data() reads the FORCIS sediment traps data
Value
A tibble. See https://zenodo.org/doi/10.5281/zenodo.7390791 for a preview of the datasets.
See Also
download_forcis_db() to download the complete FORCIS database.
Examples
# Folder in which the database will be saved ----
# N.B. In this example we use a temporary folder but you should select an
# existing folder (for instance "data/").
path <- tempdir()
# Download the database ----
download_forcis_db(path, timeout = 300)
# Import plankton nets data ----
plankton_nets_data <- read_plankton_nets_data(path)
Select columns in FORCIS data
Description
Selects columns in FORCIS data. Because FORCIS data contains more than 100 columns, this function can be used to lighten the data.frame, making it easier to handle and speeding up some computations.
Usage
select_forcis_columns(data, cols = NULL)
Arguments
data |
a |
cols |
a |
Value
A tibble.
Examples
# Import example dataset ----
file_name <- system.file(file.path("extdata", "FORCIS_net_sample.csv"),
package = "forcis")
net_data <- read.csv(file_name)
# Dimensions of the data.frame ----
dim(net_data)
# Select a taxonomy ----
net_data <- select_taxonomy(net_data, taxonomy = "VT")
# Dimensions of the data.frame ----
dim(net_data)
# Select only required columns (and taxa) ----
net_data <- select_forcis_columns(net_data)
# Dimensions of the data.frame ----
dim(net_data)
Select a taxonomy in FORCIS data
Description
Selects a taxonomy in FORCIS data. The FORCIS database provides three different taxonomies: "LT" (lumped taxonomy), "VT" (validated taxonomy) and "OT" (original taxonomy). See doi:10.1038/s41597-023-02264-2 for further information.
Usage
select_taxonomy(data, taxonomy)
Arguments
data |
a |
taxonomy |
a |
Value
A tibble.
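Selecting one taxonomy amounts to keeping the taxon columns of that taxonomy and dropping those of the other two. The base R sketch below illustrates this on a toy table. Only the "_VT" suffix appears in the package examples (for instance "g_inflata_VT"); the "_LT" and "_OT" suffixes, the `sample_id` column and the counts are assumptions made for illustration, and the package's own implementation may differ.

```r
# Toy table: one metadata column and the same taxon under three taxonomies
# (the "_LT" and "_OT" suffixes are assumed by analogy with "_VT")
toy <- data.frame(
  sample_id    = c("s1", "s2"),
  g_inflata_VT = c(2, 0),
  g_inflata_LT = c(2, 0),
  g_inflata_OT = c(1, 1)
)

taxonomy <- "VT"
suffixes <- c("LT", "VT", "OT")

# Columns that belong to any taxonomy
is_taxon <- grepl(
  paste0("_(", paste(suffixes, collapse = "|"), ")$"),
  names(toy)
)

# Columns that belong to the chosen taxonomy
is_chosen <- grepl(paste0("_", taxonomy, "$"), names(toy))

# Keep non-taxon columns plus taxon columns of the chosen taxonomy
toy_vt <- toy[, !is_taxon | is_chosen, drop = FALSE]

names(toy_vt)
```

Dropping the columns of the unused taxonomies is what reduces the number of columns reported by dim() in the examples.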
Examples
# Import example dataset ----
file_name <- system.file(file.path("extdata", "FORCIS_net_sample.csv"),
package = "forcis")
net_data <- read.csv(file_name)
# Dimensions of the data.frame ----
dim(net_data)
# Select a taxonomy ----
net_data <- select_taxonomy(net_data, taxonomy = "VT")
# Dimensions of the data.frame ----
dim(net_data)