Version: | 0.1.7 |
Title: | Core 'hubverse' Utilities |
Description: | Core set of low-level utilities common across the 'hubverse'. Used to interact with 'hubverse' schema, Hub configuration files and model outputs and designed to be primarily used internally by other 'hubverse' packages. See Reich et al. (2022) <doi:10.2105/AJPH.2022.306831> for an overview of Collaborative Hubs. |
License: | MIT + file LICENSE |
URL: | https://github.com/hubverse-org/hubUtils, https://hubverse-org.github.io/hubUtils/ |
BugReports: | https://github.com/hubverse-org/hubUtils/issues |
Depends: | R (≥ 2.10) |
Imports: | checkmate, cli, curl, fs, gh, glue, jsonlite, lifecycle, magrittr, memoise, purrr, rlang, stringr, tibble, utils |
Suggests: | arrow (≥ 17.0.0), dplyr, knitr, rmarkdown, testthat (≥ 3.2.0) |
Config/Needs/website: | hubverse-org/hubStyle |
Config/testthat/edition: | 3 |
Encoding: | UTF-8 |
LazyData: | true |
RoxygenNote: | 7.3.2 |
NeedsCompilation: | no |
Packaged: | 2024-09-18 12:31:07 UTC; Anna |
Author: | Anna Krystalli |
Maintainer: | Anna Krystalli <annakrystalli@googlemail.com> |
Repository: | CRAN |
Date/Publication: | 2024-09-18 14:00:01 UTC |
Pipe operator
Description
See magrittr::%>% for details.
Usage
lhs %>% rhs
Arguments
lhs |
A value or the |
rhs |
A function call using the |
Value
The result of calling rhs(lhs).
Convert model output to a model_out_tbl class object.
Description
Convert model output to a model_out_tbl class object.
Usage
as_model_out_tbl(
  tbl,
  model_id_col = NULL,
  output_type_col = NULL,
  output_type_id_col = NULL,
  value_col = NULL,
  sep = "-",
  trim_to_task_ids = FALSE,
  hub_con = NULL,
  task_id_cols = NULL,
  remove_empty = FALSE
)
Arguments
tbl |
a |
model_id_col |
character string. If a |
output_type_col |
character string. If an |
output_type_id_col |
character string. If an |
value_col |
character string. If a |
sep |
character string. Character used as separator when concatenating
|
trim_to_task_ids |
logical. Whether to trim |
hub_con |
a |
task_id_cols |
a character vector of column names. Only used if
|
remove_empty |
Logical. Whether to remove columns containing only |
Value
A model_out_tbl class object.
Examples
as_model_out_tbl(hub_con_output)
Check whether a config file is using a deprecated schema
Description
Compares the schema version in a config file to a minimum valid version. If the config file version is deprecated relative to the valid version, the function issues a lifecycle warning prompting the user to upgrade.
Usage
check_deprecated_schema(
  config_version,
  config,
  valid_version = "v2.0.0",
  hubutils_version = "0.0.0.9010"
)
Arguments
config_version |
Character string of the schema version. |
config |
List representation of config file. |
valid_version |
Character string of minimum valid schema version. |
hubutils_version |
The version of the hubUtils package in which deprecation of
the schema version below |
Value
Invisibly, TRUE if the schema version is deprecated, FALSE otherwise.
Primarily used for the side effect of issuing a lifecycle warning.
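The version comparison described above can be sketched in base R. This is an illustrative sketch only, not the package's implementation: the helper name is_version_deprecated() is hypothetical, and it assumes schema versions are strings of the form "v<major>.<minor>.<patch>".

```r
# Hypothetical helper (not part of hubUtils): TRUE if `config_version`
# is older than `valid_version`.
# Assumes versions look like "v1.0.0"; the leading "v" is stripped so that
# base R's package_version() can compare the components numerically.
is_version_deprecated <- function(config_version, valid_version = "v2.0.0") {
  to_version <- function(x) package_version(sub("^v", "", x))
  to_version(config_version) < to_version(valid_version)
}

is_version_deprecated("v1.0.0")
is_version_deprecated("v2.0.1")
```

Comparing parsed versions rather than raw strings avoids ordering mistakes once a component reaches two digits.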
Extract the schema version from a schema id or config schema_version property character string
Description
Extract the schema version from a schema id or config schema_version property character string
Usage
extract_schema_version(id)
Arguments
id |
A schema |
Value
The schema version number as a character string.
Examples
extract_schema_version("schema_version: v1.0.0")
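As a rough illustration of what such an extraction involves, the sketch below pulls a version string out of a longer string with a base R regular expression. The function name extract_version_sketch(), the pattern and the example URL are assumptions made for illustration, not the package's actual implementation.

```r
# Hypothetical stand-in for the extraction step, using base R only.
# The pattern matches a "v" followed by dot-separated numeric components
# and returns the first match found in `id`.
extract_version_sketch <- function(id) {
  regmatches(id, regexpr("v[0-9]+(\\.[0-9]+)+", id))
}

extract_version_sketch("schema_version: v1.0.0")
# The URL below is a made-up placeholder, not a real schema location.
extract_version_sketch("https://example.org/schemas/v2.0.0/tasks-schema.json")
```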
Get the name of the output type id column based on the schema version
Description
Version can be provided either directly through the config_version argument or extracted from a config_tasks object.
Usage
get_config_tid(config_version, config_tasks)
Arguments
config_version |
Character string of the schema version. |
config_tasks |
a list version of the contents of a hub's |
Value
character string of the name of the output type id column
Examples
get_config_tid("v1.0.0")
get_config_tid("v2.0.0")
Utilities for accessing round ID metadata
Description
Utilities for accessing round ID metadata
Usage
get_round_idx(config_tasks, round_id)
get_round_ids(
  config_tasks,
  flatten = c("all", "model_task", "task_id", "none")
)
Arguments
config_tasks |
a list version of the contents of a hub's |
round_id |
Character string. Round identifier. If the round is set to
|
flatten |
Character. Whether and how much to flatten output.
|
Value
- get_round_idx(): the integer index of the element in config_tasks$rounds that a character round identifier maps to.
- get_round_ids(): a list or character vector of hub round IDs. A character vector is returned only if flatten = "all". A list is returned otherwise (see flatten for more details).
Functions
- get_round_idx(): Get an integer index of the element in config_tasks$rounds that a character round identifier maps to.
- get_round_ids(): Get a list or character vector of hub round IDs. For each round, if round_id_from_variable is TRUE, round IDs returned are the values of the task ID defined in the round_id property. Otherwise, if round_id_from_variable is FALSE, the value of the round_id property is returned.
Examples
config_tasks <- read_config(
  hub_path = system.file("testhubs/simple", package = "hubUtils")
)
# Get round IDs
get_round_ids(config_tasks)
get_round_ids(config_tasks, flatten = "model_task")
get_round_ids(config_tasks, flatten = "task_id")
get_round_ids(config_tasks, flatten = "none")
# Get round integer index using a round_id
get_round_idx(config_tasks, "2022-10-01")
get_round_idx(config_tasks, "2022-10-29")
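The round_id_from_variable rule described under Functions can be illustrated on a small hand-built list. Everything in this sketch is an assumption for illustration: the list config_sketch is a simplified stand-in for a real tasks config, and round_ids_sketch() is a hypothetical helper, not how the package functions are written.

```r
# Hand-built, simplified stand-in for a tasks config list (made-up data).
config_sketch <- list(
  rounds = list(
    list(
      round_id_from_variable = TRUE,
      round_id = "origin_date",
      model_tasks = list(
        list(task_ids = list(
          origin_date = list(
            required = NULL,
            optional = c("2022-10-01", "2022-10-08")
          )
        ))
      )
    ),
    list(
      round_id_from_variable = FALSE,
      round_id = "round-2",
      model_tasks = list()
    )
  )
)

# Hypothetical helper: return the round IDs of a single round.
# If round_id_from_variable is TRUE, the IDs are the values of the task ID
# named by round_id; otherwise round_id itself is the ID.
round_ids_sketch <- function(round) {
  if (!isTRUE(round$round_id_from_variable)) {
    return(round$round_id)
  }
  ids <- lapply(round$model_tasks, function(task) {
    task_id <- task$task_ids[[round$round_id]]
    c(task_id$required, task_id$optional)
  })
  unique(unlist(ids))
}

all_ids <- lapply(config_sketch$rounds, round_ids_sketch)
unlist(all_ids)

# Index of the round that a given round ID maps to
round_idx <- which(
  vapply(all_ids, function(ids) "2022-10-08" %in% ids, logical(1))
)
round_idx
```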
Get the model tasks for a given round
Description
Get the model tasks for a given round
Usage
get_round_model_tasks(config_tasks, round_id)
Arguments
config_tasks |
a list version of the contents of a hub's |
round_id |
Character string. Round identifier. If the round is set to
|
Value
a list representation of model tasks for a given round.
Examples
hub_path <- system.file("testhubs/simple", package = "hubUtils")
config_tasks <- read_config(hub_path, "tasks")
get_round_model_tasks(config_tasks, round_id = "2022-10-08")
get_round_model_tasks(config_tasks, round_id = "2022-10-15")
Get task ID names for a given round
Description
Get task ID names for a given round
Usage
get_round_task_id_names(config_tasks, round_id)
Arguments
config_tasks |
a list version of the contents of a hub's |
round_id |
Character string. Round identifier. If the round is set to
|
Value
a character vector of task ID names
Examples
hub_path <- system.file("testhubs/simple", package = "hubUtils")
config_tasks <- read_config(hub_path, "tasks")
get_round_task_id_names(config_tasks, round_id = "2022-10-08")
get_round_task_id_names(config_tasks, round_id = "2022-10-15")
Download a schema
Description
Download a schema
Usage
get_schema(schema_url)
Arguments
schema_url |
The download URL for a given config schema version. |
Value
Contents of the JSON schema as a character string.
See Also
Other functions supporting config file validation: get_schema_url(), get_schema_valid_versions()
Examples
schema_url <- get_schema_url(config = "tasks", version = "v0.0.0.9")
get_schema(schema_url)
Get the JSON schema download URL for a given config file version
Description
Get the JSON schema download URL for a given config file version
Usage
get_schema_url(config = c("tasks", "admin", "model"), version, branch = "main")
Arguments
config |
Name of config file to validate. One of |
version |
A valid version of hubverse schema (e.g. |
branch |
The branch of the hubverse schemas repository from which to fetch schema. Defaults to "main". |
Value
The JSON schema download URL for a given config file version.
See Also
Other functions supporting config file validation: get_schema(), get_schema_valid_versions()
Examples
get_schema_url(config = "tasks", version = "v0.0.0.9")
Get a vector of valid schema versions
Description
Get a vector of valid schema versions
Usage
get_schema_valid_versions(branch = "main")
Arguments
branch |
The branch of the hubverse schemas repository from which to fetch schema. Defaults to "main". |
Value
a character vector of valid versions of hubverse schema.
See Also
Other functions supporting config file validation: get_schema(), get_schema_url()
Examples
get_schema_valid_versions()
Get the latest schema version
Description
Get the latest schema version from the schema repository if "latest" is requested (the default). If a specific version is provided, it is returned unchanged.
Usage
get_schema_version_latest(schema_version = "latest", branch = "main")
Arguments
schema_version |
A character vector. Either "latest" or a valid schema version. |
branch |
The branch of the hubverse schemas repository from which to fetch schema. Defaults to "main". |
Value
a schema version string. If schema_version is "latest", the latest schema version from the schema repository. If a specific version is provided to schema_version, the same version is returned.
Examples
# Get the latest version of the schema
get_schema_version_latest()
get_schema_version_latest(schema_version = "v1.0.0")
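The "latest or pass-through" behaviour described above can be sketched without network access. The vector available and the helper resolve_version_sketch() are hypothetical; how the package actually determines the latest version from the repository is not specified here, so the numeric comparison below is an assumption.

```r
# Hypothetical list of available versions (made up for illustration).
available <- c("v0.0.1", "v1.0.0", "v2.0.0", "v10.0.0")

# Hypothetical helper: resolve "latest" against `available`, otherwise
# return the requested version unchanged.
resolve_version_sketch <- function(schema_version = "latest", available) {
  if (!identical(schema_version, "latest")) {
    return(schema_version)
  }
  # Compare versions component by component rather than as plain strings.
  parsed <- package_version(sub("^v", "", available))
  available[parsed == max(parsed)][1]
}

resolve_version_sketch(available = available)
resolve_version_sketch("v1.0.0", available = available)
```

Parsing with package_version() matters here because a plain character sort would rank "v2.0.0" after "v10.0.0".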
Get hub task IDs
Description
Get hub task IDs
Usage
get_task_id_names(config_tasks)
Arguments
config_tasks |
a list version of the contents of a hub's |
Value
a character vector of all unique task ID names across all rounds.
Examples
hub_path <- system.file("testhubs/simple", package = "hubUtils")
config_tasks <- read_config(hub_path, "tasks")
get_task_id_names(config_tasks)
Example Hub model output data
Description
A subset of model output data accessed using hubData from the simple example hub contained in the hubUtils package. The subset consists of "quantile" output type data for "US" location and the most recent forecast date.
Usage
hub_con_output
Format
A tbl with 92 rows and 8 columns:
- forecast_date: Origin date of the forecast.
- horizon: Forecast horizon relative to the forecast_date.
- target: Target variable.
- location: Location of the forecast.
- output_type: Output type of forecast.
- output_type_id: Forecast output type level/identifier. In this case, quantile level.
- value: Forecast value.
- model_id: Model identifier.
Is config list representation using v3.0.0 schema?
Description
Is config list representation using v3.0.0 schema?
Usage
is_v3_config(config)
Arguments
config |
List representation of the JSON config file. |
Value
Logical, whether the config list representation is using v3.0.0 schema.
Examples
config <- read_config_file(
  system.file("config", "tasks.json", package = "hubUtils")
)
is_v3_config(config)
Is config file using v3.0.0 schema?
Description
Is config file using v3.0.0 schema?
Usage
is_v3_config_file(config_path)
Arguments
config_path |
Path to the config file. |
Value
Logical, whether the config file is using v3.0.0 schema.
Examples
config_path <- system.file("config", "tasks.json", package = "hubUtils")
is_v3_config_file(config_path)
Is hub configured using v3.0.0 schema?
Description
Is hub configured using v3.0.0 schema?
Usage
is_v3_hub(hub_path, config = c("tasks", "admin"))
Arguments
hub_path |
Either a character string path to a local Modeling Hub directory
or an object of class |
config |
Name of config file to validate. One of |
Value
Logical, whether the hub is configured using v3.0.0 schema.
Examples
is_v3_hub(hub_path = system.file("testhubs", "flusight", package = "hubUtils"))
Merge/Split model output tbl model_id column
Description
Merge/Split model output tbl model_id column
Usage
model_id_merge(tbl, sep = "-")
model_id_split(tbl, sep = "-")
Arguments
tbl |
a |
sep |
character string. Character used as separator when concatenating
|
Value
tbl with either team_abbr and model_abbr merged into a single model_id column or model_id split into columns team_abbr and model_abbr.
a tibble with model_id column split into separate team_abbr and model_abbr columns
Functions
- model_id_merge(): merge team_abbr and model_abbr into a single model_id column.
- model_id_split(): split model_id column into separate team_abbr and model_abbr columns.
Examples
tbl_split <- model_id_split(hub_con_output)
tbl_split
# Merge model_id
tbl_merged <- model_id_merge(tbl_split)
tbl_merged
# Split / Merge using custom separator
tbl_sep <- hub_con_output
tbl_sep$model_id <- gsub("-", "_", tbl_sep$model_id)
tbl_sep <- model_id_split(tbl_sep, sep = "_")
tbl_sep
tbl_sep <- model_id_merge(tbl_sep, sep = "_")
tbl_sep
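To make the split and merge operations concrete, the sketch below reproduces the idea in base R on a small made-up data frame. The helpers split_sketch() and merge_sketch() and the example model_id values are hypothetical. They illustrate the transformation only and skip whatever checks the package functions perform.

```r
# Small made-up table; the model_id values are for illustration only.
demo_tbl <- data.frame(
  model_id = c("teamA-model1", "teamB-model2"),
  value = c(1.5, 2.5),
  stringsAsFactors = FALSE
)

# Split: break each model_id at the separator into two parts.
# Assumes every model_id contains the separator exactly once.
split_sketch <- function(tbl, sep = "-") {
  parts <- strsplit(tbl$model_id, sep, fixed = TRUE)
  tbl$team_abbr <- vapply(parts, `[`, character(1), 1)
  tbl$model_abbr <- vapply(parts, `[`, character(1), 2)
  tbl$model_id <- NULL
  tbl
}

# Merge: paste the two parts back together with the same separator.
merge_sketch <- function(tbl, sep = "-") {
  tbl$model_id <- paste(tbl$team_abbr, tbl$model_abbr, sep = sep)
  tbl$team_abbr <- NULL
  tbl$model_abbr <- NULL
  tbl
}

demo_split <- split_sketch(demo_tbl)
demo_roundtrip <- merge_sketch(demo_split)
demo_roundtrip
```

Using the same sep for both steps is what makes the round trip lossless, which is why the custom-separator example above passes sep = "_" to both functions.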
Read a hub config file into R
Description
Read a hub config file into R
Usage
read_config(hub_path, config = c("tasks", "admin", "model-metadata-schema"))
Arguments
hub_path |
Either a character string path to a local Modeling Hub directory
or an object of class |
config |
Name of config file to validate. One of |
Value
The contents of the config as an R list.
Examples
# Read config files from local hub
hub_path <- system.file("testhubs/simple", package = "hubUtils")
read_config(hub_path, "tasks")
read_config(hub_path, "admin")
# Read config file from AWS S3 bucket hub
hub_path <- arrow::s3_bucket("hubverse/hubutils/testhubs/simple/")
read_config(hub_path, "admin")
Read a JSON config file from a path
Description
Read a JSON config file from a path
Usage
read_config_file(config_path)
Arguments
config_path |
path to JSON config file |
Value
a list representation of the JSON config file
Examples
read_config_file(system.file("config", "tasks.json", package = "hubUtils"))
Hubverse model output standard column names
Description
A named character vector of standard column names used in hubverse model output data files. The terms currently used for standard column names in the hubverse are English. In future, however, this could be expanded to provide the basis for hub terminology localisation.
Usage
std_colnames
Format
An object of class character of length 4.
Validate a model_out_tbl object.
Description
Validate a model_out_tbl object.
Usage
validate_model_out_tbl(tbl)
Arguments
tbl |
a |
Value
If valid, returns a model_out_tbl class object. Otherwise, throws an error.
Examples
md_out <- as_model_out_tbl(hub_con_output)
validate_model_out_tbl(md_out)