Engines in monarchr provide an abstraction over a Knowledge Graph (KG), supporting node fetching, expansion, and summaries. When instantiating an engine, we can optionally provide a preferences list to adjust its default behaviors.
To see the default preferences for an engine, we can simply inspect its preferences entry:
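As a minimal sketch, assuming monarchr is installed and that the bundled eds_marfan_kg example object is an engine exposing a $preferences entry (the text below refers to $preferences$category_priority), the inspection might look like the following. The requireNamespace() guard is only there so the snippet is skipped rather than failing when the package is unavailable.

```r
# Sketch only: assumes the bundled eds_marfan_kg object is an engine
# with a $preferences list, as the surrounding text suggests.
if (requireNamespace("monarchr", quietly = TRUE)) {
  library(monarchr)
  data(eds_marfan_kg)

  prefs <- eds_marfan_kg$preferences

  # Names of the available preference entries
  print(names(prefs))

  # First few entries of the category priority vector
  print(head(prefs$category_priority))
}
```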
The most important entry here is category_priority. KGX-formatted KGs must label nodes with a multi-valued category. In some KGs this may contain a single value; a Gene, for example, may be labeled with only c("biolink:Gene"). In other KGs, including the Monarch KG, entities may be labeled with multiple categories, for example c("biolink:GenomicEntity", "biolink:Entity", "biolink:Gene", "biolink:NamedThing"). KGX does not specify an order for these, although Biolink labels do exist in a hierarchy: a biolink:Gene is a type of biolink:GenomicEntity, which is a type of biolink:Entity, and so on.
In practice, however, a single category is typically most relevant. For genes we usually care about the biolink:Gene category, for diseases biolink:Disease, and so on. In monarchr, this “primary category” is represented by the nodes’ pcategory column for convenience:
library(monarchr)
library(dplyr)

# Load the example engine bundled with the package
data(eds_marfan_kg)

# Fetch one phenotype node, then expand to diseases and on to genes
g <- eds_marfan_kg |>
  fetch_nodes(query_ids = "HP:0001788") |>
  expand(categories = "biolink:Disease") |>
  expand(categories = "biolink:Gene")

nodes(g) |> select(id, name, pcategory, category)

Which entry of category is chosen for pcategory is determined by the engine’s $preferences$category_priority. For each node, the first entry of category_priority that is present in category is used; if none are present, the first entry of category is used.
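The selection rule described above can be sketched in base R. The pick_pcategory() helper below is hypothetical and is not part of monarchr; it only illustrates the rule as stated (first priority entry present in the node's categories, otherwise the node's first category).

```r
# Hypothetical helper, not part of monarchr: choose a primary category for
# one node given its categories and a priority vector.
pick_pcategory <- function(categories, priority) {
  # Priority entries that the node actually carries, kept in priority order
  hits <- priority[priority %in% categories]
  if (length(hits) > 0) hits[1] else categories[1]
}

gene_categories <- c("biolink:GenomicEntity", "biolink:Entity",
                     "biolink:Gene", "biolink:NamedThing")

# biolink:Gene is ranked first in the priority vector, so it is chosen
# even though it is third in the node's own category list
p1 <- pick_pcategory(gene_categories,
                     c("biolink:Gene", "biolink:GenomicEntity"))

# No priority entry matches, so the node's first category is the fallback
p2 <- pick_pcategory(gene_categories, c("biolink:Disease"))

print(p1)
print(p2)
```

Here p1 is "biolink:Gene" and p2 is "biolink:GenomicEntity", which shows why the order of the priority vector matters more than the order in which a KG happens to list categories.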
We can adjust this when initializing the engine. To do so in this example we need to load the KG from file with file_engine().
# Path to the example KG archive shipped with monarchr
filename <- system.file("extdata", "eds_marfan_kg.tar.gz", package = "monarchr")

# Build a file-backed engine with a custom category priority
eds_marfan_kg <- file_engine(
  filename,
  preferences = list(category_priority = c(
    "biolink:GenomicEntity",
    "biolink:DiseaseOrPhenotypicFeature"
  ))
)

g <- eds_marfan_kg |>
  fetch_nodes(query_ids = "HP:0001788") |>
  expand(categories = "biolink:Disease") |>
  expand(categories = "biolink:Gene")

nodes(g) |> select(id, name, pcategory, category)

The default category_priority list is designed to preferentially assign the most specific categories to pcategory and should work well for most uses.
Other preferences common to both file_engines and neo4j_engines include:
node_property_priority - defines a subset of column names to include first in node data frames
edge_property_priority - defines a subset of column names to include first in edge data frames

By default, neo4j_engines and monarch_engines cache queries for the duration of the R session, which speeds up exploratory analyses. Caching can be disabled, but it is not (currently) controlled by preferences; it is instead a parameter to the engine constructor:
monarch <- monarch_engine(cache = FALSE)

g1 <- monarch |>
  fetch_nodes(query_ids = "HP:0001788") |>
  expand()

g2 <- monarch |>
  fetch_nodes(query_ids = "HP:0001788") |>
  expand()

In the above example, because caching is disabled, the fetch and expansion are re-run in computing g2.
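To make the effect of the cache flag concrete, here is a minimal base R sketch of session-level query caching. It is an illustration under stated assumptions, not monarchr's implementation: query_cache, slow_query(), and cached_query() are hypothetical names, and slow_query() is a stand-in for a remote KG query.

```r
# Hypothetical session cache; monarchr's actual caching may work differently.
query_cache <- new.env()
run_count <- 0

# Stand-in for an expensive remote query; counts how often it really runs
slow_query <- function(id) {
  run_count <<- run_count + 1
  paste("result for", id)
}

cached_query <- function(id, cache = TRUE) {
  # Reuse a stored result when caching is enabled and the key is present
  if (cache && exists(id, envir = query_cache, inherits = FALSE)) {
    return(get(id, envir = query_cache))
  }
  result <- slow_query(id)
  if (cache) assign(id, result, envir = query_cache)
  result
}

r1 <- cached_query("HP:0001788")                 # runs the query, stores the result
r2 <- cached_query("HP:0001788")                 # served from the cache
r3 <- cached_query("HP:0001788", cache = FALSE)  # bypasses the cache and re-runs

print(run_count)
```

The three calls return the same result, but the underlying query runs only twice: once for the first cached call and once for the call with cache = FALSE.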
## ─ Session info ───────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────
## setting value
## version R version 4.6.0 RC (2026-04-17 r89917)
## os Ubuntu 24.04.4 LTS
## system x86_64, linux-gnu
## ui X11
## language (EN)
## collate C
## ctype en_US.UTF-8
## tz America/New_York
## date 2026-05-04
## pandoc 2.7.3 @ /usr/bin/ (via rmarkdown)
## quarto 1.8.25 @ /usr/local/bin/quarto
##
## ─ Packages ───────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────
## package * version date (UTC) lib source
## bslib 0.10.0 2026-01-26 [3] CRAN (R 4.6.0)
## cachem 1.1.0 2024-05-16 [3] CRAN (R 4.6.0)
## cli 3.6.6 2026-04-09 [3] CRAN (R 4.6.0)
## digest 0.6.39 2025-11-19 [3] CRAN (R 4.6.0)
## evaluate 1.0.5 2025-08-27 [3] CRAN (R 4.6.0)
## fastmap 1.2.0 2024-05-15 [3] CRAN (R 4.6.0)
## htmltools 0.5.9 2025-12-04 [3] CRAN (R 4.6.0)
## jquerylib 0.1.4 2021-04-26 [3] CRAN (R 4.6.0)
## jsonlite 2.0.0 2025-03-27 [3] CRAN (R 4.6.0)
## knitr 1.51 2025-12-20 [3] CRAN (R 4.6.0)
## lifecycle 1.0.5 2026-01-08 [3] CRAN (R 4.6.0)
## otel 0.2.0 2025-08-29 [3] CRAN (R 4.6.0)
## R6 2.6.1 2025-02-15 [3] CRAN (R 4.6.0)
## rlang 1.2.0 2026-04-06 [3] CRAN (R 4.6.0)
## rmarkdown 2.31 2026-03-26 [3] CRAN (R 4.6.0)
## sass 0.4.10 2025-04-11 [3] CRAN (R 4.6.0)
## sessioninfo 1.2.3 2025-02-05 [3] CRAN (R 4.6.0)
## xfun 0.57 2026-03-20 [3] CRAN (R 4.6.0)
## yaml 2.3.12 2025-12-10 [2] CRAN (R 4.6.0)
##
## [1] /tmp/Rtmp8k0xO5/Rinst22b1e5645fbe01
## [2] /home/pkgbuild/packagebuilder/workers/jobs/4139/R-libs
## [3] /home/biocbuild/bbs-3.23-bioc/R/site-library
## [4] /home/biocbuild/bbs-3.23-bioc/R/library
##
## ──────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────