Knowledge Graphs (KGs) can contain large amounts of information. The Monarch Initiative KG, for example, contains millions of nodes and edges, and each node may belong to one or more of dozens of available categories. An edge has only a single predicate linking its subject and object nodes, but there are similarly dozens of available predicates. Each node category and edge predicate may also come with its own node or edge properties, and these may be shared across some but not all node categories or edge predicates.
To help navigate this information, monarchr provides two functions that can be applied to KG engines: a summary() function that counts these categories and predicates across nodes and edges, and an example_graph() function that returns a (non-random) subgraph guaranteed to represent every node category and edge predicate.
As usual, we begin by loading the monarchr package, along with tidygraph and dplyr, which tend to be useful (although we will not actually use them in this vignette).
summary()
The summary() function, when applied to a KG engine (like file_engine(), neo4j_engine(), or the cloud-hosted monarch_engine()), prints counts of nodes and edges broken out by available node category and edge predicate. To keep the output small, we’ll produce a summary of the included mini-KG containing information about Ehlers-Danlos Syndrome (EDS) and Marfan Syndrome:
The printout reports the number of nodes for each category and the number of edges for each predicate. It also reports node and edge property counts. By contrast, engines created with neo4j_engine() only list the available node and edge properties, because of the computational effort required to compute these counts.
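To make these per-category and per-predicate counts concrete, here is a minimal base R sketch on a toy graph. The node and edge tables are hypothetical and the tabulation is only an illustration, not monarchr's implementation. In particular, counting a node once for every category it belongs to is an assumption of this sketch.

```r
# Toy node and edge tables (hypothetical data, not taken from the Monarch KG).
nodes <- data.frame(id = c("n1", "n2", "n3"), stringsAsFactors = FALSE)
# A node may belong to several categories, so category is a list column.
nodes$category <- list(
  c("biolink:Disease", "biolink:NamedThing"),
  c("biolink:Gene", "biolink:NamedThing"),
  c("biolink:Disease", "biolink:NamedThing")
)
edges <- data.frame(
  subject = c("n2", "n2", "n1"),
  predicate = c("biolink:causes", "biolink:causes", "biolink:related_to"),
  object = c("n1", "n3", "n3"),
  stringsAsFactors = FALSE
)

# Assumption of this sketch: a node is counted once per category it has.
node_counts <- table(unlist(nodes$category))
# Each edge has exactly one predicate, so a plain tabulation suffices.
edge_counts <- table(edges$predicate)

print(node_counts)
print(edge_counts)
```

Under this counting scheme the per-category counts can add up to more than the number of nodes, while the per-predicate counts always add up to the number of edges. The summary() printout is organized along the same lines, with one count per category and one per predicate.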
This information is also returned (invisibly) as a list; to suppress the printed output we can add quiet = TRUE.
s <- summary(eds_marfan_kg, quiet = TRUE)
paste("Total nodes:", s$total_nodes)
paste("Total edges:", s$total_edges)
head(s$node_summary)
head(s$edge_summary)
Finally, the returned list also includes cats and pred entries, which are named lists containing all available category and predicate labels for convenient auto-completion in your favorite IDE.
Autocomplete example in RStudio
The resulting auto-completion inserts the appropriate backticks in RStudio.
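The quoting is needed because labels such as biolink:Disease contain a colon and so are not syntactic R names. The sketch below uses a toy stand-in for the cats entry. The two labels, and the assumption that each entry's value equals its name, are illustrative only and do not come from the real summary output.

```r
# Toy stand-in for the cats entry (hypothetical subset of labels).
labels <- c("biolink:Disease", "biolink:Gene")
cats <- as.list(labels)
names(cats) <- labels

# The colon makes the name non-syntactic, so `$` access needs backticks.
disease_label <- cats$`biolink:Disease`

# Equivalent access with a string index, which needs no backticks.
gene_label <- cats[["biolink:Gene"]]

print(disease_label)
print(gene_label)
```

Typing the name after `$` without backticks would be a syntax error, which is why having the IDE insert them is convenient.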
example_graph()
Fetching a sample of data from a KG is another convenient way to explore its contents, but a random sample is unlikely to illustrate the diversity of available node categories, edge predicates, and information associated with nodes and edges of different types.
To serve this need, monarchr provides an example_graph() function, which fetches a sample of nodes and edges that is guaranteed to represent every available category and every available predicate. When using this method, it is important to remember that nodes frequently belong to multiple categories, and that the pcategory (“primary category”) column holds a single category, chosen from a set of preferred categories, to represent the node. The choice of category shown in pcategory is defined by monarchr, not the KG itself, and is configurable.
Note that this method makes no other guarantees: the sample is not random, the resulting graph may not be connected, the result is not the smallest possible graph that contains all categories and predicates, and nodes and edges may not contain complete data for all possible properties for their respective types. Still, browsing the resulting graph in tabular form as above can quickly reveal the bulk of information available in a KG for further targeted exploration with fetch_nodes() and expand().
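One way to picture a sample with these guarantees and caveats is the base R sketch below, which runs on a toy graph. It is a simple first-seen selection written for illustration. The data are hypothetical, and nothing here should be read as the algorithm that example_graph() actually uses.

```r
# Toy KG (hypothetical data): nodes with a list column of categories and
# edges with a single predicate each.
nodes <- data.frame(id = c("n1", "n2", "n3", "n4", "n5"),
                    stringsAsFactors = FALSE)
nodes$category <- list(
  c("biolink:Disease", "biolink:NamedThing"),
  c("biolink:Gene", "biolink:NamedThing"),
  c("biolink:Disease", "biolink:NamedThing"),
  c("biolink:Drug", "biolink:NamedThing"),
  c("biolink:Gene", "biolink:NamedThing")
)
edges <- data.frame(
  subject = c("n2", "n2", "n1", "n5"),
  predicate = c("biolink:causes", "biolink:causes",
                "biolink:related_to", "biolink:causes"),
  object = c("n1", "n3", "n3", "n1"),
  stringsAsFactors = FALSE
)

# Step 1: keep the first edge seen for each predicate, plus its endpoints.
sample_edges <- edges[!duplicated(edges$predicate), ]
keep_ids <- unique(c(sample_edges$subject, sample_edges$object))

# Step 2: add the first node seen for any category not yet represented.
covered <- unique(unlist(nodes$category[nodes$id %in% keep_ids]))
for (i in seq_len(nrow(nodes))) {
  uncovered <- setdiff(nodes$category[[i]], covered)
  if (length(uncovered) > 0) {
    keep_ids <- c(keep_ids, nodes$id[i])
    covered <- union(covered, nodes$category[[i]])
  }
}
sample_nodes <- nodes[nodes$id %in% keep_ids, ]

print(sample_nodes$id)
print(sample_edges)
```

In this toy run every category and predicate is represented, but the node added only to cover biolink:Drug has no edges in the sample, so the result is not connected. It is also not random, which mirrors the caveats listed above.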
## ─ Session info ──────────────────────────────────────────────────────────────
## setting value
## version R version 4.6.0 RC (2026-04-17 r89917)
## os Ubuntu 24.04.4 LTS
## system x86_64, linux-gnu
## ui X11
## language (EN)
## collate C
## ctype en_US.UTF-8
## tz America/New_York
## date 2026-05-04
## pandoc 2.7.3 @ /usr/bin/ (via rmarkdown)
## quarto 1.8.25 @ /usr/local/bin/quarto
##
## ─ Packages ──────────────────────────────────────────────────────────────────
## package * version date (UTC) lib source
## bslib 0.10.0 2026-01-26 [3] CRAN (R 4.6.0)
## cachem 1.1.0 2024-05-16 [3] CRAN (R 4.6.0)
## cli 3.6.6 2026-04-09 [3] CRAN (R 4.6.0)
## digest 0.6.39 2025-11-19 [3] CRAN (R 4.6.0)
## evaluate 1.0.5 2025-08-27 [3] CRAN (R 4.6.0)
## fastmap 1.2.0 2024-05-15 [3] CRAN (R 4.6.0)
## htmltools 0.5.9 2025-12-04 [3] CRAN (R 4.6.0)
## jquerylib 0.1.4 2021-04-26 [3] CRAN (R 4.6.0)
## jsonlite 2.0.0 2025-03-27 [3] CRAN (R 4.6.0)
## knitr 1.51 2025-12-20 [3] CRAN (R 4.6.0)
## lifecycle 1.0.5 2026-01-08 [3] CRAN (R 4.6.0)
## otel 0.2.0 2025-08-29 [3] CRAN (R 4.6.0)
## R6 2.6.1 2025-02-15 [3] CRAN (R 4.6.0)
## rlang 1.2.0 2026-04-06 [3] CRAN (R 4.6.0)
## rmarkdown 2.31 2026-03-26 [3] CRAN (R 4.6.0)
## sass 0.4.10 2025-04-11 [3] CRAN (R 4.6.0)
## sessioninfo 1.2.3 2025-02-05 [3] CRAN (R 4.6.0)
## xfun 0.57 2026-03-20 [3] CRAN (R 4.6.0)
## yaml 2.3.12 2025-12-10 [2] CRAN (R 4.6.0)
##
## [1] /tmp/Rtmp8k0xO5/Rinst22b1e5645fbe01
## [2] /home/pkgbuild/packagebuilder/workers/jobs/4139/R-libs
## [3] /home/biocbuild/bbs-3.23-bioc/R/site-library
## [4] /home/biocbuild/bbs-3.23-bioc/R/library
##
## ─────────────────────────────────────────────────────────────────────────────