S4Cartographer 0.99.3
Bioconductor is famous for hosting excellent bioinformatics software, but if you look under the hood, you’ll find it relies heavily on a specific R framework called the S4 system.
S4 is used because it has strict rules that ensure complex biological data is
formatted correctly, helping to prevent silent errors in your analysis. However,
for a biologist transitioning into R development, S4 comes with a notoriously
steep learning curve. Compared to the flexibility of standard R data.frames or
lists, S4 can feel confusing and overly rigid.
A major goal of Bioconductor is making sure different packages can easily share
data. Because of this, the official guidelines strongly recommend reusing
existing data structures (called “classes”), like SummarizedExperiment or
GRanges, rather than inventing your own.
But there is a practical catch: it is incredibly hard to figure out how these classes are related. S4 classes often build on top of several other “parent” classes, creating a deeply tangled family tree. When you try to use standard R functions to figure out what a class inherits from, you are usually hit with a dense, confusing wall of text in your console. This leaves you manually reading through lines of text just to understand how the data is structured.
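To make the problem concrete, here is a minimal sketch of the kind of lookup that base R offers through the `methods` package. The classes `Base`, `Middle` and `Leaf` are hypothetical toy classes defined only for this illustration; they are not part of Bioconductor or S4Cartographer.

```r
library(methods)

# Hypothetical three-level hierarchy, defined only for illustration.
setClass("Base", representation("VIRTUAL"))
setClass("Middle", contains = "Base", slots = c(x = "numeric"))
setClass("Leaf", contains = "Middle", slots = c(y = "character"))

# extends() with a single argument returns the class itself
# followed by all of its superclasses.
ancestors <- extends("Leaf")
print(ancestors)

# The class definition also records the distance to each superclass.
distances <- vapply(getClass("Leaf")@contains,
                    function(ext) ext@distance, numeric(1))
print(distances)

# showClass() prints the text description of one class at a time.
showClass("Leaf")
```

With three toy classes the output is easy to read. With real Bioconductor classes that have many parents across several packages, the same calls produce much longer listings, and each call describes only one class.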
S4Cartographer is designed to solve this exact problem. It bridges the gap
between Bioconductor’s recommendation to reuse existing classes and the reality
of how hard they are to navigate. By providing intuitive tools, this package
helps you easily untangle and understand S4 family trees, so you can focus on
the biology instead of the programming syntax.
library(S4Cartographer)
Let’s start by exploring S4Vectors, the most basic package in Bioconductor,
which defines several fundamental S4 classes that are used across hundreds of
Bioconductor packages:
plotS4ClassGraph("S4Vectors")
Here we see how the key classes Rle, Pairs, Hits and Factor all
inherit from the same virtual Vector and Annotated classes. A separate
branch is formed by the various types of List, which all inherit from the same
virtual List class.
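For readers new to the term, a virtual class is one that cannot be instantiated itself and exists only to be inherited from. The sketch below shows this with base R's `methods` package; `Shape`, `Circle` and `Square` are hypothetical classes invented for this example and are unrelated to S4Vectors.

```r
library(methods)

# Hypothetical classes, not taken from any Bioconductor package.
setClass("Shape", representation("VIRTUAL"))
setClass("Circle", contains = "Shape", slots = c(r = "numeric"))
setClass("Square", contains = "Shape", slots = c(side = "numeric"))

# Only the parent is virtual.
shape_is_virtual  <- isVirtualClass("Shape")
circle_is_virtual <- isVirtualClass("Circle")

# A virtual class cannot be instantiated directly; new() signals an error.
shape_attempt <- tryCatch(new("Shape"), error = function(e) "error")

# An instance of any subclass still counts as a "Shape".
circle <- new("Circle", r = 1)
circle_is_shape <- is(circle, "Shape")
```

This is the same pattern the graph shows for Vector: many concrete classes share one virtual parent, so code written against the parent works for all of them.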
Let’s now go further and add the classes defined in IRanges and GenomicRanges,
which build directly on top of each other:
plotS4ClassGraph(c("S4Vectors", "IRanges", "GenomicRanges"))
Now the graph has grown substantially! IRanges defines a large number of
specialized sub-classes from the building blocks in S4Vectors. We can also see
how GenomicRanges builds first on IRanges and then on S4Vectors.
Now let’s go wild and plot the entirety of “core” Bioconductor packages:
plotS4ClassGraph(c("S4Vectors", "IRanges", "GenomicRanges",
                   "SummarizedExperiment", "XVector", "Biostrings",
                   "GenomicFeatures", "GenomicAlignments"))
This of course generates a massive graph too large to fit in a vignette, but try it out if you have access to a very large monitor!
DelayedArray sub-graph
DelayedArray is a powerful framework for working with larger-than-memory data.
The package forms the core of a network of dependent packages, each implementing
their own specialized sub-classes:
plotS4ClassGraph(c("DelayedArray", "HDF5Array", "TileDBArray", "VCFArray",
                   "ScaledMatrix", "ResidualMatrix", "BiocSingular"))
#> Warning in plotS4ClassGraph(c("DelayedArray", "HDF5Array", "TileDBArray", :
#> Duplicated S4Class definitions: Removing these according to input ordering of
#> packages
SummarizedExperiment sub-graph
SummarizedExperiment is another popular package for organizing various types
of expression matrices. A large number of packages build upon this
functionality:
plotS4ClassGraph(c("SummarizedExperiment", "SingleCellExperiment",
                   "SpatialExperiment", "clusterExperiment", "InteractionSet",
                   "MultiAssayExperiment", "GenomicFiles", "DESeq2"))
If you want to inspect the underlying data for the graphs, you can make the function return just that:
plotS4ClassGraph(c("SummarizedExperiment", "SingleCellExperiment"), plot=FALSE)
#> $nodes
#>                         S4Class              Package Virtual Union Superclasses
#> 1    RangedSummarizedExperiment SummarizedExperiment   FALSE FALSE            5
#> 2                  SimpleAssays SummarizedExperiment   FALSE FALSE            3
#> 3                Assays_OR_NULL SummarizedExperiment    TRUE  TRUE            0
#> 4                   AssaysInEnv SummarizedExperiment   FALSE FALSE            3
#> 5          SummarizedExperiment SummarizedExperiment   FALSE FALSE            4
#> 6       ShallowSimpleListAssays SummarizedExperiment   FALSE FALSE           10
#> 7                        Assays SummarizedExperiment    TRUE FALSE            2
#> 8                   ShallowData SummarizedExperiment   FALSE FALSE            6
#> 9  SummarizedExperimentByColumn SingleCellExperiment   FALSE FALSE            0
#> 10                   DualSubset SingleCellExperiment   FALSE FALSE            0
#> 11           reduced.dim.matrix SingleCellExperiment    TRUE FALSE            7
#> 12        LinearEmbeddingMatrix SingleCellExperiment   FALSE FALSE            1
#> 13         SingleCellExperiment SingleCellExperiment   FALSE FALSE            6
#> 14              RectangularData                 <NA>      NA    NA           NA
#> 15                       Vector                 <NA>      NA    NA           NA
#> 16                  envRefClass                 <NA>      NA    NA           NA
#> 17                       matrix                 <NA>      NA    NA           NA
#> 18                    Annotated                 <NA>      NA    NA           NA
#>    Subclasses
#> 1           0
#> 2           0
#> 3           6
#> 4           0
#> 5           1
#> 6           0
#> 7           3
#> 8           1
#> 9           0
#> 10          0
#> 11          0
#> 12          0
#> 13          0
#> 14         NA
#> 15         NA
#> 16         NA
#> 17         NA
#> 18         NA
#>
#> $edges
#>                    Superclass                   Subclass
#> 1  RangedSummarizedExperiment       SummarizedExperiment
#> 2                SimpleAssays                     Assays
#> 3                 AssaysInEnv                     Assays
#> 4        SummarizedExperiment            RectangularData
#> 5        SummarizedExperiment                     Vector
#> 6     ShallowSimpleListAssays                ShallowData
#> 7     ShallowSimpleListAssays                     Assays
#> 8                      Assays            RectangularData
#> 9                      Assays             Assays_OR_NULL
#> 10                ShallowData                envRefClass
#> 11         reduced.dim.matrix                     matrix
#> 12      LinearEmbeddingMatrix                  Annotated
#> 13       SingleCellExperiment RangedSummarizedExperiment
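Because the result is a plain list of two data frames, ordinary base R subsetting is enough to query it. The sketch below does not call S4Cartographer; it uses a hand-built stand-in named `graph` (a hypothetical variable) containing a few rows copied from the printed output above, and it assumes the real return value has the same column names. Reading `NA` in the Package column as "defined outside the requested packages" is an interpretation of the output, not something the source states.

```r
# Stand-in for the list returned with plot=FALSE; rows are a small
# subset copied from the printed output, for illustration only.
graph <- list(
  nodes = data.frame(
    S4Class = c("SummarizedExperiment", "Assays", "Assays_OR_NULL",
                "RectangularData", "Vector"),
    Package = c("SummarizedExperiment", "SummarizedExperiment",
                "SummarizedExperiment", NA, NA),
    Virtual = c(FALSE, TRUE, TRUE, NA, NA),
    Union   = c(FALSE, FALSE, TRUE, NA, NA),
    stringsAsFactors = FALSE
  ),
  edges = data.frame(
    Superclass = c("SummarizedExperiment", "SummarizedExperiment", "Assays"),
    Subclass   = c("RectangularData", "Vector", "Assays_OR_NULL"),
    stringsAsFactors = FALSE
  )
)

# Classes flagged as virtual (%in% treats NA as "not TRUE").
virtual_classes <- graph$nodes$S4Class[graph$nodes$Virtual %in% TRUE]

# Classes with no package recorded (assumed to come from elsewhere).
external_classes <- graph$nodes$S4Class[is.na(graph$nodes$Package)]

# All edges that mention a given class in either column.
target <- "SummarizedExperiment"
touching <- graph$edges[graph$edges$Superclass == target |
                        graph$edges$Subclass == target, ]

print(virtual_classes)
print(external_classes)
print(touching)
```

Matching on both edge columns sidesteps any question about which direction an edge points, which is convenient when you only want to know what a class is connected to.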
Alternatively, use the internal helpers directly; see ?getS4Classes and ?getS4Inheritance.
sessionInfo()
#> R version 4.6.0 RC (2026-04-17 r89917)
#> Platform: x86_64-pc-linux-gnu
#> Running under: Ubuntu 24.04.4 LTS
#>
#> Matrix products: default
#> BLAS: /home/biocbuild/bbs-3.23-bioc/R/lib/libRblas.so
#> LAPACK: /usr/lib/x86_64-linux-gnu/lapack/liblapack.so.3.12.0 LAPACK version 3.12.0
#>
#> locale:
#> [1] LC_CTYPE=en_US.UTF-8 LC_NUMERIC=C
#> [3] LC_TIME=en_GB LC_COLLATE=C
#> [5] LC_MONETARY=en_US.UTF-8 LC_MESSAGES=en_US.UTF-8
#> [7] LC_PAPER=en_US.UTF-8 LC_NAME=C
#> [9] LC_ADDRESS=C LC_TELEPHONE=C
#> [11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C
#>
#> time zone: America/New_York
#> tzcode source: system (glibc)
#>
#> attached base packages:
#> [1] stats graphics grDevices utils datasets methods base
#>
#> other attached packages:
#> [1] bigmemory_4.6.4 Biobase_2.71.0 BiocGenerics_0.57.1
#> [4] generics_0.1.4 S4Cartographer_0.99.3 BiocStyle_2.39.0
#>
#> loaded via a namespace (and not attached):
#> [1] splines_4.6.0 BiocIO_1.21.0
#> [3] bitops_1.0-9 tibble_3.3.1
#> [5] polyclip_1.10-7 XML_3.99-0.23
#> [7] lifecycle_1.0.5 edgeR_4.9.9
#> [9] doParallel_1.0.17 lattice_0.22-9
#> [11] MASS_7.3-65 MultiAssayExperiment_1.37.4
#> [13] GenomicFiles_1.47.0 backports_1.5.1
#> [15] magrittr_2.0.5 limma_3.67.3
#> [17] sass_0.4.10 rmarkdown_2.31
#> [19] jquerylib_0.1.4 yaml_2.3.12
#> [21] otel_0.2.0 NMF_0.28
#> [23] zinbwave_1.33.0 DBI_1.3.0
#> [25] RColorBrewer_1.1-3 ade4_1.7-24
#> [27] ResidualMatrix_1.21.0 abind_1.4-8
#> [29] GenomicRanges_1.63.2 purrr_1.2.2
#> [31] ggraph_2.2.2 RCurl_1.98-1.18
#> [33] VariantAnnotation_1.57.1 tweenr_2.0.3
#> [35] IRanges_2.45.0 S4Vectors_0.49.2
#> [37] ggrepel_0.9.8 irlba_2.3.7
#> [39] genefilter_1.93.0 annotate_1.89.0
#> [41] codetools_0.2-20 DelayedArray_0.37.1
#> [43] xml2_1.5.2 ggforce_0.5.0
#> [45] tidyselect_1.2.1 locfdr_1.1-8
#> [47] RNeXML_2.4.11 UCSC.utils_1.7.1
#> [49] farver_2.1.2 ScaledMatrix_1.19.0
#> [51] viridis_0.6.5 matrixStats_1.5.0
#> [53] stats4_4.6.0 Seqinfo_1.1.0
#> [55] GenomicAlignments_1.47.0 jsonlite_2.0.0
#> [57] tidygraph_1.3.1 phylobase_0.8.12
#> [59] survival_3.8-6 iterators_1.0.14
#> [61] TileDBArray_1.21.2 foreach_1.5.2
#> [63] tools_4.6.0 progress_1.2.3
#> [65] tiledb_0.33.0 clusterExperiment_2.31.3
#> [67] Rcpp_1.1.1-1 glue_1.8.1
#> [69] spdl_0.0.5 gridExtra_2.3
#> [71] SparseArray_1.11.13 BiocBaseUtils_1.13.0
#> [73] DESeq2_1.51.7 xfun_0.57
#> [75] MatrixGenerics_1.23.0 ggthemes_5.2.0
#> [77] GenomeInfoDb_1.47.2 dplyr_1.2.1
#> [79] HDF5Array_1.39.1 withr_3.0.2
#> [81] BiocManager_1.30.27 fastmap_1.2.0
#> [83] VCFArray_1.27.0 rhdf5filters_1.23.3
#> [85] digest_0.6.39 rsvd_1.0.5
#> [87] R6_2.6.1 colorspace_2.1-2
#> [89] dichromat_2.0-0.1 RcppCCTZ_0.2.14
#> [91] RSQLite_2.4.6 cigarillo_1.1.0
#> [93] h5mread_1.3.3 tidyr_1.3.2
#> [95] RcppSpdlog_0.0.28 rtracklayer_1.71.3
#> [97] InteractionSet_1.39.0 prettyunits_1.2.0
#> [99] graphlayouts_1.2.3 httr_1.4.8
#> [101] S4Arrays_1.11.1 pkgconfig_2.0.3
#> [103] gtable_0.3.6 blob_1.3.0
#> [105] registry_0.5-1 S7_0.2.2
#> [107] SingleCellExperiment_1.33.2 XVector_0.51.0
#> [109] htmltools_0.5.9 bookdown_0.46
#> [111] scales_1.4.0 png_0.1-9
#> [113] SpatialExperiment_1.21.0 bigmemory.sri_0.1.8
#> [115] knitr_1.51 reshape2_1.4.5
#> [117] rncl_0.8.9 rjson_0.2.23
#> [119] uuid_1.2-2 checkmate_2.3.4
#> [121] nlme_3.1-169 curl_7.1.0
#> [123] cachem_1.1.0 zoo_1.8-15
#> [125] rhdf5_2.55.16 stringr_1.6.0
#> [127] parallel_4.6.0 softImpute_1.4-3
#> [129] AnnotationDbi_1.73.1 nanotime_0.3.14
#> [131] restfulr_0.0.16 pillar_1.11.1
#> [133] grid_4.6.0 vctrs_0.7.3
#> [135] nanoarrow_0.8.0 BiocSingular_1.27.1
#> [137] beachmat_2.27.5 xtable_1.8-8
#> [139] cluster_2.1.8.2 evaluate_1.0.5
#> [141] tinytex_0.59 GenomicFeatures_1.63.2
#> [143] magick_2.9.1 locfit_1.5-9.12
#> [145] cli_3.6.6 compiler_4.6.0
#> [147] Rsamtools_2.27.2 rngtools_1.5.2
#> [149] rlang_1.2.0 crayon_1.5.3
#> [151] labeling_0.4.3 plyr_1.8.9
#> [153] stringi_1.8.7 viridisLite_0.4.3
#> [155] gridBase_0.4-7 BiocParallel_1.45.0
#> [157] Biostrings_2.79.5 Matrix_1.7-5
#> [159] BSgenome_1.79.1 hms_1.1.4
#> [161] bit64_4.8.0 ggplot2_4.0.3
#> [163] Rhdf5lib_1.99.6 KEGGREST_1.51.1
#> [165] statmod_1.5.1 SummarizedExperiment_1.41.1
#> [167] kernlab_0.9-33 igraph_2.3.0
#> [169] memoise_2.0.1 bslib_0.10.0
#> [171] bit_4.6.0 ape_5.8-1