1 Introduction

Bioconductor is well known for hosting high-quality bioinformatics software, but if you look under the hood, you’ll find it relies heavily on S4, R’s formal object-oriented class system.

S4 is used because it has strict rules that ensure complex biological data is formatted correctly, helping to prevent silent errors in your analysis. However, for a biologist transitioning into R development, S4 comes with a notoriously steep learning curve. Compared to the flexibility of standard R data.frames or lists, S4 can feel confusing and overly rigid.
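To make the "strict rules" concrete, here is a minimal sketch using only the base methods package. The class name ToyExpression and its slots are hypothetical and chosen purely for illustration; they are not part of any Bioconductor package.

```r
library(methods)

# Hypothetical toy class: one count per gene name.
setClass("ToyExpression",
  slots = c(genes = "character", counts = "numeric"),
  validity = function(object) {
    # Return a message for every broken rule, or TRUE if the object is valid.
    if (length(object@genes) != length(object@counts)) {
      return("'genes' and 'counts' must have the same length")
    }
    if (any(object@counts < 0, na.rm = TRUE)) {
      return("'counts' must be non-negative")
    }
    TRUE
  }
)

# A well-formed object is created as usual.
ok <- new("ToyExpression", genes = c("TP53", "BRCA1"), counts = c(10, 3))

# A malformed object is rejected at construction time; here the error
# message is captured as a string instead of stopping the script.
bad <- tryCatch(
  new("ToyExpression", genes = "TP53", counts = c(10, 3)),
  error = function(e) conditionMessage(e)
)
```

A plain list with the same two elements would accept the mismatched lengths without complaint, which is the kind of silent error the validity check is meant to catch.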

A major goal of Bioconductor is making sure different packages can easily share data. Because of this, the official guidelines strongly recommend reusing existing data structures (called “classes”), like SummarizedExperiment or GRanges, rather than inventing your own.

But there is a practical catch: it is incredibly hard to figure out how these classes are related. S4 classes often build on top of several other “parent” classes, creating a deeply tangled family tree. When you try to use standard R functions to figure out what a class inherits from, you are usually hit with a dense, confusing wall of text in your console. This leaves you manually reading through lines of text just to understand how the data is structured.
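For comparison, the standard route goes through functions in the base methods package. The sketch below defines three toy classes (ToyBase, ToyMiddle and ToyLeaf are hypothetical names, not Bioconductor classes) and inspects them by hand. It is meant to illustrate the manual approach, not how S4Cartographer works internally.

```r
library(methods)

# A three-level toy hierarchy: ToyLeaf -> ToyMiddle -> ToyBase (virtual).
setClass("ToyBase", representation("VIRTUAL"))
setClass("ToyMiddle", contains = "ToyBase", slots = c(id = "character"))
setClass("ToyLeaf", contains = "ToyMiddle", slots = c(score = "numeric"))

# showClass() prints a human-readable summary to the console. For real
# Bioconductor classes this summary can run to many lines.
showClass("ToyLeaf")

# extends() returns the class itself plus all of its ancestors.
all_parents <- extends("ToyLeaf")

# The class definition stores one extension record per ancestor;
# a distance of 1 marks a direct parent.
extensions <- getClass("ToyLeaf")@contains
direct <- names(Filter(function(ext) ext@distance == 1, extensions))
```

Even in this small case, separating direct parents from more distant ancestors requires digging into the class definition object, and the effort grows quickly with deeper hierarchies.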

S4Cartographer is designed to solve this exact problem. It bridges the gap between Bioconductor’s recommendation to reuse existing classes and the reality of how hard they are to navigate. By providing intuitive tools, this package helps you easily untangle and understand S4 family trees, so you can focus on the biology instead of the programming syntax.

library(S4Cartographer)

2 Examples

2.1 Plotting the “core” Bioconductor S4 classes

Let’s start by exploring S4Vectors, the most foundational package in Bioconductor, which defines several fundamental S4 classes used across hundreds of Bioconductor packages:

plotS4ClassGraph("S4Vectors")

Here we see how the key classes Rle, Pairs, Hits and Factor all inherit from the same two virtual classes, Vector and Annotated. A separate branch is formed by the various types of list, which all inherit from the same virtual class, List.
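If the term "virtual class" is unfamiliar, the following sketch may help. It uses only the base methods package, and the names ToyAnnotated and ToyRun are hypothetical stand-ins that mimic the pattern rather than reproduce the S4Vectors definitions.

```r
library(methods)

# Hypothetical toy classes; "VIRTUAL" marks ToyAnnotated as a template only.
setClass("ToyAnnotated", representation("VIRTUAL", metadata = "list"))
setClass("ToyRun", contains = "ToyAnnotated",
         slots = c(values = "numeric", lengths = "integer"))

# A virtual class cannot be instantiated directly ...
direct <- tryCatch(new("ToyAnnotated"), error = function(e) "error")

# ... but concrete subclasses inherit its slots and count as instances of it.
run <- new("ToyRun", values = c(2.5, 7), lengths = c(3L, 1L))
has_metadata <- "metadata" %in% slotNames(run)
is_annotated <- is(run, "ToyAnnotated")
```

Because every subclass carries the slots of its virtual parent, a function written against the parent can be applied to any class further down the graph.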

Let’s now look further and add the classes defined in IRanges and GenomicRanges, which build directly on top of each other:

plotS4ClassGraph(c("S4Vectors", "IRanges", "GenomicRanges"))

Now the graph has grown substantially! IRanges defines a large number of specialized sub-classes from the building blocks in S4Vectors. We can also see how GenomicRanges builds first on IRanges and then on S4Vectors.

Now let’s go wild and plot the entire set of “core” Bioconductor packages:

plotS4ClassGraph(c("S4Vectors", "IRanges", "GenomicRanges", 
                   "SummarizedExperiment", "XVector", "Biostrings", 
                   "GenomicFeatures", "GenomicAlignments"))

This of course generates a massive graph too large to fit in a vignette, but try it out if you have access to a very large monitor!

2.2 Plotting the DelayedArray sub-graph

DelayedArray is a powerful framework for working with larger-than-memory data. The package forms the core of a network of dependent packages, each implementing its own specialized sub-classes:

plotS4ClassGraph(c("DelayedArray", "HDF5Array", "TileDBArray", "VCFArray", 
                   "ScaledMatrix", "ResidualMatrix", "BiocSingular"))
#> Warning in plotS4ClassGraph(c("DelayedArray", "HDF5Array", "TileDBArray", :
#> Duplicated S4Class definitions: Removing these according to input ordering of
#> packages

2.3 Plotting the SummarizedExperiment sub-graph

SummarizedExperiment is another popular package for organizing various types of expression matrices. A large number of packages build upon this functionality:

plotS4ClassGraph(c("SummarizedExperiment", "SingleCellExperiment", 
                   "SpatialExperiment", "clusterExperiment", "InteractionSet", 
                   "MultiAssayExperiment", "GenomicFiles", "DESeq2"))

2.4 Note for developers

If you want to inspect the underlying data for the graphs, you can make the function return just that:

plotS4ClassGraph(c("SummarizedExperiment", "SingleCellExperiment"), plot=FALSE)
#> $nodes
#>                         S4Class              Package Virtual Union Superclasses
#> 1    RangedSummarizedExperiment SummarizedExperiment   FALSE FALSE            5
#> 2                  SimpleAssays SummarizedExperiment   FALSE FALSE            3
#> 3                Assays_OR_NULL SummarizedExperiment    TRUE  TRUE            0
#> 4                   AssaysInEnv SummarizedExperiment   FALSE FALSE            3
#> 5          SummarizedExperiment SummarizedExperiment   FALSE FALSE            4
#> 6       ShallowSimpleListAssays SummarizedExperiment   FALSE FALSE           10
#> 7                        Assays SummarizedExperiment    TRUE FALSE            2
#> 8                   ShallowData SummarizedExperiment   FALSE FALSE            6
#> 9  SummarizedExperimentByColumn SingleCellExperiment   FALSE FALSE            0
#> 10                   DualSubset SingleCellExperiment   FALSE FALSE            0
#> 11           reduced.dim.matrix SingleCellExperiment    TRUE FALSE            7
#> 12        LinearEmbeddingMatrix SingleCellExperiment   FALSE FALSE            1
#> 13         SingleCellExperiment SingleCellExperiment   FALSE FALSE            6
#> 14              RectangularData                 <NA>      NA    NA           NA
#> 15                       Vector                 <NA>      NA    NA           NA
#> 16                  envRefClass                 <NA>      NA    NA           NA
#> 17                       matrix                 <NA>      NA    NA           NA
#> 18                    Annotated                 <NA>      NA    NA           NA
#>    Subclasses
#> 1           0
#> 2           0
#> 3           6
#> 4           0
#> 5           1
#> 6           0
#> 7           3
#> 8           1
#> 9           0
#> 10          0
#> 11          0
#> 12          0
#> 13          0
#> 14         NA
#> 15         NA
#> 16         NA
#> 17         NA
#> 18         NA
#> 
#> $edges
#>                    Superclass                   Subclass
#> 1  RangedSummarizedExperiment       SummarizedExperiment
#> 2                SimpleAssays                     Assays
#> 3                 AssaysInEnv                     Assays
#> 4        SummarizedExperiment            RectangularData
#> 5        SummarizedExperiment                     Vector
#> 6     ShallowSimpleListAssays                ShallowData
#> 7     ShallowSimpleListAssays                     Assays
#> 8                      Assays            RectangularData
#> 9                      Assays             Assays_OR_NULL
#> 10                ShallowData                envRefClass
#> 11         reduced.dim.matrix                     matrix
#> 12      LinearEmbeddingMatrix                  Annotated
#> 13       SingleCellExperiment RangedSummarizedExperiment

Alternatively, use the internal helpers directly; see ?getS4Classes and ?getS4Inheritance.
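Once the edge table is available as a plain data.frame, it can be queried with base R. The sketch below rebuilds a few rows of the $edges output printed above by hand (it does not call the package) and walks them to collect every class that a given class builds on. The helper walk_edges is hypothetical. Note that in the rows printed above, the first column appears to hold the inheriting class and the second column the class it extends (for example, SingleCellExperiment extends RangedSummarizedExperiment); the sketch assumes that orientation.

```r
# A few rows copied by hand from the $edges output above.
edges <- data.frame(
  Superclass = c("SingleCellExperiment", "RangedSummarizedExperiment",
                 "SummarizedExperiment", "SummarizedExperiment"),
  Subclass   = c("RangedSummarizedExperiment", "SummarizedExperiment",
                 "RectangularData", "Vector"),
  stringsAsFactors = FALSE
)

# Hypothetical helper: breadth-first walk from `class_name`, repeatedly
# following first column -> second column until no new classes turn up.
walk_edges <- function(edges, class_name) {
  found <- character(0)
  queue <- class_name
  while (length(queue) > 0) {
    current <- queue[1]
    queue <- queue[-1]
    nxt <- edges$Subclass[edges$Superclass == current]
    nxt <- setdiff(nxt, found)   # skip classes that were already visited
    found <- c(found, nxt)
    queue <- c(queue, nxt)
  }
  found
}

ancestors <- walk_edges(edges, "SingleCellExperiment")
```

With the real output, the same function could be applied to the full edges table, provided the column orientation matches the assumption stated above.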

2.5 Session info

sessionInfo()
#> R version 4.6.0 RC (2026-04-17 r89917)
#> Platform: x86_64-pc-linux-gnu
#> Running under: Ubuntu 24.04.4 LTS
#> 
#> Matrix products: default
#> BLAS:   /home/biocbuild/bbs-3.23-bioc/R/lib/libRblas.so 
#> LAPACK: /usr/lib/x86_64-linux-gnu/lapack/liblapack.so.3.12.0  LAPACK version 3.12.0
#> 
#> locale:
#>  [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C              
#>  [3] LC_TIME=en_GB              LC_COLLATE=C              
#>  [5] LC_MONETARY=en_US.UTF-8    LC_MESSAGES=en_US.UTF-8   
#>  [7] LC_PAPER=en_US.UTF-8       LC_NAME=C                 
#>  [9] LC_ADDRESS=C               LC_TELEPHONE=C            
#> [11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C       
#> 
#> time zone: America/New_York
#> tzcode source: system (glibc)
#> 
#> attached base packages:
#> [1] stats     graphics  grDevices utils     datasets  methods   base     
#> 
#> other attached packages:
#> [1] bigmemory_4.6.4       Biobase_2.71.0        BiocGenerics_0.57.1  
#> [4] generics_0.1.4        S4Cartographer_0.99.3 BiocStyle_2.39.0     
#> 
#> loaded via a namespace (and not attached):
#>   [1] splines_4.6.0               BiocIO_1.21.0              
#>   [3] bitops_1.0-9                tibble_3.3.1               
#>   [5] polyclip_1.10-7             XML_3.99-0.23              
#>   [7] lifecycle_1.0.5             edgeR_4.9.9                
#>   [9] doParallel_1.0.17           lattice_0.22-9             
#>  [11] MASS_7.3-65                 MultiAssayExperiment_1.37.4
#>  [13] GenomicFiles_1.47.0         backports_1.5.1            
#>  [15] magrittr_2.0.5              limma_3.67.3               
#>  [17] sass_0.4.10                 rmarkdown_2.31             
#>  [19] jquerylib_0.1.4             yaml_2.3.12                
#>  [21] otel_0.2.0                  NMF_0.28                   
#>  [23] zinbwave_1.33.0             DBI_1.3.0                  
#>  [25] RColorBrewer_1.1-3          ade4_1.7-24                
#>  [27] ResidualMatrix_1.21.0       abind_1.4-8                
#>  [29] GenomicRanges_1.63.2        purrr_1.2.2                
#>  [31] ggraph_2.2.2                RCurl_1.98-1.18            
#>  [33] VariantAnnotation_1.57.1    tweenr_2.0.3               
#>  [35] IRanges_2.45.0              S4Vectors_0.49.2           
#>  [37] ggrepel_0.9.8               irlba_2.3.7                
#>  [39] genefilter_1.93.0           annotate_1.89.0            
#>  [41] codetools_0.2-20            DelayedArray_0.37.1        
#>  [43] xml2_1.5.2                  ggforce_0.5.0              
#>  [45] tidyselect_1.2.1            locfdr_1.1-8               
#>  [47] RNeXML_2.4.11               UCSC.utils_1.7.1           
#>  [49] farver_2.1.2                ScaledMatrix_1.19.0        
#>  [51] viridis_0.6.5               matrixStats_1.5.0          
#>  [53] stats4_4.6.0                Seqinfo_1.1.0              
#>  [55] GenomicAlignments_1.47.0    jsonlite_2.0.0             
#>  [57] tidygraph_1.3.1             phylobase_0.8.12           
#>  [59] survival_3.8-6              iterators_1.0.14           
#>  [61] TileDBArray_1.21.2          foreach_1.5.2              
#>  [63] tools_4.6.0                 progress_1.2.3             
#>  [65] tiledb_0.33.0               clusterExperiment_2.31.3   
#>  [67] Rcpp_1.1.1-1                glue_1.8.1                 
#>  [69] spdl_0.0.5                  gridExtra_2.3              
#>  [71] SparseArray_1.11.13         BiocBaseUtils_1.13.0       
#>  [73] DESeq2_1.51.7               xfun_0.57                  
#>  [75] MatrixGenerics_1.23.0       ggthemes_5.2.0             
#>  [77] GenomeInfoDb_1.47.2         dplyr_1.2.1                
#>  [79] HDF5Array_1.39.1            withr_3.0.2                
#>  [81] BiocManager_1.30.27         fastmap_1.2.0              
#>  [83] VCFArray_1.27.0             rhdf5filters_1.23.3        
#>  [85] digest_0.6.39               rsvd_1.0.5                 
#>  [87] R6_2.6.1                    colorspace_2.1-2           
#>  [89] dichromat_2.0-0.1           RcppCCTZ_0.2.14            
#>  [91] RSQLite_2.4.6               cigarillo_1.1.0            
#>  [93] h5mread_1.3.3               tidyr_1.3.2                
#>  [95] RcppSpdlog_0.0.28           rtracklayer_1.71.3         
#>  [97] InteractionSet_1.39.0       prettyunits_1.2.0          
#>  [99] graphlayouts_1.2.3          httr_1.4.8                 
#> [101] S4Arrays_1.11.1             pkgconfig_2.0.3            
#> [103] gtable_0.3.6                blob_1.3.0                 
#> [105] registry_0.5-1              S7_0.2.2                   
#> [107] SingleCellExperiment_1.33.2 XVector_0.51.0             
#> [109] htmltools_0.5.9             bookdown_0.46              
#> [111] scales_1.4.0                png_0.1-9                  
#> [113] SpatialExperiment_1.21.0    bigmemory.sri_0.1.8        
#> [115] knitr_1.51                  reshape2_1.4.5             
#> [117] rncl_0.8.9                  rjson_0.2.23               
#> [119] uuid_1.2-2                  checkmate_2.3.4            
#> [121] nlme_3.1-169                curl_7.1.0                 
#> [123] cachem_1.1.0                zoo_1.8-15                 
#> [125] rhdf5_2.55.16               stringr_1.6.0              
#> [127] parallel_4.6.0              softImpute_1.4-3           
#> [129] AnnotationDbi_1.73.1        nanotime_0.3.14            
#> [131] restfulr_0.0.16             pillar_1.11.1              
#> [133] grid_4.6.0                  vctrs_0.7.3                
#> [135] nanoarrow_0.8.0             BiocSingular_1.27.1        
#> [137] beachmat_2.27.5             xtable_1.8-8               
#> [139] cluster_2.1.8.2             evaluate_1.0.5             
#> [141] tinytex_0.59                GenomicFeatures_1.63.2     
#> [143] magick_2.9.1                locfit_1.5-9.12            
#> [145] cli_3.6.6                   compiler_4.6.0             
#> [147] Rsamtools_2.27.2            rngtools_1.5.2             
#> [149] rlang_1.2.0                 crayon_1.5.3               
#> [151] labeling_0.4.3              plyr_1.8.9                 
#> [153] stringi_1.8.7               viridisLite_0.4.3          
#> [155] gridBase_0.4-7              BiocParallel_1.45.0        
#> [157] Biostrings_2.79.5           Matrix_1.7-5               
#> [159] BSgenome_1.79.1             hms_1.1.4                  
#> [161] bit64_4.8.0                 ggplot2_4.0.3              
#> [163] Rhdf5lib_1.99.6             KEGGREST_1.51.1            
#> [165] statmod_1.5.1               SummarizedExperiment_1.41.1
#> [167] kernlab_0.9-33              igraph_2.3.0               
#> [169] memoise_2.0.1               bslib_0.10.0               
#> [171] bit_4.6.0                   ape_5.8-1