1 CENTREprecomputed: An Experiment Data package interface with the precomputed data and example data for CENTRE

CENTRE is a package for Cell-type specific ENhancer Target pREdiction. It follows this workflow:

createPairs() -> computeGenericFeatures() -> computeCellTypeFeatures() -> centreClassification()

The CENTRE::computeGenericFeatures() step computes the genomic distance and retrieves the precomputed features stored in the PrecomputedDataLight.db SQLite database. See vignette("Centre-vignette") for more information.

All of the data in the CENTREprecomputed package can be accessed through ExperimentHub:

library(ExperimentHub, quietly = TRUE)
## 
## Attaching package: 'generics'
## The following objects are masked from 'package:base':
## 
##     as.difftime, as.factor, as.ordered, intersect, is.element, setdiff,
##     setequal, union
## 
## Attaching package: 'BiocGenerics'
## The following objects are masked from 'package:stats':
## 
##     IQR, mad, sd, var, xtabs
## The following objects are masked from 'package:base':
## 
##     Filter, Find, Map, Position, Reduce, anyDuplicated, aperm, append,
##     as.data.frame, basename, cbind, colnames, dirname, do.call,
##     duplicated, eval, evalq, get, grep, grepl, is.unsorted, lapply,
##     mapply, match, mget, order, paste, pmax, pmax.int, pmin, pmin.int,
##     rank, rbind, rownames, sapply, saveRDS, table, tapply, unique,
##     unsplit, which.max, which.min
hub <- ExperimentHub()
eh <- query(hub, "CENTREprecomputed")
eh
## ExperimentHub with 6 records
## # snapshotDate(): 2025-07-17
## # $dataprovider: ENCODE, FANTOM5, big.databio.org and MPI for molecular gene...
## # $species: Homo sapiens
## # $rdataclass: character, SQLiteConnection
## # additional mcols(): taxonomyid, genome, description,
## #   coordinate_1_based, maintainer, rdatadateadded, preparerclass, tags,
## #   rdatapath, sourceurl, sourcetype 
## # retrieve records with, e.g., 'object[["EH9540"]]' 
## 
##            title                                                              
##   EH9540 | Precomputed Fisher combined p-values and crup correlations database
##   EH9541 | H3K4me3 ChIP-seq HeLa-S3 from ENCODE                               
##   EH9542 | H3K4me1 ChIP-seq HeLa-S3 from ENCODE                               
##   EH9543 | H3K27ac ChIP-seq HeLa-S3 from ENCODE                               
##   EH9544 | Control ChIP-seq HeLa-S3 from ENCODE                               
##   EH9545 | RNA-seq gene quantifications HeLa-S3 from ENCODE

Records EH9541 to EH9545 are the example and test data for the CENTRE package. Record EH9540 is the SQLite database; accessing it with eh[["EH9540"]] returns a CENTREprecompDb object.

1.1 The CENTREprecompDb object

The precomputed database is represented in R by an object of class CENTREprecompDb. To load the package and retrieve the object, run:

library(CENTREprecomputed)
centreprecompdb <- eh[["EH9540"]]
## see ?CENTREprecomputed and browseVignettes('CENTREprecomputed') for documentation
## downloading 1 resources
## retrieving 1 resource
## loading from cache

There are three tables inside the database:

  • combinedTestData: -log transformed p-values of the Wilcoxon rank sum tests.
  • crup_cor: correlations of CRUP-EP scores and CRUP-PP scores across cell-types.
  • metadata: metadata for the ExperimentHub.
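
As a rough illustration of what a combined, -log transformed p-value is, the sketch below runs a few Wilcoxon rank sum tests on simulated data and combines them with Fisher's method (the record title mentions Fisher combined p-values). The simulated groups, the number of tests, and the use of the natural logarithm are all assumptions made for this example; this is not the procedure used to build the database.

```r
set.seed(42)

# Hypothetical data: three independent comparisons of two simulated groups
n_tests <- 3
p_values <- vapply(seq_len(n_tests), function(i) {
    group_a <- rnorm(20, mean = 0)
    group_b <- rnorm(20, mean = 1)
    # Two-sided Wilcoxon rank sum test
    wilcox.test(group_a, group_b)$p.value
}, numeric(1))

# Fisher's method: X = -2 * sum(log(p)) follows a chi-squared distribution
# with 2k degrees of freedom under the null hypothesis (k = number of tests)
fisher_stat <- -2 * sum(log(p_values))
combined_p <- pchisq(fisher_stat, df = 2 * length(p_values), lower.tail = FALSE)

# -log transform of the combined p-value (natural log is an assumption here)
neg_log_p <- -log(combined_p)
neg_log_p
```

Larger values of the transformed score correspond to smaller combined p-values.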

For more information, see vignette("Centre-vignette") or the CENTRE publication.

The database can be used as follows:

# List all tables and their columns; the metadata table is not shown
tables(centreprecompdb)
## $combinedTestData
## [1] "pair"           "combined_tests"
## 
## $crup_cor
## [1] "gene_id1" "symbol38" "cor_CRUP" "pair"
# Select all cor_CRUP coefficients of enhancer EH38E3440167. Return the pair
# and cor_CRUP columns
fetch_data(centreprecompdb,
    table = "crup_cor",
    columns = c("pair", "cor_CRUP"),
    entries = "EH38E3440167",
    column_filter = "symbol38"
)
##                            pair     cor_CRUP
## 1  EH38E3440167_ENSG00000000419 -0.006860231
## 2  EH38E3440167_ENSG00000026559 -0.004848918
## 3  EH38E3440167_ENSG00000054793  0.012584842
## 4  EH38E3440167_ENSG00000101096  0.191653996
## 5  EH38E3440167_ENSG00000101115 -0.014031516
## 6  EH38E3440167_ENSG00000124217 -0.051373777
## 7  EH38E3440167_ENSG00000215444 -0.030953579
## 8  EH38E3440167_ENSG00000227964 -0.080207418
## 9  EH38E3440167_ENSG00000228820 -0.055395912
## 10 EH38E3440167_ENSG00000232358 -0.128205852
## 11 EH38E3440167_ENSG00000252467 -0.015644308
## 12 EH38E3440167_ENSG00000252654 -0.143228865
## 13 EH38E3440167_ENSG00000266761  0.186533710
## 14 EH38E3440167_ENSG00000279835 -0.050432755
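
The fetch_data() call above filters rows on one column and returns a subset of columns. A minimal sketch of the same idea in plain SQL, run with DBI and RSQLite against an in-memory stand-in for the crup_cor table, is given below. The mock table reuses three rows from the output above; the gene_id1 values are inferred from the pair column, the "OTHER_ENHANCER" row is made up, and the assumption that fetch_data() behaves like a simple SELECT ... WHERE query is not confirmed by the package documentation.

```r
library(DBI)

# In-memory stand-in for the crup_cor table (not the real database)
con <- dbConnect(RSQLite::SQLite(), ":memory:")
crup_cor <- data.frame(
    gene_id1 = c("ENSG00000000419", "ENSG00000026559",
                 "ENSG00000054793", "ENSG00000000419"),
    # The last row uses a made-up enhancer ID so the filter has something to drop
    symbol38 = c(rep("EH38E3440167", 3), "OTHER_ENHANCER"),
    cor_CRUP = c(-0.006860231, -0.004848918, 0.012584842, 0),
    stringsAsFactors = FALSE
)
crup_cor$pair <- paste(crup_cor$symbol38, crup_cor$gene_id1, sep = "_")
dbWriteTable(con, "crup_cor", crup_cor)

# Parameterised query: keep the rows for one enhancer, return two columns
res <- dbGetQuery(
    con,
    "SELECT pair, cor_CRUP FROM crup_cor WHERE symbol38 = ?",
    params = list("EH38E3440167")
)
dbDisconnect(con)

# Split the pair identifier into its enhancer and gene parts
ids <- do.call(rbind, strsplit(res$pair, "_", fixed = TRUE))
res$enhancer <- ids[, 1]
res$gene <- ids[, 2]
res
```

Splitting the pair column is handy when the results need to be joined with gene-level or enhancer-level annotation.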
sessionInfo()
## R version 4.5.1 (2025-06-13)
## Platform: x86_64-pc-linux-gnu
## Running under: Ubuntu 24.04.2 LTS
## 
## Matrix products: default
## BLAS:   /home/biocbuild/bbs-3.22-bioc/R/lib/libRblas.so 
## LAPACK: /usr/lib/x86_64-linux-gnu/lapack/liblapack.so.3.12.0  LAPACK version 3.12.0
## 
## locale:
##  [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C              
##  [3] LC_TIME=en_GB              LC_COLLATE=C              
##  [5] LC_MONETARY=en_US.UTF-8    LC_MESSAGES=en_US.UTF-8   
##  [7] LC_PAPER=en_US.UTF-8       LC_NAME=C                 
##  [9] LC_ADDRESS=C               LC_TELEPHONE=C            
## [11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C       
## 
## time zone: America/New_York
## tzcode source: system (glibc)
## 
## attached base packages:
## [1] stats     graphics  grDevices utils     datasets  methods   base     
## 
## other attached packages:
## [1] CENTREprecomputed_0.99.2 ExperimentHub_2.99.5     AnnotationHub_3.99.6    
## [4] BiocFileCache_2.99.5     dbplyr_2.5.0             BiocGenerics_0.55.1     
## [7] generics_0.1.4           BiocStyle_2.37.1        
## 
## loaded via a namespace (and not attached):
##  [1] rappdirs_0.3.3       sass_0.4.10          BiocVersion_3.22.0  
##  [4] RSQLite_2.4.2        digest_0.6.37        magrittr_2.0.3      
##  [7] evaluate_1.0.4       bookdown_0.43        fastmap_1.2.0       
## [10] blob_1.2.4           jsonlite_2.0.0       AnnotationDbi_1.71.1
## [13] DBI_1.2.3            BiocManager_1.30.26  httr_1.4.7          
## [16] purrr_1.1.0          Biostrings_2.77.2    httr2_1.2.1         
## [19] jquerylib_0.1.4      cli_3.6.5            crayon_1.5.3        
## [22] rlang_1.1.6          XVector_0.49.0       Biobase_2.69.0      
## [25] bit64_4.6.0-1        withr_3.0.2          cachem_1.1.0        
## [28] yaml_2.3.10          tools_4.5.1          memoise_2.0.1       
## [31] dplyr_1.1.4          filelock_1.0.3       curl_6.4.0          
## [34] vctrs_0.6.5          R6_2.6.1             png_0.1-8           
## [37] stats4_4.5.1         lifecycle_1.0.4      Seqinfo_0.99.2      
## [40] KEGGREST_1.49.1      S4Vectors_0.47.0     IRanges_2.43.0      
## [43] bit_4.6.0            pkgconfig_2.0.3      pillar_1.11.0       
## [46] bslib_0.9.0          glue_1.8.0           xfun_0.52           
## [49] tibble_3.3.0         tidyselect_1.2.1     knitr_1.50          
## [52] htmltools_0.5.8.1    rmarkdown_2.29       compiler_4.5.1