data and example data for CENTRE
CENTRE is a package for Cell-type specific ENhancer Target pREdiction that follows this workflow:
createPairs() -> computeGenericFeatures() -> computeCellTypeFeatures() -> centreClassification()
The step CENTRE::computeGenericFeatures() computes the genomic distance and retrieves the precomputed features stored in the PrecomputedDataLight.db SQLite database. See vignette("Centre-vignette") for more information.
All of the data in the CENTREprecomputed package can be accessed through ExperimentHub:
library(ExperimentHub, quietly = TRUE)
##
## Attaching package: 'generics'
## The following objects are masked from 'package:base':
##
## as.difftime, as.factor, as.ordered, intersect, is.element, setdiff,
## setequal, union
##
## Attaching package: 'BiocGenerics'
## The following objects are masked from 'package:stats':
##
## IQR, mad, sd, var, xtabs
## The following objects are masked from 'package:base':
##
## Filter, Find, Map, Position, Reduce, anyDuplicated, aperm, append,
## as.data.frame, basename, cbind, colnames, dirname, do.call,
## duplicated, eval, evalq, get, grep, grepl, is.unsorted, lapply,
## mapply, match, mget, order, paste, pmax, pmax.int, pmin, pmin.int,
## rank, rbind, rownames, sapply, saveRDS, table, tapply, unique,
## unsplit, which.max, which.min
hub <- ExperimentHub()
eh <- query(hub, "CENTREprecomputed")
eh
## ExperimentHub with 6 records
## # snapshotDate(): 2025-07-17
## # $dataprovider: ENCODE, FANTOM5, big.databio.org and MPI for molecular gene...
## # $species: Homo sapiens
## # $rdataclass: character, SQLiteConnection
## # additional mcols(): taxonomyid, genome, description,
## # coordinate_1_based, maintainer, rdatadateadded, preparerclass, tags,
## # rdatapath, sourceurl, sourcetype
## # retrieve records with, e.g., 'object[["EH9540"]]'
##
## title
## EH9540 | Precomputed Fisher combined p-values and crup correlations database
## EH9541 | H3K4me3 ChIP-seq HeLa-S3 from ENCODE
## EH9542 | H3K4me1 ChIP-seq HeLa-S3 from ENCODE
## EH9543 | H3K27ac ChIP-seq HeLa-S3 from ENCODE
## EH9544 | Control ChIP-seq HeLa-S3 from ENCODE
## EH9545 | RNA-seq gene quantifications HeLa-S3 from ENCODE
Records EH9541 to EH9545 are the example and test data for the CENTRE package.
Record EH9540 is the SQLite database. Accessing it with eh[["EH9540"]] returns a CENTREprecompDb object, the R class that represents the precomputed database.
To load the package and retrieve this object, run:
library(CENTREprecomputed)
centreprecompdb <- eh[["EH9540"]]
## see ?CENTREprecomputed and browseVignettes('CENTREprecomputed') for documentation
## downloading 1 resources
## retrieving 1 resource
## loading from cache
There are three tables inside the database:
combinedTestData: -log transformed p-values of the Wilcoxon rank sum tests.
crup_cor: correlations of CRUP-EP scores and CRUP-PP scores across cell types.
metadata: metadata for the ExperimentHub.
For more information, check vignette("Centre-vignette") or the CENTRE publication.
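To make the table layout more concrete, the sketch below builds a toy in-memory SQLite table with the column names that tables() reports for crup_cor and runs the kind of filtered lookup one might expect against it. This is an illustration only, not the package's implementation: that gene_id1 holds gene IDs is an assumption, the first two rows reuse values shown in this vignette, and the third row (enhancer EH38E0000001 and its value) is hypothetical.

```r
library(DBI)

# Toy in-memory database; nothing is downloaded from ExperimentHub.
con <- dbConnect(RSQLite::SQLite(), ":memory:")

toy_crup_cor <- data.frame(
  gene_id1 = c("ENSG00000000419", "ENSG00000026559", "ENSG00000000419"),
  symbol38 = c("EH38E3440167", "EH38E3440167", "EH38E0000001"),  # last ID is hypothetical
  cor_CRUP = c(-0.006860231, -0.004848918, 0.5),                 # last value is made up
  stringsAsFactors = FALSE
)
# Assumption: pair is the enhancer ID and the gene ID joined by "_"
toy_crup_cor$pair <- paste(toy_crup_cor$symbol38, toy_crup_cor$gene_id1, sep = "_")
dbWriteTable(con, "crup_cor", toy_crup_cor)

# Parameterised query: keep only the rows for one enhancer
res <- dbGetQuery(
  con,
  "SELECT pair, cor_CRUP FROM crup_cor WHERE symbol38 = ? ORDER BY pair",
  params = list("EH38E3440167")
)
dbDisconnect(con)

print(res)
```

Working on a toy copy like this is a cheap way to check a query before running it against the full database.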
The database can be used as follows:
# get all tables and their columns, won't show metadata table
tables(centreprecompdb)
## $combinedTestData
## [1] "pair" "combined_tests"
##
## $crup_cor
## [1] "gene_id1" "symbol38" "cor_CRUP" "pair"
# Select all cor_CRUP coefficients of enhancer EH38E3440167. Return the pair
# and cor_CRUP columns
fetch_data(centreprecompdb,
    table = "crup_cor",
    columns = c("pair", "cor_CRUP"),
    entries = "EH38E3440167",
    column_filter = "symbol38"
)
## pair cor_CRUP
## 1 EH38E3440167_ENSG00000000419 -0.006860231
## 2 EH38E3440167_ENSG00000026559 -0.004848918
## 3 EH38E3440167_ENSG00000054793 0.012584842
## 4 EH38E3440167_ENSG00000101096 0.191653996
## 5 EH38E3440167_ENSG00000101115 -0.014031516
## 6 EH38E3440167_ENSG00000124217 -0.051373777
## 7 EH38E3440167_ENSG00000215444 -0.030953579
## 8 EH38E3440167_ENSG00000227964 -0.080207418
## 9 EH38E3440167_ENSG00000228820 -0.055395912
## 10 EH38E3440167_ENSG00000232358 -0.128205852
## 11 EH38E3440167_ENSG00000252467 -0.015644308
## 12 EH38E3440167_ENSG00000252654 -0.143228865
## 13 EH38E3440167_ENSG00000266761 0.186533710
## 14 EH38E3440167_ENSG00000279835 -0.050432755
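The result is an ordinary data frame, so it can be post-processed with base R. The sketch below copies four rows from the output above, splits each pair into its enhancer and gene IDs, and ranks the genes by correlation. The names res, ids and top and the 0.1 cutoff are illustrative choices, not part of the package.

```r
# Four rows copied from the fetch_data() output shown above
res <- data.frame(
  pair = c(
    "EH38E3440167_ENSG00000000419",
    "EH38E3440167_ENSG00000101096",
    "EH38E3440167_ENSG00000252654",
    "EH38E3440167_ENSG00000266761"
  ),
  cor_CRUP = c(-0.006860231, 0.191653996, -0.143228865, 0.186533710),
  stringsAsFactors = FALSE
)

# Each pair has the form "<enhancer ID>_<gene ID>"; split on the underscore
ids <- do.call(rbind, strsplit(res$pair, "_", fixed = TRUE))
res$enhancer_id <- ids[, 1]
res$gene_id <- ids[, 2]

# Sort by correlation, highest first
res <- res[order(res$cor_CRUP, decreasing = TRUE), ]

# Keep genes above an arbitrary, illustrative cutoff
top <- res[res$cor_CRUP > 0.1, c("gene_id", "cor_CRUP")]
print(top)
```

The same split also gives a gene ID column that can be joined to other gene-level tables.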
sessionInfo()
## R version 4.5.1 (2025-06-13)
## Platform: x86_64-pc-linux-gnu
## Running under: Ubuntu 24.04.2 LTS
##
## Matrix products: default
## BLAS: /home/biocbuild/bbs-3.22-bioc/R/lib/libRblas.so
## LAPACK: /usr/lib/x86_64-linux-gnu/lapack/liblapack.so.3.12.0 LAPACK version 3.12.0
##
## locale:
## [1] LC_CTYPE=en_US.UTF-8 LC_NUMERIC=C
## [3] LC_TIME=en_GB LC_COLLATE=C
## [5] LC_MONETARY=en_US.UTF-8 LC_MESSAGES=en_US.UTF-8
## [7] LC_PAPER=en_US.UTF-8 LC_NAME=C
## [9] LC_ADDRESS=C LC_TELEPHONE=C
## [11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C
##
## time zone: America/New_York
## tzcode source: system (glibc)
##
## attached base packages:
## [1] stats graphics grDevices utils datasets methods base
##
## other attached packages:
## [1] CENTREprecomputed_0.99.2 ExperimentHub_2.99.5 AnnotationHub_3.99.6
## [4] BiocFileCache_2.99.5 dbplyr_2.5.0 BiocGenerics_0.55.1
## [7] generics_0.1.4 BiocStyle_2.37.1
##
## loaded via a namespace (and not attached):
## [1] rappdirs_0.3.3 sass_0.4.10 BiocVersion_3.22.0
## [4] RSQLite_2.4.2 digest_0.6.37 magrittr_2.0.3
## [7] evaluate_1.0.4 bookdown_0.43 fastmap_1.2.0
## [10] blob_1.2.4 jsonlite_2.0.0 AnnotationDbi_1.71.1
## [13] DBI_1.2.3 BiocManager_1.30.26 httr_1.4.7
## [16] purrr_1.1.0 Biostrings_2.77.2 httr2_1.2.1
## [19] jquerylib_0.1.4 cli_3.6.5 crayon_1.5.3
## [22] rlang_1.1.6 XVector_0.49.0 Biobase_2.69.0
## [25] bit64_4.6.0-1 withr_3.0.2 cachem_1.1.0
## [28] yaml_2.3.10 tools_4.5.1 memoise_2.0.1
## [31] dplyr_1.1.4 filelock_1.0.3 curl_6.4.0
## [34] vctrs_0.6.5 R6_2.6.1 png_0.1-8
## [37] stats4_4.5.1 lifecycle_1.0.4 Seqinfo_0.99.2
## [40] KEGGREST_1.49.1 S4Vectors_0.47.0 IRanges_2.43.0
## [43] bit_4.6.0 pkgconfig_2.0.3 pillar_1.11.0
## [46] bslib_0.9.0 glue_1.8.0 xfun_0.52
## [49] tibble_3.3.0 tidyselect_1.2.1 knitr_1.50
## [52] htmltools_0.5.8.1 rmarkdown_2.29 compiler_4.5.1