In a pair of papers from the J. Craig Venter Institute, Bakken et al. and Aevermann et al. discuss the ontological implications of single-cell transcriptomics. They introduce a process of defining cell types by enumerating “necessary and sufficient marker genes”.
This vignette shows how the Cell Ontology, Relation Ontology, and Protein Ontology can be connected to assess formal relationships between declared cell types and plasma membrane features that can play a role in cell type definition.
Connect to the Relation Ontology and search for terms whose labels mention “plasma membrane”. The subject column of the result holds each term's CURIE.
library(ontoProc2)
library(DT)
library(dplyr)  # tbl(), filter(), arrange() used in the queries below
ro <- semsql_connect(ontology = "ro")
search_labels(ro, "plasma membrane")
## subject label
## 1 RO:0002104 has plasma membrane part
## 2 RO:0015015 has high plasma membrane amount
## 3 RO:0015016 has low plasma membrane amount
We have a helper resource for finding exact Cell Ontology names of cell types.
data("tag2cn", package = "ontoProc2")
cd8reg <- grep("CD8-positive.*regulatory", tag2cn, value = TRUE)
cd8reg
## CL:0000795
## "CD8-positive, alpha-beta regulatory T cell"
## CL:0000919
## "CD8-positive, CD25-positive, alpha-beta regulatory T cell"
## CL:0000920
## "CD8-positive, CD28-negative, alpha-beta regulatory T cell"
## CL:0001041
## "CD8-positive, CXCR3-positive, alpha-beta regulatory T cell"
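The names of this vector are the CURIEs, so they can be extracted with names() for use in later queries. A minimal sketch follows; it uses a literal copy of the output above rather than the packaged tag2cn data, and the variable name cd8_curies is introduced here only for illustration.

```r
# Literal copy of the cd8reg result printed above, so the example runs
# without the ontoProc2 package installed
cd8reg <- c(
  "CL:0000795" = "CD8-positive, alpha-beta regulatory T cell",
  "CL:0000919" = "CD8-positive, CD25-positive, alpha-beta regulatory T cell",
  "CL:0000920" = "CD8-positive, CD28-negative, alpha-beta regulatory T cell",
  "CL:0001041" = "CD8-positive, CXCR3-positive, alpha-beta regulatory T cell"
)

# The CURIEs are the names of the vector; the labels are its values
cd8_curies <- names(cd8reg)

# Sanity check: every identifier looks like a Cell Ontology CURIE
stopifnot(all(grepl("^CL:[0-9]{7}$", cd8_curies)))
cd8_curies
```

Working with the CURIEs rather than the labels avoids ambiguity when labels are revised between ontology releases.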
With these cell type identifiers, we can search for the proteins identified as “part of plasma membrane”. We use the CURIEs rather than the labels for precision. This step is blocked until a subset of PR data is available for illustration, because the full PR downloads are too slow.
We then pick two proteins and look for associated cell types. This step is blocked for the same reason.
The “entailed edge” table of the Semantic SQL representation of Cell Ontology includes all assertions that are derivable from base axioms of the ontology.
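The code that produced the listing below is not shown in this rendering. The output is consistent with calls along the following lines; this is a sketch, assuming that semsql_connect() accepts "cl" as an ontology name by analogy with the "ro" call above, and that the connection object exposes its database handle as cl@con, as in the query later in this vignette. The name ee is introduced here only for illustration.

```r
library(ontoProc2)
library(dplyr)

# Assumption: "cl" is a valid ontology name, by analogy with "ro" above
cl <- semsql_connect(ontology = "cl")
cl

# Lazy reference to the entailed_edge table; rows are fetched only on demand
ee <- tbl(cl@con, "entailed_edge")
ee

# Count the rows inside the database instead of pulling them into R
ee |> count()
```

Keeping the table lazy matters here, since the row count reported below is close to three million.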
## <SemsqlConn> prefix: CL | labeled terms: 22,298
## # Source: table<`entailed_edge`> [?? x 3]
## # Database: sqlite 3.51.2 [/home/pkgbuild/.cache/R/BiocFileCache/17102851e4bb87_cl.db]
## subject predicate object
## <chr> <chr> <chr>
## 1 UBERON:0001772 rdfs:subClassOf UBERON:0001772
## 2 UBERON:0019190 rdfs:subClassOf UBERON:0019190
## 3 GO:0051034 rdfs:subClassOf GO:0051033
## 4 GO:0051033 rdfs:subClassOf GO:0051033
## 5 GO:1904522 rdfs:subClassOf GO:1904522
## 6 UBERON:0018685 rdfs:subClassOf UBERON:0018685
## 7 GO:0106027 rdfs:subClassOf GO:0106027
## 8 GO:0050679 rdfs:subClassOf GO:0050679
## 9 GO:1901647 rdfs:subClassOf GO:0050679
## 10 GO:1904692 rdfs:subClassOf GO:0050679
## # ℹ more rows
## # Source: SQL [?? x 1]
## # Database: sqlite 3.51.2 [/home/pkgbuild/.cache/R/BiocFileCache/17102851e4bb87_cl.db]
## n
## <int>
## 1 2966269
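Several rows in the listing have the same subject and object, and others link a term to a superclass. That is what one would expect if the entailed edges include the reflexive and transitive consequences of the asserted rdfs:subClassOf axioms. The toy example below illustrates the idea in base R. The identifiers X:A, X:B, and X:C are made up, and this is not a description of how the Semantic SQL build computes the table.

```r
# Hypothetical terms with two asserted axioms: X:A is_a X:B, X:B is_a X:C
nodes <- c("X:A", "X:B", "X:C")
adj <- matrix(FALSE, nrow = 3, ncol = 3, dimnames = list(nodes, nodes))
adj["X:A", "X:B"] <- TRUE
adj["X:B", "X:C"] <- TRUE

# Reflexive step: every term is a subclass of itself
diag(adj) <- TRUE

# Transitive step (Warshall's algorithm): if i reaches k and k reaches j,
# then i reaches j
for (k in nodes) {
  for (i in nodes) {
    for (j in nodes) {
      adj[i, j] <- adj[i, j] || (adj[i, k] && adj[k, j])
    }
  }
}

# Flatten the matrix into a subject / predicate / object table
idx <- which(adj, arr.ind = TRUE)
entailed <- data.frame(
  subject   = nodes[idx[, "row"]],
  predicate = "rdfs:subClassOf",
  object    = nodes[idx[, "col"]]
)
entailed
```

Two asserted axioms yield six entailed rows: three reflexive, the two asserted, and the derived edge from X:A to X:C. Growth of this kind helps explain why an entailed table can be far larger than the number of labeled terms.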
We can look for statements that have “RO:0002104” (has plasma membrane part) as predicate, keeping only those whose object is a Protein Ontology term:
tbl(cl@con, "entailed_edge") |>
  filter(predicate == "RO:0002104") |>
  as.data.frame() |>
  filter(grepl("PR:", object)) |>
  arrange(subject) |>
  datatable()
Disconnect databases.
## Disconnected from '17102851e4bb87_cl.db'
## Disconnected from '1710282ff82f0c_ro.db'
## R version 4.6.0 RC (2026-04-17 r89917)
## Platform: x86_64-pc-linux-gnu
## Running under: Ubuntu 24.04.4 LTS
##
## Matrix products: default
## BLAS: /home/biocbuild/bbs-3.23-bioc/R/lib/libRblas.so
## LAPACK: /usr/lib/x86_64-linux-gnu/lapack/liblapack.so.3.12.0 LAPACK version 3.12.0
##
## locale:
## [1] LC_CTYPE=en_US.UTF-8 LC_NUMERIC=C
## [3] LC_TIME=en_GB LC_COLLATE=C
## [5] LC_MONETARY=en_US.UTF-8 LC_MESSAGES=en_US.UTF-8
## [7] LC_PAPER=en_US.UTF-8 LC_NAME=C
## [9] LC_ADDRESS=C LC_TELEPHONE=C
## [11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C
##
## time zone: America/New_York
## tzcode source: system (glibc)
##
## attached base packages:
## [1] stats graphics grDevices utils datasets methods base
##
## other attached packages:
## [1] dplyr_1.2.1 DT_0.34.0 ontoProc2_0.99.17 BiocStyle_2.40.0
##
## loaded via a namespace (and not attached):
## [1] utf8_1.2.6 rappdirs_0.3.4 sass_0.4.10
## [4] generics_0.1.4 RSQLite_2.4.6 digest_0.6.39
## [7] magrittr_2.0.5 evaluate_1.0.5 grid_4.6.0
## [10] bookdown_0.46 fastmap_1.2.0 blob_1.3.0
## [13] R.oo_1.27.1 jsonlite_2.0.0 R.utils_2.13.0
## [16] ontologyIndex_2.12 ontologyPlot_1.7 graph_1.90.0
## [19] DBI_1.3.0 BiocManager_1.30.27 purrr_1.2.2
## [22] crosstalk_1.2.2 Rgraphviz_2.56.0 httr2_1.2.2
## [25] jquerylib_0.1.4 paintmap_1.0 cli_3.6.6
## [28] rlang_1.2.0 dbplyr_2.5.2 R.methodsS3_1.8.2
## [31] bit64_4.8.0 withr_3.0.2 cachem_1.1.0
## [34] yaml_2.3.12 otel_0.2.0 tools_4.6.0
## [37] memoise_2.0.1 filelock_1.0.3 BiocGenerics_0.58.0
## [40] curl_7.1.0 vctrs_0.7.3 R6_2.6.1
## [43] stats4_4.6.0 BiocFileCache_3.2.0 lifecycle_1.0.5
## [46] htmlwidgets_1.6.4 bit_4.6.0 pkgconfig_2.0.3
## [49] pillar_1.11.1 bslib_0.10.0 glue_1.8.1
## [52] xfun_0.57 tibble_3.3.1 tidyselect_1.2.1
## [55] knitr_1.51 htmltools_0.5.9 rmarkdown_2.31
## [58] compiler_4.6.0 S7_0.2.2