--- title: > scaeData User Guide author: - name: Ahmad Al Ajami affiliation: - Goethe University, University Hospital Frankfurt, Neurological Institute (Edinger Institute), Frankfurt/Main - University Cancer Center, Frankfurt/Main - Frankfurt Cancer Institute, Frankfurt/Main email: alajami@med.uni-frankfurt.de - name: Jonas Schuck affiliation: - Goethe University, University Hospital Frankfurt, Neurological Institute (Edinger Institute), Frankfurt/Main - University Cancer Center, Frankfurt/Main - Frankfurt Cancer Institute, Frankfurt/Main email: schuck@med.uni-frankfurt.de - name: Federico Marini affiliation: - Institute of Medical Biostatistics, Epidemiology and Informatics (IMBEI), Mainz - Research Center for Immunotherapy (FZI), Mainz email: marinif@uni-mainz.de - name: Katharina Imkeller affiliation: - Goethe University, University Hospital Frankfurt, Neurological Institute (Edinger Institute), Frankfurt/Main - University Cancer Center, Frankfurt/Main - Frankfurt Cancer Institute, Frankfurt/Main email: imkeller@med.uni-frankfurt.de date: "`r BiocStyle::doc_date()`" package: "`r BiocStyle::pkg_ver('scaeData')`" output: BiocStyle::html_document: toc: true toc_float: true number_sections: true vignette: > %\VignetteEngine{knitr::rmarkdown} %\VignetteIndexEntry{scaeData User Guide} %\VignetteEncoding[utf8]{inputenc} %\VignettePackage{scaeData} %\VignetteKeywords{ExperimentHub, ExperimentData, Homo_sapiens_Data, SingleCellData} --- # scaeData `scaeData` is a complementary package to the Bioconductor package `SingleCellAlleleExperiment`. It contains three datasets to be used when testing functions in `SingleCellAlleleExperiment`. These are: - 5k PBMCs of a healthy donor, 3' v3 chemistry - 10k PBMCs of a healthy donor, 3' v3 chemistry - 20k PBMCs of a healthy donor, 3' v3 chemistry The raw FASTQs for all three datasets were sourced from publicly accessible datasets provided by [10x Genomics](https://www.10xgenomics.com/datasets). After downloading the raw data, the [scIGD](https://github.com/AGImkeller/scIGD) Snakemake workflow was utilized to perform HLA allele-typing processes and generate allele-specific quantification from scRNA-seq data using donor-specific references. # Quick Start ## Installation From Bioconductor: ```{r, eval=FALSE} if (!requireNamespace("BiocManager", quietly=TRUE)) install.packages("BiocManager") BiocManager::install("scaeData") ``` Alternatively, a development version is available on GitHub and can be installed via: ```{r, eval=FALSE} if (!require("devtools", quietly = TRUE)) install.packages("devtools") devtools::install_github("AGImkeller/scaeData", build_vignettes = TRUE) ``` # Usage The datasets within `scaeData` are accessible using the `scaeDataGet()` function: ```{r libraries, include=TRUE} library("scaeData") ``` ```{r, eval = FALSE} pbmc_5k <- scaeDataGet("pbmc_5k") pbmc_10k <- scaeDataGet("pbmc_10k") ``` For example, we can view `pbmc_20k`: ```{r} pbmc_20k <- scaeDataGet("pbmc_20k") pbmc_20k ``` ```{r} cells.dir <- file.path(pbmc_20k$dir, pbmc_20k$barcodes) features.dir <- file.path(pbmc_20k$dir, pbmc_20k$features) mat.dir <- file.path(pbmc_20k$dir, pbmc_20k$matrix) cells <- utils::read.csv(cells.dir, sep = "", header = FALSE) features <- utils::read.delim(features.dir, header = FALSE) mat <- Matrix::readMM(mat.dir) rownames(mat) <- cells$V1 colnames(mat) <- features$V1 head(mat) ``` A `SingleCellAlleleExperiment` object, `scae` for short, can be generated using the `read_allele_counts()` function retrieved from the `SingleCellAlleleExperiment` package. A lookup table corresponding to each dataset, facilitating the creation of relevant additional data layers during object generation, can be accessed from the package's extdata: ```{r} lookup <- read.csv(system.file("extdata", "pbmc_20k_lookup_table.csv", package="scaeData")) library("SingleCellAlleleExperiment") scae_20k <- read_allele_counts(pbmc_20k$dir, sample_names = "example_data", filter_mode = "no", lookup_file = lookup, barcode_file = pbmc_20k$barcodes, gene_file = pbmc_20k$features, matrix_file = pbmc_20k$matrix, verbose = TRUE) scae_20k ``` Please refer to the vignette and documentation of `SingleCellAlleleExperiment` to further work with this kind of data container. # Session info {-} ```{r sessionInfo} sessionInfo() ```