The package prepares input from scATAC-seq data and adapts it for copy number variation (CNV) profiling with the InferCNV toolkit. It also provides various parameters to control the analysis (e.g. external reference, metacell formation, bin size) and custom plot visualizations.
atacInferCnv 0.99.7
By default, launching atacInferCnv requires two inputs: a raw signal matrix and a cell annotation file.
To demonstrate how to apply atacInferCnv to scATAC-seq data, we prepared an example test set from the snMultiomics-seq data of a medulloblastoma Group 3/4 MYCN tumor case.
By default, the package takes two file paths as input: one to the raw signal matrix in text format and one to the cell annotation. The cell annotation is required to distinguish tumor cells from normal cells, which serve as the reference control. The annotation is typically a tab-delimited table with cell IDs as row names and a first line that contains only the column names.
Other data formats are also supported for the input matrix, as described further below, but here we start with the paths to the standard matrix and annotation in text format. The launch also requires the path and name of the result folder. We use a temporary folder for this:
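As a minimal sketch of the annotation layout described above, the following base R snippet writes and re-reads a small table. The cell barcodes and the "Tumor" label are hypothetical; the column name cnvBlock and the "Normal" label mirror the values used in the example call of this vignette.

```r
# Toy annotation: cell IDs as row names, one column with the group label.
# Barcodes and the "Tumor" label are made up for illustration.
ann <- data.frame(
    cnvBlock = c("Tumor", "Tumor", "Normal"),
    row.names = c("AAACGAAAGT-1", "AAACGAACAG-1", "AAACTCGGTC-1")
)

# write.table() with default row/column names produces a header line that
# holds only the column names, followed by "cell ID <tab> label" rows.
annFile <- tempfile(fileext = ".txt")
write.table(ann, annFile, sep = "\t", quote = FALSE)
readLines(annFile)

# read.table() detects the shorter header and uses the first field of each
# row as the row name.
annBack <- read.table(annFile, header = TRUE, sep = "\t")
annBack
```

Because the header has one field fewer than the data rows, the cell IDs come back as row names without any extra arguments.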
library(atacInferCnv)
inPath <- system.file("extdata", "MB183_ATAC_subset.tsv.gz",
                      package = "atacInferCnv")
sAnn <- system.file("extdata", "MB183_ATAC_subset.CNV_blocks_ann.txt",
                    package = "atacInferCnv")
resPath = tempfile()
The atacInferCnv analysis consists of two main steps: 1) process the input data and adjust it for CNV calling, and 2) launch InferCNV on this custom-generated input.
The first step is a function that processes the input data to prepare it for InferCNV. It has multiple options, but it requires the name of the annotation column that distinguishes tumor from normal cells, as well as the name of the normal cell type to use as a reference when normal cells are present in the input dataset. This command processes the scATAC-seq data with the Signac R package and prepares the input for InferCNV:
prepareAtacInferCnvInput(inPath, sAnn, resPath,
                         targColumn = "cnvBlock",
                         ctrlGrp = "Normal")
## Modularity Optimizer version 1.3.0 by Ludo Waltman and Nees Jan van Eck
##
## Number of nodes: 300
## Number of edges: 11598
##
## Running smart local moving algorithm...
## Maximum modularity in 10 random starts: 0.5614
## Number of communities: 4
## Elapsed time: 0 seconds
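Before running the preparation step on your own data, it can help to confirm that the annotation contains the target column and the reference group. The sketch below uses a hypothetical helper, checkAnnotation(), which is not part of atacInferCnv, on a toy table with made-up cell IDs. A missing reference group only triggers a warning, since normal cells are not always present in the input dataset.

```r
# Hypothetical helper (not part of atacInferCnv): check the annotation
# before passing targColumn and ctrlGrp to the preparation step.
checkAnnotation <- function(ann, targColumn, ctrlGrp) {
    if (!targColumn %in% colnames(ann)) {
        stop("Column '", targColumn, "' not found in the annotation")
    }
    counts <- table(ann[[targColumn]])
    if (!ctrlGrp %in% names(counts)) {
        warning("No cells labelled '", ctrlGrp, "' in column '",
                targColumn, "'")
    }
    counts
}

# Toy annotation with made-up cell IDs and labels
toyAnn <- data.frame(
    cnvBlock = c("Tumor", "Tumor", "Normal"),
    row.names = c("cell1", "cell2", "cell3")
)

groupCounts <- checkAnnotation(toyAnn, targColumn = "cnvBlock",
                               ctrlGrp = "Normal")
groupCounts
```

The returned table gives the number of cells per group, which is a quick way to see whether the reference group is large enough to be useful.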
The results are saved in a specific format. The main output is written to the result folder:
list.files(resPath)
## [1] "infercnv_config.yml" "sample_UMAP.pdf"
## [3] "sample_cnv_ann.txt" "sample_cnv_ref.txt"
## [5] "sample_obj.RDS" "sample_raw_counts.txt.gz"
It includes the per-cell signal matrix (sample_raw_counts.txt.gz), the cell annotation (sample_cnv_ann.txt), the genomic regions of the peaks (sample_cnv_ref.txt) and the configuration for InferCNV (infercnv_config.yml). Additional output includes a Signac/Seurat RDS object (sample_obj.RDS) and a UMAP visualization of the input data based on the annotation (sample_UMAP.pdf) for visual inspection.
The second and final step is a wrapper function that launches InferCNV. It uses the configuration generated in the result path to customize the input:
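When the two steps run in separate sessions or scripts, a quick check that the prepared files are in place can save a failed launch. The sketch below uses a hypothetical helper, missingPreparedFiles(), which is not part of atacInferCnv. The file names are copied from the listing above, and it is an assumption that the "sample" prefix stays the same for other inputs. The demonstration runs on a placeholder folder with empty files rather than on real results.

```r
# File names taken from the listing above; the "sample" prefix is assumed
# to be the same for other runs.
expectedFiles <- c("infercnv_config.yml", "sample_cnv_ann.txt",
                   "sample_cnv_ref.txt", "sample_raw_counts.txt.gz")

# Hypothetical helper: return the expected files absent from a folder
missingPreparedFiles <- function(path, expected = expectedFiles) {
    setdiff(expected, list.files(path))
}

# Placeholder folder with three of the four files, for illustration only
demoDir <- tempfile("prepared_")
dir.create(demoDir)
invisible(file.create(file.path(demoDir, expectedFiles[1:3])))

missingFiles <- missingPreparedFiles(demoDir)
missingFiles
```

An empty result means all four files used for the InferCNV launch are present; with a real run, pass the result path instead of the placeholder folder.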
iObj <- runAtacInferCnv(resPath, returnObj = TRUE)
By default the function returns no output, but setting the argument returnObj = TRUE returns the infercnv object, mirroring the behavior of infercnv::run(). In either case, all generated output is saved in the infercnv subfolder of the result path:
list.files(file.path(resPath,"sample_infercnv"))
## [1] "01_incoming_data.infercnv_obj"
## [2] "02_reduced_by_cutoff.infercnv_obj"
## [3] "03_normalized_by_depth.infercnv_obj"
## [4] "04_logtransformed.infercnv_obj"
## [5] "08_remove_ref_avg_from_obs_logFC.infercnv_obj"
## [6] "09_apply_max_centered_expr_threshold.infercnv_obj"
## [7] "10_smoothed_by_chr.infercnv_obj"
## [8] "11_recentered_cells_by_chr.infercnv_obj"
## [9] "12_remove_ref_avg_from_obs_adjust.infercnv_obj"
## [10] "14_invert_log_transform.infercnv_obj"
## [11] "15_tumor_subclusters.leiden.infercnv_obj"
## [12] "22_denoise.leiden.NF_NA.SD_1.5.NL_FALSE.infercnv_obj"
## [13] "infercnv.heatmap_thresholds.txt"
## [14] "infercnv.observation_groupings.txt"
## [15] "infercnv.pdf"
## [16] "infercnv.preliminary.heatmap_thresholds.txt"
## [17] "infercnv.preliminary.observation_groupings.txt"
## [18] "infercnv.preliminary.pdf"
## [19] "infercnv_subclusters.heatmap_thresholds.txt"
## [20] "infercnv_subclusters.observation_groupings.txt"
## [21] "infercnv_subclusters.png"
## [22] "preliminary.infercnv_obj"
## [23] "run.final.infercnv_obj"
The function has specific parameters that require adjustment for ATAC data, for example the number of clusters (numClusters). Importantly, it supports all options of the original InferCNV functions. These options and the description of the output are covered in the InferCNV documentation.
atacInferCnv can also generate pseudo-bulk CNV images, either for the groups assigned in the annotation (if numClusters = 1 is set in runAtacInferCnv()) or for the identified subclones (if numClusters > 1 in the same function). For this purpose, use the corresponding function after performing the analysis:
plots <- plotCnvBlocks(resPath, iObj)
plots[["C1"]]
plots[["C2"]]
These figures show CNV patterns for the subclones without (C1) and with (C2) MYCN amplification on chr2.
The tool also provides various custom options to control the analysis. For example, it supports 10X Multiomics or single-cell ATAC data processed with CellRanger as input. If a general scATAC analysis was already performed with Signac/Seurat or another tool, the resulting Seurat or SingleCellExperiment object can be reused without repeating that analysis. This can be especially useful for merged or large samples, and atacInferCnv has a corresponding argument. Moreover, several useful features such as an external reference, metacells and genome binning are also available.
We provide an additional detailed tutorial and further information on the project wiki page.
An initial example application of the method was described in the following manuscript:
## R version 4.6.0 RC (2026-04-17 r89917)
## Platform: x86_64-pc-linux-gnu
## Running under: Ubuntu 24.04.4 LTS
##
## Matrix products: default
## BLAS: /home/biocbuild/bbs-3.23-bioc/R/lib/libRblas.so
## LAPACK: /usr/lib/x86_64-linux-gnu/lapack/liblapack.so.3.12.0 LAPACK version 3.12.0
##
## locale:
## [1] LC_CTYPE=en_US.UTF-8 LC_NUMERIC=C
## [3] LC_TIME=en_GB LC_COLLATE=C
## [5] LC_MONETARY=en_US.UTF-8 LC_MESSAGES=en_US.UTF-8
## [7] LC_PAPER=en_US.UTF-8 LC_NAME=C
## [9] LC_ADDRESS=C LC_TELEPHONE=C
## [11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C
##
## time zone: America/New_York
## tzcode source: system (glibc)
##
## attached base packages:
## [1] stats graphics grDevices utils datasets methods base
##
## other attached packages:
## [1] future_1.70.0 atacInferCnv_0.99.7 BiocStyle_2.39.0
##
## loaded via a namespace (and not attached):
## [1] RcppAnnoy_0.0.23 splines_4.6.0
## [3] later_1.4.8 bitops_1.0-9
## [5] tibble_3.3.1 polyclip_1.10-7
## [7] fastDummies_1.7.6 lifecycle_1.0.5
## [9] fastcluster_1.3.0 edgeR_4.9.9
## [11] doParallel_1.0.17 globals_0.19.1
## [13] lattice_0.22-9 MASS_7.3-65
## [15] magrittr_2.0.5 limma_3.67.3
## [17] plotly_4.12.0 sass_0.4.10
## [19] rmarkdown_2.31 jquerylib_0.1.4
## [21] yaml_2.3.12 httpuv_1.6.17
## [23] otel_0.2.0 Seurat_5.5.0
## [25] sctransform_0.4.3 spam_2.11-3
## [27] sp_2.2-1 spatstat.sparse_3.1-0
## [29] reticulate_1.46.0 cowplot_1.2.0
## [31] pbapply_1.7-4 RColorBrewer_1.1-3
## [33] multcomp_1.4-30 abind_1.4-8
## [35] Rtsne_0.17 GenomicRanges_1.63.2
## [37] purrr_1.2.2 BiocGenerics_0.57.1
## [39] TH.data_1.1-5 sandwich_3.1-1
## [41] IRanges_2.45.0 S4Vectors_0.49.2
## [43] ggrepel_0.9.8 irlba_2.3.7
## [45] listenv_0.10.1 spatstat.utils_3.2-2
## [47] goftest_1.2-3 RSpectra_0.16-2
## [49] spatstat.random_3.4-5 fitdistrplus_1.2-6
## [51] parallelly_1.47.0 codetools_0.2-20
## [53] coin_1.4-3 DelayedArray_0.37.1
## [55] RcppRoll_0.3.2 tidyselect_1.2.1
## [57] futile.logger_1.4.9 UCSC.utils_1.7.1
## [59] farver_2.1.2 rjags_4-17
## [61] matrixStats_1.5.0 stats4_4.6.0
## [63] spatstat.explore_3.8-0 Seqinfo_1.1.0
## [65] jsonlite_2.0.0 progressr_0.19.0
## [67] ggridges_0.5.7 survival_3.8-6
## [69] iterators_1.0.14 foreach_1.5.2
## [71] tools_4.6.0 ica_1.0-3
## [73] Rcpp_1.1.1-1.1 glue_1.8.1
## [75] gridExtra_2.3 SparseArray_1.11.13
## [77] xfun_0.57 MatrixGenerics_1.23.0
## [79] GenomeInfoDb_1.47.2 dplyr_1.2.1
## [81] withr_3.0.2 formatR_1.14
## [83] BiocManager_1.30.27 fastmap_1.2.0
## [85] caTools_1.18.3 digest_0.6.39
## [87] parallelDist_0.2.7 R6_2.6.1
## [89] mime_0.13 scattermore_1.2
## [91] gtools_3.9.5 tensor_1.5.1
## [93] dichromat_2.0-0.1 spatstat.data_3.1-9
## [95] config_0.3.2 tidyr_1.3.2
## [97] generics_0.1.4 data.table_1.18.2.1
## [99] httr_1.4.8 htmlwidgets_1.6.4
## [101] S4Arrays_1.11.1 infercnv_1.27.0
## [103] uwot_0.2.4 pkgconfig_2.0.3
## [105] gtable_0.3.6 modeltools_0.2-24
## [107] lmtest_0.9-40 S7_0.2.2
## [109] SingleCellExperiment_1.33.2 XVector_0.51.0
## [111] htmltools_0.5.9 dotCall64_1.2
## [113] bookdown_0.46 SeuratObject_5.4.0
## [115] scales_1.4.0 Biobase_2.71.0
## [117] png_0.1-9 phyclust_0.1-34
## [119] spatstat.univar_3.1-7 knitr_1.51
## [121] lambda.r_1.2.4 Signac_1.17.1
## [123] reshape2_1.4.5 coda_0.19-4.1
## [125] nlme_3.1-169 cachem_1.1.0
## [127] zoo_1.8-15 stringr_1.6.0
## [129] KernSmooth_2.23-26 libcoin_1.0-12
## [131] parallel_4.6.0 miniUI_0.1.2
## [133] pillar_1.11.1 grid_4.6.0
## [135] vctrs_0.7.3 gplots_3.3.0
## [137] RANN_2.6.2 promises_1.5.0
## [139] xtable_1.8-8 cluster_2.1.8.2
## [141] evaluate_1.0.5 magick_2.9.1
## [143] tinytex_0.59 locfit_1.5-9.12
## [145] mvtnorm_1.3-7 cli_3.6.6
## [147] compiler_4.6.0 futile.options_1.0.1
## [149] Rsamtools_2.27.2 rlang_1.2.0
## [151] crayon_1.5.3 future.apply_1.20.2
## [153] labeling_0.4.3 argparse_2.3.1
## [155] plyr_1.8.9 stringi_1.8.7
## [157] viridisLite_0.4.3 deldir_2.0-4
## [159] BiocParallel_1.45.0 Biostrings_2.79.5
## [161] lazyeval_0.2.3 spatstat.geom_3.7-3
## [163] Matrix_1.7-5 RcppHNSW_0.6.0
## [165] patchwork_1.3.2 sparseMatrixStats_1.23.0
## [167] ggplot2_4.0.3 statmod_1.5.1
## [169] shiny_1.13.0 SummarizedExperiment_1.41.1
## [171] ROCR_1.0-12 igraph_2.3.0
## [173] RcppParallel_5.1.11-2 bslib_0.10.0
## [175] fastmatch_1.1-8 ape_5.8-1