1 Getting started

To demonstrate how to apply atacInferCnv to scATAC-seq data, we prepared an example test set from the snMultiomics-seq data of a medulloblastoma Group 3/4 MYCN tumor case.

By default, atacInferCnv requires two inputs: the path to a raw signal matrix in text format and the path to a cell annotation file. The annotation is required to distinguish tumor cells from normal cells, which serve as the reference control. It is typically a tab-delimited table with cell IDs as row names, whose first line contains only the column names.
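The annotation layout described above can be sketched with base R. This is a minimal illustration, not part of the package: the cell IDs and the "C1" label are hypothetical, while the column name "cnvBlock" and the "Normal" label follow the example used in this document.

```r
# Toy annotation; the cell IDs and the "C1" label are hypothetical.
ann <- data.frame(
  cnvBlock = c("Normal", "Normal", "C1"),
  row.names = c("cell_A", "cell_B", "cell_C")
)

annFile <- tempfile(fileext = ".txt")

# With row.names = TRUE and col.names = TRUE, the header line holds only
# the column names, so it has one field fewer than the data lines.
write.table(ann, annFile, sep = "\t", quote = FALSE,
            row.names = TRUE, col.names = TRUE)

# read.table() detects the shorter header and uses the first column
# as row names.
annBack <- read.table(annFile, header = TRUE, sep = "\t")
print(annBack)
```

Reading the file back is a quick way to confirm that the cell IDs end up as row names rather than as a regular column.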

Other input formats for the matrix are also supported, as described further below, but here we start with the paths to the standard matrix and annotation in text format. The launch also requires the path and name of the result folder. We use a temporary folder for this:

library(atacInferCnv)
# Example signal matrix and cell annotation shipped with the package
inPath <- system.file("extdata", "MB183_ATAC_subset.tsv.gz",
                      package = "atacInferCnv")
sAnn <- system.file("extdata", "MB183_ATAC_subset.CNV_blocks_ann.txt",
                    package = "atacInferCnv")
# Temporary folder for the results
resPath <- tempfile()

atacInferCnv analysis consists of two main steps: 1) process the input data and adjust it for CNV calling, and 2) launch InferCnv on this custom-generated input.

The first step is the function that processes the input data to prepare it for InferCnv. It has multiple options, but requires the name of the annotation column that distinguishes tumor from normal cells, as well as the name of the normal cell type to use as a reference if normal cells are present in the input dataset. The command processes the scATAC-seq data with the Signac R package and prepares the input for InferCnv:

prepareAtacInferCnvInput(inPath, sAnn, resPath,
                         targColumn = "cnvBlock",
                         ctrlGrp = "Normal")
## Modularity Optimizer version 1.3.0 by Ludo Waltman and Nees Jan van Eck
## 
## Number of nodes: 300
## Number of edges: 11598
## 
## Running smart local moving algorithm...
## Maximum modularity in 10 random starts: 0.5614
## Number of communities: 4
## Elapsed time: 0 seconds
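Before running the preparation step on your own data, it can help to confirm that the chosen column and reference label actually occur in the annotation. The following is a minimal sketch on a toy table; the cell IDs and the "C1"/"C2" labels are hypothetical, and the check itself is not a function of the package.

```r
# Toy annotation standing in for the real file; the cell IDs are hypothetical.
ann <- data.frame(
  cnvBlock = c("Normal", "Normal", "C1", "C2", "C2"),
  row.names = paste0("cell", 1:5)
)

targColumn <- "cnvBlock"
ctrlGrp <- "Normal"

# The target column must exist in the annotation
stopifnot(targColumn %in% colnames(ann))

# Count cells per group and check whether the reference label is present
groupSizes <- table(ann[[targColumn]])
hasReference <- ctrlGrp %in% names(groupSizes)

print(groupSizes)
if (!hasReference) {
  warning("Reference group '", ctrlGrp, "' not found in column '",
          targColumn, "'")
}
```

A warning rather than an error is used here because the text above only requires the reference name when normal cells are present in the dataset.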

The results are saved in a fixed layout. The main output is written to the result folder:

list.files(resPath)
## [1] "infercnv_config.yml"      "sample_UMAP.pdf"         
## [3] "sample_cnv_ann.txt"       "sample_cnv_ref.txt"      
## [5] "sample_obj.RDS"           "sample_raw_counts.txt.gz"

It includes the per-cell signal matrix (sample_raw_counts.txt.gz), the cell annotation (sample_cnv_ann.txt), the genomic regions of the peaks (sample_cnv_ref.txt), and the configuration for InferCnv (infercnv_config.yml). Additional output includes the Signac/Seurat RDS object (sample_obj.RDS) and a UMAP visualization of the input data based on the annotation (sample_UMAP.pdf) for visual inspection.

The second and final step is a wrapper function that launches InferCnv. It uses the configuration generated in the result path to customize the input:
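A small check that the files needed for the next step are in place could look like the sketch below. The helper missingPrepOutput() is hypothetical and not part of atacInferCnv; the file names are taken from the listing above, and the "sample" prefix is assumed to be the default sample name.

```r
# File names taken from the listing above; the "sample" prefix is assumed
# to be the default sample name.
expectedFiles <- c("infercnv_config.yml", "sample_cnv_ann.txt",
                   "sample_cnv_ref.txt", "sample_raw_counts.txt.gz")

# Hypothetical helper: report which of the expected files are missing
missingPrepOutput <- function(resDir, expected = expectedFiles) {
  expected[!file.exists(file.path(resDir, expected))]
}

# Demonstration on a mock folder with the config file left out on purpose
mockDir <- tempfile()
dir.create(mockDir)
invisible(file.create(file.path(mockDir, expectedFiles[-1])))
print(missingPrepOutput(mockDir))
```

On a real result folder, an empty character vector means all four input files for InferCnv were found.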

iObj <- runAtacInferCnv(resPath, returnObj = TRUE)

By default the function returns nothing, but setting the argument returnObj = TRUE makes it return the infercnv object, mirroring the behavior of infercnv::run(). In either case, all generated output is saved inside the result path in the sample_infercnv subfolder:

list.files(file.path(resPath,"sample_infercnv"))
##  [1] "01_incoming_data.infercnv_obj"                       
##  [2] "02_reduced_by_cutoff.infercnv_obj"                   
##  [3] "03_normalized_by_depth.infercnv_obj"                 
##  [4] "04_logtransformed.infercnv_obj"                      
##  [5] "08_remove_ref_avg_from_obs_logFC.infercnv_obj"       
##  [6] "09_apply_max_centered_expr_threshold.infercnv_obj"   
##  [7] "10_smoothed_by_chr.infercnv_obj"                     
##  [8] "11_recentered_cells_by_chr.infercnv_obj"             
##  [9] "12_remove_ref_avg_from_obs_adjust.infercnv_obj"      
## [10] "14_invert_log_transform.infercnv_obj"                
## [11] "15_tumor_subclusters.leiden.infercnv_obj"            
## [12] "22_denoise.leiden.NF_NA.SD_1.5.NL_FALSE.infercnv_obj"
## [13] "infercnv.heatmap_thresholds.txt"                     
## [14] "infercnv.observation_groupings.txt"                  
## [15] "infercnv.pdf"                                        
## [16] "infercnv.preliminary.heatmap_thresholds.txt"         
## [17] "infercnv.preliminary.observation_groupings.txt"      
## [18] "infercnv.preliminary.pdf"                            
## [19] "infercnv_subclusters.heatmap_thresholds.txt"         
## [20] "infercnv_subclusters.observation_groupings.txt"      
## [21] "infercnv_subclusters.png"                            
## [22] "preliminary.infercnv_obj"                            
## [23] "run.final.infercnv_obj"
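If the analysis was run without returnObj = TRUE, or in an earlier session, the final object can be read back from disk. The sketch below assumes that InferCnv stores its objects as RDS files and that the subfolder is named sample_infercnv as in the listing above; loadFinalInfercnv() is a hypothetical helper, and the demonstration uses a stand-in list instead of a real infercnv object.

```r
# Hypothetical helper: load the final InferCnv object from a result folder,
# or return NULL when no final object is found there.
# Assumption: the object file was written with saveRDS().
loadFinalInfercnv <- function(resDir, subDir = "sample_infercnv") {
  objPath <- file.path(resDir, subDir, "run.final.infercnv_obj")
  if (!file.exists(objPath)) {
    message("No final object found at ", objPath)
    return(NULL)
  }
  readRDS(objPath)
}

# Demonstration on a mock result folder with a stand-in object
mockRes <- tempfile()
dir.create(file.path(mockRes, "sample_infercnv"), recursive = TRUE)
saveRDS(list(note = "stand-in for an infercnv object"),
        file.path(mockRes, "sample_infercnv", "run.final.infercnv_obj"))
obj <- loadFinalInfercnv(mockRes)
```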

The function has parameters that require adjustment for ATAC data, for example the number of clusters (numClusters). Importantly, it supports all options of the original InferCnv functions. These options and the description of the output are covered in the InferCnv documentation.

atacInferCnv can also generate pseudo-bulk CNV images, either for the groups assigned in the annotation (if numClusters = 1 is set in runAtacInferCnv()) or for the identified subclones (if numClusters > 1 in the same function). To do so, call the corresponding function after the analysis:

plots <- plotCnvBlocks(resPath, iObj)
plots[["C1"]]

plots[["C2"]]

These figures show CNV patterns for subclones without (C1) and with (C2) MYCN amplification in chr2.

2 Custom settings

The tool also provides various options to customize the analysis. For example, it accepts 10X Multiomics or single-cell ATAC data processed with CellRanger as input. If a general scATAC analysis has already been done with Signac/Seurat or another tool, the resulting Seurat or SingleCellExperiment object can be reused without repeating the analysis that created it. This is especially useful for merged or large samples, and atacInferCnv has a corresponding argument for it. Several further features are available, such as an external reference, meta-cells, and genome binning.

We provide an additional detailed tutorial and further information on the project wiki page.
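To make the idea of genome binning concrete, the sketch below sums peak counts into fixed-width genomic bins with base R. It only illustrates the concept and is not the package's implementation; the "chr-start-end" peak naming, the toy counts, and the bin width are assumptions made for this example.

```r
# Toy peak-by-cell count matrix; the "chr-start-end" peak names are an
# assumption for this sketch.
peaks <- c("chr2-100-600", "chr2-40000-40500", "chr2-1200000-1200400",
           "chr3-500-900")
counts <- matrix(c(1, 0, 2,
                   0, 3, 1,
                   4, 1, 0,
                   2, 2, 2),
                 nrow = 4, byrow = TRUE,
                 dimnames = list(peaks, c("cell1", "cell2", "cell3")))

binSize <- 1e6  # hypothetical bin width in base pairs

# Split the peak names into chromosome, start and end
parts <- do.call(rbind, strsplit(rownames(counts), "-", fixed = TRUE))
chrom <- parts[, 1]
start <- as.numeric(parts[, 2])

# Assign each peak to the bin that contains its start coordinate
binStart <- floor(start / binSize) * binSize
binId <- sprintf("%s:%d", chrom, as.integer(binStart))

# Sum the counts of all peaks that fall into the same bin
binned <- rowsum(counts, group = binId, reorder = FALSE)
print(binned)
```

Here the two peaks in the first megabase of chr2 are merged into one row, which reduces sparsity at the cost of genomic resolution.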

3 Applications

An initial application of the method is described in the following manuscript:

K. Okonechnikov et al., “Oncogene aberrations drive medulloblastoma progression, not initiation”, Nature, 2025.

Session info

## R version 4.6.0 RC (2026-04-17 r89917)
## Platform: x86_64-pc-linux-gnu
## Running under: Ubuntu 24.04.4 LTS
## 
## Matrix products: default
## BLAS:   /home/biocbuild/bbs-3.23-bioc/R/lib/libRblas.so 
## LAPACK: /usr/lib/x86_64-linux-gnu/lapack/liblapack.so.3.12.0  LAPACK version 3.12.0
## 
## locale:
##  [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C              
##  [3] LC_TIME=en_GB              LC_COLLATE=C              
##  [5] LC_MONETARY=en_US.UTF-8    LC_MESSAGES=en_US.UTF-8   
##  [7] LC_PAPER=en_US.UTF-8       LC_NAME=C                 
##  [9] LC_ADDRESS=C               LC_TELEPHONE=C            
## [11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C       
## 
## time zone: America/New_York
## tzcode source: system (glibc)
## 
## attached base packages:
## [1] stats     graphics  grDevices utils     datasets  methods   base     
## 
## other attached packages:
## [1] future_1.70.0       atacInferCnv_0.99.7 BiocStyle_2.39.0   
## 
## loaded via a namespace (and not attached):
##   [1] RcppAnnoy_0.0.23            splines_4.6.0              
##   [3] later_1.4.8                 bitops_1.0-9               
##   [5] tibble_3.3.1                polyclip_1.10-7            
##   [7] fastDummies_1.7.6           lifecycle_1.0.5            
##   [9] fastcluster_1.3.0           edgeR_4.9.9                
##  [11] doParallel_1.0.17           globals_0.19.1             
##  [13] lattice_0.22-9              MASS_7.3-65                
##  [15] magrittr_2.0.5              limma_3.67.3               
##  [17] plotly_4.12.0               sass_0.4.10                
##  [19] rmarkdown_2.31              jquerylib_0.1.4            
##  [21] yaml_2.3.12                 httpuv_1.6.17              
##  [23] otel_0.2.0                  Seurat_5.5.0               
##  [25] sctransform_0.4.3           spam_2.11-3                
##  [27] sp_2.2-1                    spatstat.sparse_3.1-0      
##  [29] reticulate_1.46.0           cowplot_1.2.0              
##  [31] pbapply_1.7-4               RColorBrewer_1.1-3         
##  [33] multcomp_1.4-30             abind_1.4-8                
##  [35] Rtsne_0.17                  GenomicRanges_1.63.2       
##  [37] purrr_1.2.2                 BiocGenerics_0.57.1        
##  [39] TH.data_1.1-5               sandwich_3.1-1             
##  [41] IRanges_2.45.0              S4Vectors_0.49.2           
##  [43] ggrepel_0.9.8               irlba_2.3.7                
##  [45] listenv_0.10.1              spatstat.utils_3.2-2       
##  [47] goftest_1.2-3               RSpectra_0.16-2            
##  [49] spatstat.random_3.4-5       fitdistrplus_1.2-6         
##  [51] parallelly_1.47.0           codetools_0.2-20           
##  [53] coin_1.4-3                  DelayedArray_0.37.1        
##  [55] RcppRoll_0.3.2              tidyselect_1.2.1           
##  [57] futile.logger_1.4.9         UCSC.utils_1.7.1           
##  [59] farver_2.1.2                rjags_4-17                 
##  [61] matrixStats_1.5.0           stats4_4.6.0               
##  [63] spatstat.explore_3.8-0      Seqinfo_1.1.0              
##  [65] jsonlite_2.0.0              progressr_0.19.0           
##  [67] ggridges_0.5.7              survival_3.8-6             
##  [69] iterators_1.0.14            foreach_1.5.2              
##  [71] tools_4.6.0                 ica_1.0-3                  
##  [73] Rcpp_1.1.1-1.1              glue_1.8.1                 
##  [75] gridExtra_2.3               SparseArray_1.11.13        
##  [77] xfun_0.57                   MatrixGenerics_1.23.0      
##  [79] GenomeInfoDb_1.47.2         dplyr_1.2.1                
##  [81] withr_3.0.2                 formatR_1.14               
##  [83] BiocManager_1.30.27         fastmap_1.2.0              
##  [85] caTools_1.18.3              digest_0.6.39              
##  [87] parallelDist_0.2.7          R6_2.6.1                   
##  [89] mime_0.13                   scattermore_1.2            
##  [91] gtools_3.9.5                tensor_1.5.1               
##  [93] dichromat_2.0-0.1           spatstat.data_3.1-9        
##  [95] config_0.3.2                tidyr_1.3.2                
##  [97] generics_0.1.4              data.table_1.18.2.1        
##  [99] httr_1.4.8                  htmlwidgets_1.6.4          
## [101] S4Arrays_1.11.1             infercnv_1.27.0            
## [103] uwot_0.2.4                  pkgconfig_2.0.3            
## [105] gtable_0.3.6                modeltools_0.2-24          
## [107] lmtest_0.9-40               S7_0.2.2                   
## [109] SingleCellExperiment_1.33.2 XVector_0.51.0             
## [111] htmltools_0.5.9             dotCall64_1.2              
## [113] bookdown_0.46               SeuratObject_5.4.0         
## [115] scales_1.4.0                Biobase_2.71.0             
## [117] png_0.1-9                   phyclust_0.1-34            
## [119] spatstat.univar_3.1-7       knitr_1.51                 
## [121] lambda.r_1.2.4              Signac_1.17.1              
## [123] reshape2_1.4.5              coda_0.19-4.1              
## [125] nlme_3.1-169                cachem_1.1.0               
## [127] zoo_1.8-15                  stringr_1.6.0              
## [129] KernSmooth_2.23-26          libcoin_1.0-12             
## [131] parallel_4.6.0              miniUI_0.1.2               
## [133] pillar_1.11.1               grid_4.6.0                 
## [135] vctrs_0.7.3                 gplots_3.3.0               
## [137] RANN_2.6.2                  promises_1.5.0             
## [139] xtable_1.8-8                cluster_2.1.8.2            
## [141] evaluate_1.0.5              magick_2.9.1               
## [143] tinytex_0.59                locfit_1.5-9.12            
## [145] mvtnorm_1.3-7               cli_3.6.6                  
## [147] compiler_4.6.0              futile.options_1.0.1       
## [149] Rsamtools_2.27.2            rlang_1.2.0                
## [151] crayon_1.5.3                future.apply_1.20.2        
## [153] labeling_0.4.3              argparse_2.3.1             
## [155] plyr_1.8.9                  stringi_1.8.7              
## [157] viridisLite_0.4.3           deldir_2.0-4               
## [159] BiocParallel_1.45.0         Biostrings_2.79.5          
## [161] lazyeval_0.2.3              spatstat.geom_3.7-3        
## [163] Matrix_1.7-5                RcppHNSW_0.6.0             
## [165] patchwork_1.3.2             sparseMatrixStats_1.23.0   
## [167] ggplot2_4.0.3               statmod_1.5.1              
## [169] shiny_1.13.0                SummarizedExperiment_1.41.1
## [171] ROCR_1.0-12                 igraph_2.3.0               
## [173] RcppParallel_5.1.11-2       bslib_0.10.0               
## [175] fastmatch_1.1-8             ape_5.8-1