Contents

1 Introduction

Circular Chromosome Conformation Capture Sequencing (4C-seq) is a sequencing technique enabling the identification of chromatin interactions. Currently, there are several tools available for 4C-seq analysis:

A benchmarking of those tools revealed that none of the tools performed adequately in all evaluated tasks (Walter et al. 2019). To address this we developed fourSynergy, a ensemble algorithm that leverages synergies between the base tools r3cseq, fourSig, peakC, and R.4cker. To this end, all of these tools need to be run, which can be achieved using the fourSynergy pipeline. This pipeline, written in snakemake, includes quality control, interaction calling, and post-processing steps to prepare the output of the base tools for the ensemble algorithm.

2 Requirements

If you want to quickly start with testing the package you can jump to the section R analysis and run the fourSynergy R package on testdata.

2.1 Pipeline

  • snakemake installation

We recommend the pipeline to be executed in a docker.

3 Running fourSynergy

3.1 Installation

fourSynergy can be installed as follows:

BiocManager::install("fourSynergy")

3.2 Requirements

To run fourSynergy, you need R (Version 4.5 or higher).

3.3 Getting started

A basic workflow of 4C-seq data is demonstrated in the following on an exemplary dataset. First, we need to run the fourSynergy pipeline.

If you want to quickly start with testing the package you can jump to the section R analysis and run the fourSynergy R package on testdata.

3.4 fourSynergy Pipeline

Before starting the R analysis you have to run the fourSynergy snakemake pipeline, which is available at (https://github.com/sophiewind/fourSynergy_pip).

3.4.1 Input

Input for the fourSynergy pipeline are…

  • 4C-seq FASTQ files

  • config.yaml

  • Reference genome

3.4.1.1 config

The config file is crucial to run fourSynergy. All relevant experimental information is stored here. You can find an exemplary info.yaml in inst/extdata/Datasets/Demo/info.yaml.

3.4.2 Output

The pipeline creates a results directory in the directory were its executed. In the results directory is a directory called like the author provided in the config file. In this directory, all results of the base tools and fourSynergy are written.

├── Datasets
├── ...
└── results
    └── author
        ├── alignment   -> alignment results
        ├── basic4Cseq  -> Basic4Cseq results
        ├── foursig     -> fourSig results
        ├── peakC       -> peakC results
        ├── qc          -> FASTQC results
        ├── r3cseq      -> r3Cseq results
        ├── r4cker      -> R.4Cker results
        └── sia         -> standardized tools results

4 R analysis

After the pipeline is executed we can start with the analysis in R by loading the fourSynergy R package.

library(fourSynergy)

4.1 Load data

Than the pipeline results can be loaded into an fourSynergy object. In the provided example, config and the pathes are stored in extdata. However, in a real-world scenario, res_path would typically be located in ./results/[DatasetName] and the tracks in ./results/[DatasetName]/alignment.

# Load config
config <- system.file("extdata", "Datasets", "Demo", "info.yaml",
    package = "fourSynergy"
)

# Load results path
res_path <- system.file("extdata", "results", "Demo",
    package = "fourSynergy"
)

# Load path of .bedGraphs
tracks <- system.file("extdata", "results", "Demo", "alignment",
    package = "fourSynergy"
)
sia <- createIa(
    res_path = res_path,
    config = config,
    tracks = tracks
)

The fourSynergy object stores information about the single tool calls in condition and in control as well as metadata.

Slot Data
metadata Metadata from config
expInteractions Base tool results - experiment
ctrlInteractions Base tool results - control
expConsensus Ensemble results - experiment
ctrlConsensus Ensemble results - control
vp Viewpoint position
vfl Virtual fragment library (VFL)
tracks Path of .bedGraphs
differential Differential ensemble results

4.2 Base tool results

The base tool results are stored in the fourSynergy object in the slots expInteractions and ctrlInteractions.

slotNames(sia)
#>  [1] "metadata"         "expInteractions"  "ctrlInteractions" "expConsensus"    
#>  [5] "ctrlConsensus"    "vp"               "vfl"              "tracks"          
#>  [9] "differential"     "dds"

They can be visualized using plotIaIndiviualTools(). There is also the object to pass genes of interest, which are than shown in the karyoplot.

plotIaIndiviualTools(sia, genes_of_interest = c("Ldlrad4", "Cep76"))

Further, the base tool results can be visualized in higher resolution using plotBaseTracks().

plotBaseTracks(sia)
#> [[1]]

#> 
#> [[2]]

4.3 Ensemble results

To perform ensemble calling you need to specify if you want to use the F1 or the AUPRC model.

sia <- consensusIa(sia, model = "AUPRC")
#> Starting ensemble algorithm...

The ensemble results can than be visualized.

plotConsensusIa(ia = sia)

The consensus results can be visualized in higher resolution using plotConsensusTracks().

plotConsensusTracks(sia)
#> [[1]]

#> 
#> [[2]]

4.4 Differential interaction analysis

fourSynergy provides differential interaction analysis to compare interactions across conditions. The differential interaction analysis is based on DESeq2 using count data to model the variability in the data using a negative binomial distribution. In the example dataset the fitType was set to ‘mean’, by default this value is ‘local’.

(The plots do not look representative here because the BAM file size has been drastically reduced for this example.)

sia <- differentialAnalysis(ia = sia, fitType = "mean")
#> Processing /tmp/Rtmpgw8Arr/Rinst18f42f22803460/fourSynergy/extdata/results/Demo/alignment/TGFb_1_sorted.bam: tHaT'S tHa fAStEsT pIlE-uP bAm iN tHe SoUth!!!
#> Processing /tmp/Rtmpgw8Arr/Rinst18f42f22803460/fourSynergy/extdata/results/Demo/alignment/TGFb_2_sorted.bam: I gEt gOoSeBuMPs wHen I seE yOu pilEuPpiNg...
#> Processing /tmp/Rtmpgw8Arr/Rinst18f42f22803460/fourSynergy/extdata/results/Demo/alignment/shc_1_sorted.bam: fOr brOoMmHiLdA!!!
#> Processing /tmp/Rtmpgw8Arr/Rinst18f42f22803460/fourSynergy/extdata/results/Demo/alignment/shc_2_sorted.bam: tHaT'S tHa fAStEsT pIlE-uP bAm iN tHe SoUth!!!
#> Warning in DESeqDataSet(se, design = design, ignoreRank): some variables in
#> design formula are characters, converting to factors
#> gene-wise dispersion estimates
#> mean-dispersion relationship
#> final dispersion estimates

The results of the differential analysis are stored in the @differential slot of the sia object and can be visualized using plotDiffIa().

plotDiffIa(ia = sia, genes_of_interest = c("Ldlrad4", "Cep76"))

sessionInfo()
#> R Under development (unstable) (2026-03-05 r89546)
#> Platform: x86_64-pc-linux-gnu
#> Running under: Ubuntu 24.04.4 LTS
#> 
#> Matrix products: default
#> BLAS:   /home/biocbuild/bbs-3.23-bioc/R/lib/libRblas.so 
#> LAPACK: /usr/lib/x86_64-linux-gnu/lapack/liblapack.so.3.12.0  LAPACK version 3.12.0
#> 
#> locale:
#>  [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C              
#>  [3] LC_TIME=en_GB              LC_COLLATE=C              
#>  [5] LC_MONETARY=en_US.UTF-8    LC_MESSAGES=en_US.UTF-8   
#>  [7] LC_PAPER=en_US.UTF-8       LC_NAME=C                 
#>  [9] LC_ADDRESS=C               LC_TELEPHONE=C            
#> [11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C       
#> 
#> time zone: America/New_York
#> tzcode source: system (glibc)
#> 
#> attached base packages:
#> [1] stats4    stats     graphics  grDevices utils     datasets  methods  
#> [8] base     
#> 
#> other attached packages:
#>  [1] org.Mm.eg.db_3.22.0  AnnotationDbi_1.73.0 Biobase_2.71.0      
#>  [4] fourSynergy_0.99.12  GenomicRanges_1.63.1 Seqinfo_1.1.0       
#>  [7] IRanges_2.45.0       S4Vectors_0.49.0     BiocGenerics_0.57.0 
#> [10] generics_0.1.4       BiocStyle_2.39.0    
#> 
#> loaded via a namespace (and not attached):
#>   [1] RColorBrewer_1.1-3                       
#>   [2] rstudioapi_0.18.0                        
#>   [3] jsonlite_2.0.0                           
#>   [4] magrittr_2.0.4                           
#>   [5] magick_2.9.1                             
#>   [6] GenomicFeatures_1.63.1                   
#>   [7] farver_2.1.2                             
#>   [8] rmarkdown_2.30                           
#>   [9] BiocIO_1.21.0                            
#>  [10] vctrs_0.7.1                              
#>  [11] memoise_2.0.1                            
#>  [12] Rsamtools_2.27.1                         
#>  [13] RCurl_1.98-1.17                          
#>  [14] base64enc_0.1-6                          
#>  [15] tinytex_0.58                             
#>  [16] htmltools_0.5.9                          
#>  [17] S4Arrays_1.11.1                          
#>  [18] TxDb.Hsapiens.UCSC.hg19.knownGene_3.22.1 
#>  [19] curl_7.0.0                               
#>  [20] SparseArray_1.11.11                      
#>  [21] Formula_1.2-5                            
#>  [22] sass_0.4.10                              
#>  [23] bslib_0.10.0                             
#>  [24] htmlwidgets_1.6.4                        
#>  [25] plyr_1.8.9                               
#>  [26] cachem_1.1.0                             
#>  [27] GenomicAlignments_1.47.0                 
#>  [28] lifecycle_1.0.5                          
#>  [29] pkgconfig_2.0.3                          
#>  [30] Matrix_1.7-4                             
#>  [31] R6_2.6.1                                 
#>  [32] fastmap_1.2.0                            
#>  [33] MatrixGenerics_1.23.0                    
#>  [34] digest_0.6.39                            
#>  [35] colorspace_2.1-2                         
#>  [36] DESeq2_1.51.7                            
#>  [37] regioneR_1.43.0                          
#>  [38] bezier_1.1.2                             
#>  [39] Hmisc_5.2-5                              
#>  [40] RSQLite_2.4.6                            
#>  [41] org.Hs.eg.db_3.22.0                      
#>  [42] labeling_0.4.3                           
#>  [43] httr_1.4.8                               
#>  [44] abind_1.4-8                              
#>  [45] compiler_4.6.0                           
#>  [46] bit64_4.6.0-1                            
#>  [47] withr_3.0.2                              
#>  [48] htmlTable_2.4.3                          
#>  [49] S7_0.2.1                                 
#>  [50] backports_1.5.0                          
#>  [51] BiocParallel_1.45.0                      
#>  [52] DBI_1.3.0                                
#>  [53] DelayedArray_0.37.0                      
#>  [54] rjson_0.2.23                             
#>  [55] tools_4.6.0                              
#>  [56] foreign_0.8-91                           
#>  [57] otel_0.2.0                               
#>  [58] karyoploteR_1.37.0                       
#>  [59] nnet_7.3-20                              
#>  [60] glue_1.8.0                               
#>  [61] restfulr_0.0.16                          
#>  [62] grid_4.6.0                               
#>  [63] checkmate_2.3.4                          
#>  [64] cluster_2.1.8.2                          
#>  [65] reshape2_1.4.5                           
#>  [66] gtable_0.3.6                             
#>  [67] BSgenome_1.79.1                          
#>  [68] tidyr_1.3.2                              
#>  [69] ensembldb_2.35.0                         
#>  [70] data.table_1.18.2.1                      
#>  [71] XVector_0.51.0                           
#>  [72] pillar_1.11.1                            
#>  [73] stringr_1.6.0                            
#>  [74] dplyr_1.2.0                              
#>  [75] lattice_0.22-9                           
#>  [76] rtracklayer_1.71.3                       
#>  [77] bit_4.6.0                                
#>  [78] biovizBase_1.59.0                        
#>  [79] tidyselect_1.2.1                         
#>  [80] locfit_1.5-9.12                          
#>  [81] Biostrings_2.79.5                        
#>  [82] knitr_1.51                               
#>  [83] TxDb.Mmusculus.UCSC.mm10.knownGene_3.10.0
#>  [84] gridExtra_2.3                            
#>  [85] bookdown_0.46                            
#>  [86] ProtGenerics_1.43.0                      
#>  [87] SummarizedExperiment_1.41.1              
#>  [88] xfun_0.56                                
#>  [89] matrixStats_1.5.0                        
#>  [90] stringi_1.8.7                            
#>  [91] UCSC.utils_1.7.1                         
#>  [92] lazyeval_0.2.2                           
#>  [93] yaml_2.3.12                              
#>  [94] evaluate_1.0.5                           
#>  [95] codetools_0.2-20                         
#>  [96] cigarillo_1.1.0                          
#>  [97] tibble_3.3.1                             
#>  [98] BiocManager_1.30.27                      
#>  [99] cli_3.6.5                                
#> [100] rpart_4.1.24                             
#> [101] jquerylib_0.1.4                          
#> [102] dichromat_2.0-0.1                        
#> [103] Rcpp_1.1.1                               
#> [104] GenomeInfoDb_1.47.2                      
#> [105] png_0.1-9                                
#> [106] XML_3.99-0.22                            
#> [107] parallel_4.6.0                           
#> [108] ggplot2_4.0.2                            
#> [109] blob_1.3.0                               
#> [110] AnnotationFilter_1.35.0                  
#> [111] bitops_1.0-9                             
#> [112] VariantAnnotation_1.57.1                 
#> [113] scales_1.4.0                             
#> [114] purrr_1.2.1                              
#> [115] crayon_1.5.3                             
#> [116] bamsignals_1.43.1                        
#> [117] rlang_1.1.7                              
#> [118] KEGGREST_1.51.1

Walter, Carolin, Daniel Schuetzmann, Frank Rosenbauer, and Martin Dugas. 2019. “Benchmarking of 4C-Seq Pipelines Based on Real and Simulated Data.” Edited by John Hancock. Bioinformatics 35 (23): 4938–45. https://doi.org/10.1093/bioinformatics/btz426.