1 Introduction

This vignette demonstrates the local, interactive use of dnaEPICO. In this mode, the main functions return structured in-memory result objects that can be inspected directly in the R session. This is the recommended route when you are checking inputs, reviewing quality-control outputs, or learning the structure of the package results before moving to larger analyses.

The examples below focus on the first three workflow stages:

  1. preprocessingMinfiEwasWater()
  2. svaEnmix()
  3. preprocessingPheno()

All examples use saveOutputs = FALSE, which keeps the workflow in memory and avoids writing the legacy file outputs.

2 Citation

The dnaEPICO package uses methods from several external packages. Because no single manuscript describes all components, the guidance below explains how to cite dnaEPICO depending on the functions you use.

This citation guidance is adapted from the vignettes and user guides of minfi, ENmix, and wateRmelon.

  • If you use preprocessingMinfiEwasWater(), please cite Aryee et al. (2014); Fortin, Triche Jr., and Hansen (2017); Xu, Niu, and Taylor (2021); Pidsley et al. (2013); Murat et al. (2020); Maksimovic, Gordon, and Oshlack (2012); Fortin et al. (2014); Triche et al. (2013); Touleimat and Tost (2012).
  • If you use svaEnmix(), please cite Aryee et al. (2014); Xu, Niu, and Taylor (2021).

3 Required knowledge

dnaEPICO is built on core Bioconductor infrastructure for high-dimensional genomic data, with a focus on Illumina DNA methylation arrays. This vignette assumes that you are already familiar with the general DNA methylation workflow. If you are not, we recommend first reading this tutorial, which provides a practical introduction to the main concepts and analysis steps.

Preprocessing and quality control are performed using established Bioconductor tools, including minfi, ENmix, and wateRmelon. Users are expected to have basic familiarity with R, Bioconductor workflows, and Illumina IDAT file structures.

If you are asking yourself the question “Where do I start using Bioconductor?”, you might be interested in this blog post.

4 Installation

The dnaEPICO package depends on minfi, ENmix, and wateRmelon, among others. Install dnaEPICO and its dependencies with BiocManager:

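# Install BiocManager from CRAN if it is not already available.
if (!requireNamespace("BiocManager", quietly = TRUE)) {
  install.packages("BiocManager")
}

# Install dnaEPICO together with its dependencies.
BiocManager::install("dnaEPICO")

# Check that the installed packages are consistent with the Bioconductor release.
BiocManager::valid()
library("dnaEPICO")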

5 Step 1: preprocessingMinfiEwasWater

The preprocessing wrapper reads a phenotype table and IDAT files, performs quality control, normalisation, filtering, and cell-composition estimation, and returns a list-based result object. The chunks below first create a small example input, then run the function and print the class, element names, and a preview of the filtered phenotype table.

5.0.1 Create example input files

The next chunk shows how to recreate the temporary files used in this example. It writes a small phenotype table to a temporary directory, copies a small set of IDAT files, and records the cross-reactive probe reference used during probe filtering.

preprocessing_inputs <- dnaEPICO:::exampleMinfiIdatInputsDnaEpico(n = 6L)

names(preprocessing_inputs)
#> [1] "tempDir"           "idatFolder"        "phenoFile"        
#> [4] "targets"           "arrayType"         "annotationVersion"
#> [7] "crossReactivePath"
preprocessing_inputs$tempDir
#> [1] "/tmp/Rtmp6rWi3o/dnaEPICO-idat-example-3a4cfb42a656eb"
basename(preprocessing_inputs$phenoFile)
#> [1] "pheno.csv"
head(basename(list.files(preprocessing_inputs$idatFolder, full.names = TRUE)), 4)
#> [1] "5723646052_R02C02_Grn.idat" "5723646052_R02C02_Red.idat"
#> [3] "5723646052_R04C01_Grn.idat" "5723646052_R04C01_Red.idat"
basename(preprocessing_inputs$crossReactivePath)
#> [1] "12864_2024_10027_MOESM8_ESM.csv"

preprocessing_inputs is a small list returned by the internal example helper. names(preprocessing_inputs) shows its main elements: tempDir is the temporary working directory, idatFolder contains the copied example IDAT files, phenoFile is the phenotype table used by the wrapper, targets is the same phenotype information already loaded into memory, arrayType and annotationVersion describe the array platform, and crossReactivePath points to the cross-reactive probe reference used later during filtering.
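The relationship between the phenotype table and the IDAT folder can be illustrated with a minimal base-R sketch. It assumes that Basename is formed as Sentrix_ID and Sentrix_Position joined by an underscore (consistent with the file names printed above) and that each sample needs both a _Grn.idat and a _Red.idat file. The placeholder files are empty and are not real IDAT data, and the idat_complete column is a hypothetical name used only for illustration.

```r
# Toy phenotype table; column names follow the example output in this vignette.
pheno <- data.frame(
  Sample_Name = c("GroupA_3", "GroupA_2"),
  Sentrix_ID = c("5723646052", "5723646052"),
  Sentrix_Position = c("R02C02", "R04C01"),
  stringsAsFactors = FALSE
)

# Assumption: Basename is "<Sentrix_ID>_<Sentrix_Position>".
pheno$Basename <- paste(pheno$Sentrix_ID, pheno$Sentrix_Position, sep = "_")

# Create empty placeholder files that mimic an IDAT folder (not real IDAT data).
idat_dir <- tempfile("toy-idat-")
dir.create(idat_dir)
toy_files <- c(
  "5723646052_R02C02_Grn.idat", "5723646052_R02C02_Red.idat",
  "5723646052_R04C01_Grn.idat"  # the Red file is deliberately missing
)
invisible(file.create(file.path(idat_dir, toy_files)))

# A sample is complete only if both colour channels are present.
has_grn <- file.exists(file.path(idat_dir, paste0(pheno$Basename, "_Grn.idat")))
has_red <- file.exists(file.path(idat_dir, paste0(pheno$Basename, "_Red.idat")))
pheno$idat_complete <- has_grn & has_red

pheno[, c("Sample_Name", "Basename", "idat_complete")]
```

A check of this kind is a cheap way to catch missing or misnamed files before starting a longer preprocessing run.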

preprocessResult <- preprocessingMinfiEwasWater(
  phenoFile = preprocessing_inputs$phenoFile,
  idatFolder = preprocessing_inputs$idatFolder,
  outputLogs = file.path(preprocessing_inputs$tempDir, "logs"),
  nSamples = 6,
  SampleID = "Sample_Name",
  arrayType = preprocessing_inputs$arrayType,
  annotationVersion = preprocessing_inputs$annotationVersion,
  scriptLabel = "preprocessingMinfiEwasWater",
  baseDataFolder = file.path(preprocessing_inputs$tempDir, "rData"),
  figureBaseDir = file.path(preprocessing_inputs$tempDir, "figures"),
  detPThreshold = 1,
  normMethods = "quantile",
  sexColumn = "Sex",
  pvalThreshold = 1,
  chrToRemove = "",
  snpsToRemove = "SBE",
  mafThreshold = 1,
  crossReactivePath = preprocessing_inputs$crossReactivePath,
  plotGroupVar = "Sex",
  lcRef = "saliva",
  phenoOrder = "Sample_Name;Sex;Basename;Sentrix_ID;Sentrix_Position",
  lcPhenoDir = preprocessing_inputs$tempDir,
  display = FALSE,
  verbose = FALSE,
  logs = FALSE,
  saveOutputs = FALSE
)

class(preprocessResult)
#> [1] "dnaEPICO_preprocessingMinfiEwasWater"
names(preprocessResult)
#>  [1] "targets"     "RGSet"       "rawData"     "assessment"  "sexData"    
#>  [6] "normData"    "filterData"  "metricsData" "lcData"      "logFile"
head(preprocessResult$targets[, c(
  "Sample_Name",
  "Sex",
  "Sentrix_ID",
  "Sentrix_Position"
)])
#>   Sample_Name Sex Sentrix_ID Sentrix_Position
#> 1    GroupA_3   1 5723646052           R02C02
#> 2    GroupA_2   0 5723646052           R04C01
#> 3    GroupB_3   1 5723646052           R05C02
#> 4    GroupB_1   0 5723646053           R04C02
#> 5    GroupA_1   0 5723646053           R05C02
#> 6    GroupB_2   0 5723646053           R06C02

The returned object has class dnaEPICO_preprocessingMinfiEwasWater. names(preprocessResult) lists the main elements returned by the wrapper. The targets element is the filtered phenotype table aligned to the retained samples. The remaining elements are nested result objects for the major preprocessing stages, including the RGSet, the methylation metrics, and the cell-composition results.

Arguments:

  • phenoFile and idatFolder define the phenotype table and the directory of raw IDAT pairs.
  • nSamples = 6 keeps the example small enough for interactive use.
  • SampleID = "Sample_Name" identifies the column used to align phenotype data and methylation objects.
  • arrayType and annotationVersion must match the array platform being processed.
  • detPThreshold, normMethods, pvalThreshold, chrToRemove, snpsToRemove, and mafThreshold control quality filtering, normalisation, and probe filtering.
  • crossReactivePath supplies the cross-reactive probe reference used during filtering.
  • sexColumn, plotGroupVar, lcRef, and phenoOrder control sex prediction, quality-control grouping, cell-type estimation, and the structure of the returned phenotype table.
  • display = FALSE, verbose = FALSE, logs = FALSE, and saveOutputs = FALSE keep the example quiet, in memory, and free of file outputs.
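The phenoOrder argument is passed as a single semicolon-separated string. The sketch below shows one plausible way such a string could be turned into a column order, with the named columns first and all remaining columns after them. The helper reorderPhenoColumns() is hypothetical and is not part of dnaEPICO; the wrapper's internal handling may differ.

```r
# Hypothetical helper: move the columns named in a ";"-separated string to the front.
reorderPhenoColumns <- function(pheno, phenoOrder) {
  wanted <- trimws(strsplit(phenoOrder, ";", fixed = TRUE)[[1]])
  missing_cols <- setdiff(wanted, names(pheno))
  if (length(missing_cols) > 0) {
    stop("Columns not found in phenotype table: ",
         paste(missing_cols, collapse = ", "))
  }
  # Requested columns first, then every other column in its original order.
  pheno[, c(wanted, setdiff(names(pheno), wanted)), drop = FALSE]
}

toy_pheno <- data.frame(
  Sample_Well = c("H5", "D5"),
  Sentrix_Position = c("R02C02", "R04C01"),
  Sentrix_ID = c("5723646052", "5723646052"),
  Basename = c("5723646052_R02C02", "5723646052_R04C01"),
  Sex = c(1, 0),
  Sample_Name = c("GroupA_3", "GroupA_2")
)

ordered_pheno <- reorderPhenoColumns(
  toy_pheno,
  "Sample_Name;Sex;Basename;Sentrix_ID;Sentrix_Position"
)
names(ordered_pheno)
```

Failing early on a missing column name is a deliberate choice in this sketch, because a silent reorder with a misspelt name is harder to notice later.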

The next chunk inspects three important elements: the filtered RGSet, the metric matrices, and the phenotype table augmented with cell-composition estimates.

class(preprocessResult$RGSet)
#> [1] "RGChannelSet"
#> attr(,"package")
#> [1] "minfi"
names(preprocessResult$metricsData)
#> [1] "beta" "m"    "cn"
head(preprocessResult$lcData$phenoLC[, 1:6])
#>   Sample_Name Sex          Basename Sentrix_ID Sentrix_Position Sample_Well
#> 1    GroupA_3   1 5723646052_R02C02 5723646052           R02C02          H5
#> 2    GroupA_2   0 5723646052_R04C01 5723646052           R04C01          D5
#> 3    GroupB_3   1 5723646052_R05C02 5723646052           R05C02          C6
#> 4    GroupB_1   0 5723646053_R04C02 5723646053           R04C02          F7
#> 5    GroupA_1   0 5723646053_R05C02 5723646053           R05C02          G7
#> 6    GroupB_2   0 5723646053_R06C02 5723646053           R06C02          H7

In local use, these are the most useful objects to inspect first. preprocessResult$RGSet is the core methylation object. metricsData contains the beta-value, M-value, and copy-number matrices used in downstream steps. lcData$phenoLC is the phenotype table augmented with estimated cell composition and is the object passed forward to later workflow stages.
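For readers new to the two methylation scales, the sketch below illustrates the standard logit relationship between beta values and M-values, M = log2(beta / (1 - beta)). It uses a toy matrix and hypothetical helper names (betaToM, mToBeta). It is only an illustration of the scales: the matrices returned in metricsData come from the preprocessing pipeline and may include offsets or be computed directly from intensities.

```r
# Toy beta-value matrix: rows are CpG probes, columns are samples.
beta <- matrix(
  c(0.20, 0.25,
    0.60, 0.55,
    0.50, 0.80),
  nrow = 3, byrow = TRUE,
  dimnames = list(c("cg00000029", "cg00000108", "cg00000109"), c("S1", "S2"))
)

# Logit transform on the log2 scale: M = log2(beta / (1 - beta)).
betaToM <- function(beta) log2(beta / (1 - beta))

# Inverse transform: beta = 2^M / (2^M + 1).
mToBeta <- function(m) 2^m / (2^m + 1)

m <- betaToM(beta)
round(m, 3)

# The round trip recovers the original beta values up to floating-point error.
all.equal(mToBeta(m), beta)
```

Beta values are bounded between 0 and 1 and are easy to interpret, whereas M-values are unbounded and are often preferred for statistical modelling.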

6 Step 2: svaEnmix

The SVA wrapper loads a phenotype table and a saved RGSet, uses ENmix to estimate surrogate variables from the array control probes, and returns both the surrogate variables and the merged phenotype table.

6.0.1 Create example input files

The next chunk shows how to recreate the temporary files consumed by svaEnmix(): a phenotype CSV file and a saved RGSet object in .RData format.

sva_inputs <- dnaEPICO:::exampleMinfiBaseDataDnaEpico()
sva_temp_dir <- tempdir()
sva_targets_file <- file.path(sva_temp_dir, "sva_targets.csv")
sva_rgset_file <- file.path(sva_temp_dir, "sva_RGSet.RData")

names(sva_inputs)
#> [1] "RGSet"             "targets"           "crossReactivePath"
class(sva_inputs$RGSet)
#> [1] "RGChannelSet"
#> attr(,"package")
#> [1] "minfi"
utils::write.csv(sva_inputs$targets, sva_targets_file, row.names = FALSE)
RGSet <- sva_inputs$RGSet
save(RGSet, file = sva_rgset_file)

basename(c(sva_targets_file, sva_rgset_file))
#> [1] "sva_targets.csv" "sva_RGSet.RData"
file.exists(c(sva_targets_file, sva_rgset_file))
#> [1] TRUE TRUE

sva_inputs is another helper list used to make the vignette self-contained. names(sva_inputs) shows that it provides the in-memory RGSet, the matching targets table, and a crossReactivePath reference. In this step, the wrapper needs file inputs rather than the in-memory objects, so the chunk writes targets to sva_targets.csv and saves the RGSet object to sva_RGSet.RData.
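Because .RData files restore objects under the names they were saved with, a file-based wrapper has to locate the object after loading. The sketch below shows one common base-R pattern for this: load the file into a private environment and return its single object. The helper loadSingleObject() is hypothetical, a plain matrix stands in for a real RGChannelSet, and this is not necessarily how svaEnmix() reads its input.

```r
# Stand-in object; a plain matrix is used instead of a real RGChannelSet.
RGSet <- matrix(1:6, nrow = 3, dimnames = list(NULL, c("S1", "S2")))
rdata_file <- tempfile(fileext = ".RData")
save(RGSet, file = rdata_file)
rm(RGSet)

# Hypothetical loader: read the file into a private environment and return
# the single object it contains, whatever name it was saved under.
loadSingleObject <- function(path) {
  env <- new.env(parent = emptyenv())
  object_names <- load(path, envir = env)
  if (length(object_names) != 1L) {
    stop("Expected exactly one object in ", path)
  }
  get(object_names, envir = env)
}

restored <- loadSingleObject(rdata_file)
dim(restored)
```

Loading into a separate environment avoids overwriting objects that already exist in the interactive session.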

svaResult <- svaEnmix(
  phenoFile = sva_targets_file,
  rgsetData = sva_rgset_file,
  outputLogs = file.path(tempdir(), "logs"),
  SampleID = "Sample_Name",
  arrayType = "IlluminaHumanMethylation450k",
  annotationVersion = "ilmn12.hg19",
  SentrixIDColumn = "Sentrix_ID",
  SentrixPositionColumn = "Sentrix_Position",
  ctrlSvaPercVar = 0.90,
  ctrlSvaFlag = 1,
  scriptLabel = "svaEnmix",
  display = FALSE,
  verbose = FALSE,
  logs = FALSE,
  saveOutputs = FALSE
)
#> 3  surrogate variables explain  91.17398 % of 
#>     data variation

class(svaResult)
#> [1] "dnaEPICO_svaEnmix"
names(svaResult)
#> [1] "targets"      "RGSet"        "svaData"      "mergedPheno"  "analysisData"
#> [6] "plotFiles"    "savedFiles"   "logFile"
dim(svaResult$svaData$sva)
#> [1] 6 3
head(svaResult$mergedPheno[, 1:min(6, ncol(svaResult$mergedPheno))])
#>   Sample_Name Sample_Well Sample_Plate Sample_Group Pool_ID person
#> 1    GroupA_3          H5           NA       GroupA      NA    id3
#> 2    GroupA_2          D5           NA       GroupA      NA    id2
#> 3    GroupB_3          C6           NA       GroupB      NA    id3
#> 4    GroupB_1          F7           NA       GroupB      NA    id1
#> 5    GroupA_1          G7           NA       GroupA      NA    id1
#> 6    GroupB_2          H7           NA       GroupB      NA    id2

The returned object has class dnaEPICO_svaEnmix, and names(svaResult) shows the main SVA components returned by the wrapper. The svaData$sva matrix contains one row per sample and one column per surrogate variable. The mergedPheno table is often the most convenient object to inspect because it combines the original phenotype data and the new SVA columns in one place.
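The idea behind a merged phenotype table can be sketched in base R: align the surrogate-variable matrix to the phenotype rows by sample identifier, then bind the columns. The values and the column names sv1 and sv2 below are made up for illustration, and the actual column naming in mergedPheno may differ.

```r
pheno <- data.frame(
  Sample_Name = c("GroupA_3", "GroupA_2", "GroupB_3"),
  Sample_Group = c("GroupA", "GroupA", "GroupB")
)

# Toy surrogate-variable matrix whose rows are in a different order.
sva <- matrix(
  c(0.5, -0.2, -0.3,
    0.1,  0.4, -0.5),
  nrow = 3,
  dimnames = list(c("GroupB_3", "GroupA_3", "GroupA_2"), c("sv1", "sv2"))
)

# Align by sample identifier rather than by row position.
idx <- match(pheno$Sample_Name, rownames(sva))
stopifnot(!anyNA(idx))

merged <- cbind(pheno, sva[idx, , drop = FALSE])
rownames(merged) <- NULL
merged
```

Matching on the identifier guards against silent misalignment when the two objects were created or sorted independently.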

Arguments:

  • phenoFile points to the phenotype table that already includes the cell composition estimates from preprocessing.
  • rgsetData points to the saved RGSet used to derive ENmix control-probe surrogate variables.
  • SampleID, SentrixIDColumn, and SentrixPositionColumn identify the sample and array-position columns used in the SVA analysis.
  • ctrlSvaPercVar = 0.90 asks the procedure to keep enough control-derived surrogate variables to explain 90% of the control-probe variance.
  • ctrlSvaFlag = 1 enables the control-based SVA workflow.
  • display = FALSE, verbose = FALSE, logs = FALSE, and saveOutputs = FALSE keep the example interactive and in-memory.
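To make the role of ctrlSvaPercVar concrete, the sketch below selects the smallest number of principal components whose cumulative variance reaches a 90% target. It uses simulated data and stats::prcomp(); it is an illustration of the general idea under these assumptions, not the ENmix implementation, which may preprocess the control-probe intensities differently.

```r
set.seed(1)

# Toy "control probe" matrix: 12 samples (rows) x 40 probes (columns).
# Two latent batch factors drive most of the variation.
n_samples <- 12
latent <- matrix(rnorm(n_samples * 2), nrow = n_samples)
loadings <- matrix(rnorm(2 * 40), nrow = 2)
noise <- matrix(rnorm(n_samples * 40, sd = 0.3), nrow = n_samples)
ctrl <- latent %*% loadings + noise

# Principal components of the centred and scaled control matrix.
pca <- prcomp(ctrl, center = TRUE, scale. = TRUE)
var_explained <- pca$sdev^2 / sum(pca$sdev^2)
cum_var <- cumsum(var_explained)

# Smallest number of components whose cumulative variance reaches the target.
perc_var <- 0.90
n_sv <- which(cum_var >= perc_var)[1]

# Sample-by-component score matrix, analogous in shape to svaData$sva.
sv <- pca$x[, seq_len(n_sv), drop = FALSE]
dim(sv)
round(cum_var[seq_len(n_sv)], 3)
```

A higher target keeps more components and captures more technical variation, at the cost of more covariates in downstream models.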

The association analysis between the surrogate variables and array-position metadata is stored separately in analysisData.

class(svaResult$analysisData)
#> [1] "dnaEPICO_svaEnmix_analysis"
names(svaResult$analysisData)
#> [1] "sva"             "K"               "sentrixID"       "sentrixPosition"
#> [5] "fullModels"      "reducedModels"   "droptermSteps"   "anovaFull"      
#> [9] "anovaReduced"

This object contains the fitted models and ANOVA summaries used to assess how the surrogate variables relate to Sentrix chip and position effects.
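The kind of analysis summarised in analysisData can be approximated with base-R linear models. The sketch below simulates one surrogate variable with a chip effect, fits a full and a reduced model, and compares them. The data, the variable names, and the choice of lm(), drop1(), and anova() are assumptions made for illustration; the models fitted inside svaEnmix() may be specified differently.

```r
set.seed(2)

# Toy design: 12 samples on 2 chips x 6 positions (hypothetical values).
design <- data.frame(
  sentrixID = factor(rep(c("chipA", "chipB"), each = 6)),
  sentrixPosition = factor(rep(paste0("R0", 1:6, "C01"), times = 2))
)

# Simulated surrogate variable with a chip effect and no position effect.
design$sv1 <- ifelse(design$sentrixID == "chipB", 2, 0) + rnorm(12, sd = 0.5)

full_model <- lm(sv1 ~ sentrixID + sentrixPosition, data = design)
reduced_model <- lm(sv1 ~ sentrixID, data = design)

# Single-term deletions: F-test for dropping each term from the full model.
drop_steps <- drop1(full_model, test = "F")
drop_steps

# Nested-model comparison: does position add anything beyond chip?
anova(reduced_model, full_model)
```

With real data, a strong chip or position term suggests that the surrogate variable is capturing a technical batch effect.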

7 Step 3: preprocessingPheno

The phenotype-preparation wrapper aligns the phenotype table with the beta, M-value, and copy-number matrices; splits the data by timepoint; constructs a combined longitudinal object; and prepares Clock Foundation export tables.

7.0.1 Create example input files

The next chunk shows how to recreate the temporary phenotype and matrix files used in this example. The helper writes a phenotype table plus aligned beta, M-value, and copy-number objects to a temporary directory.

pheno_inputs <- dnaEPICO:::examplePreprocessingPhenoStateDnaEpico()

names(pheno_inputs)
#>  [1] "tempDir"           "pheno"             "phenoPath"        
#>  [4] "betaPath"          "mPath"             "cnPath"           
#>  [7] "metricsData"       "timepointData"     "combinedData"     
#> [10] "clockFoundation"   "preprocessingData"
pheno_inputs$tempDir
#> [1] "/tmp/Rtmp6rWi3o/dnaEPICO-preprocessingPheno-example-3a4cfb6cbe96f0"
basename(c(
  pheno_inputs$phenoPath,
  pheno_inputs$betaPath,
  pheno_inputs$mPath,
  pheno_inputs$cnPath
))
#> [1] "phenoLC.csv" "beta.RData"  "m.RData"     "cn.RData"

pheno_inputs is a list that bundles both the temporary files and the corresponding in-memory objects used in this stage. names(pheno_inputs) shows that it contains the temporary directory, the phenotype table, the paths to the saved phenotype file and the three saved matrix files (phenoPath, betaPath, mPath, and cnPath), and precomputed helper objects such as metricsData, timepointData, combinedData, clockFoundation, and preprocessingData.

phenoResult <- preprocessingPheno(
  phenoFile = pheno_inputs$phenoPath,
  betaPath = pheno_inputs$betaPath,
  mPath = pheno_inputs$mPath,
  cnPath = pheno_inputs$cnPath,
  SampleID = "Sample_Name",
  timeVar = "Timepoint",
  timepoints = "1,2",
  combineTimepoints = "1,2",
  outputPheno = file.path(pheno_inputs$tempDir, "data", "preprocessingPheno"),
  outputRData = file.path(
    pheno_inputs$tempDir,
    "rData",
    "preprocessingPheno",
    "metrics"
  ),
  outputRDataMerge = file.path(
    pheno_inputs$tempDir,
    "rData",
    "preprocessingPheno",
    "mergeData"
  ),
  sexColumn = "Sex",
  outputLogs = file.path(pheno_inputs$tempDir, "logs"),
  outputDir = file.path(pheno_inputs$tempDir, "clockFoundation"),
  verbose = FALSE,
  logs = FALSE,
  saveOutputs = FALSE
)

class(phenoResult)
#> [1] "dnaEPICO_preprocessingPheno"
names(phenoResult)
#> [1] "pheno"           "metricsData"     "timepointData"   "combinedData"   
#> [5] "clockFoundation" "savedFiles"      "logFile"
names(phenoResult$combinedData)
#> [1] "timepoints" "suffix"     "pheno"      "phenoBeta"
head(phenoResult$combinedData$phenoBeta[, 1:6])
#>   Sample_Name Timepoint    Sex Age cg00000029 cg00000108
#> 1          S1         1 Female  20       0.20       0.60
#> 2          S2         1   Male  22       0.25       0.55
#> 3          S3         2 Female  21       0.22       0.52
#> 4          S4         2   Male  23       0.27       0.58

The returned object has class dnaEPICO_preprocessingPheno, and names(phenoResult) shows the main returned components. The combinedData element groups the combined longitudinal objects, and phenoResult$combinedData$phenoBeta is the key downstream table for modelling. It combines phenotype columns and CpG beta values in a single data frame that can be passed to the GLM or GLMM wrappers.

Arguments:

  • phenoFile, betaPath, mPath, and cnPath define the phenotype table and the three methylation matrices that are aligned in this step.
  • SampleID = "Sample_Name" is the key used to match samples across the loaded objects.
  • timeVar = "Timepoint", timepoints = "1,2", and combineTimepoints = "1,2" define the separate exports and the combined longitudinal object.
  • outputPheno, outputRData, outputRDataMerge, and outputDir still need to be specified even when saveOutputs = FALSE, because they define the canonical output structure for the wrapper.
  • sexColumn = "Sex" identifies the sex variable used by the helper functions.
  • verbose = FALSE, logs = FALSE, and saveOutputs = FALSE keep the example focused on the returned object rather than file writing.
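The alignment, splitting, and combining described in this section can be sketched with base R. The toy values mirror the phenoBeta preview shown above, but the steps themselves are an assumed simplification and not the wrapper's actual code: samples are matched by identifier, the phenotype table is split by Timepoint, and the transposed beta matrix is bound to the phenotype columns.

```r
pheno <- data.frame(
  Sample_Name = c("S1", "S2", "S3", "S4"),
  Timepoint = c(1, 1, 2, 2),
  Sex = c("Female", "Male", "Female", "Male"),
  Age = c(20, 22, 21, 23)
)

# Beta matrix in probes-by-samples orientation, with columns shuffled.
beta <- matrix(
  c(0.27, 0.58,   # S4
    0.20, 0.60,   # S1
    0.22, 0.52,   # S3
    0.25, 0.55),  # S2
  nrow = 2,
  dimnames = list(c("cg00000029", "cg00000108"), c("S4", "S1", "S3", "S2"))
)

# Keep samples present in both objects, in phenotype order.
shared <- intersect(pheno$Sample_Name, colnames(beta))
pheno <- pheno[match(shared, pheno$Sample_Name), , drop = FALSE]
beta <- beta[, shared, drop = FALSE]

# One phenotype table per timepoint.
by_timepoint <- split(pheno, pheno$Timepoint)

# Combined table: phenotype columns followed by one column per CpG.
keep <- pheno$Timepoint %in% c(1, 2)
phenoBeta <- cbind(
  pheno[keep, , drop = FALSE],
  as.data.frame(t(beta[, keep, drop = FALSE]))
)
rownames(phenoBeta) <- NULL
phenoBeta
```

Transposing the matrix is the key step: methylation matrices are stored with probes in rows, whereas modelling tables need one row per sample.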

The Clock Foundation export objects are returned in the same result, which makes it possible to inspect the export-ready tables before deciding whether they should be used in a reporting or export workflow.

names(phenoResult$clockFoundation)
#> [1] "betaCSV" "phenoCF"
head(phenoResult$clockFoundation$phenoCF)
#>   id Timepoint    Sex Age
#> 1 S1         1 Female  20
#> 2 S2         1   Male  22
#> 3 S3         2 Female  21
#> 4 S4         2   Male  23

The clockFoundation element contains the export-ready objects, and phenoResult$clockFoundation$phenoCF is the phenotype table prepared for Clock Foundation use.

8 Summary

In interactive local use, the most important habit is to inspect the returned objects after each step. In practice, the following checks are usually enough to confirm that the workflow is behaving as expected:

  • class(result) to confirm which wrapper produced the object
  • names(result) to see the main return elements
  • head() on the phenotype-like tables
  • class() or dim() on the methylation objects and matrices
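These checks can be collected into a small helper. The function summariseResult() below is a hypothetical convenience wrapper, not part of dnaEPICO, and the toy_result object only imitates the list-based structure of the real result objects.

```r
# Hypothetical helper (not part of dnaEPICO): one-line summary per element.
summariseResult <- function(result) {
  describe <- function(x) {
    if (is.data.frame(x)) {
      sprintf("data.frame [%d x %d]", nrow(x), ncol(x))
    } else if (is.matrix(x)) {
      sprintf("matrix [%d x %d]", nrow(x), ncol(x))
    } else if (is.list(x)) {
      sprintf("list of %d", length(x))
    } else {
      sprintf("%s (length %d)", class(x)[1], length(x))
    }
  }
  data.frame(
    element = names(result),
    summary = vapply(result, describe, character(1), USE.NAMES = FALSE),
    row.names = NULL
  )
}

# Toy object that imitates a list-based result with an S3 class.
toy_result <- structure(
  list(
    targets = data.frame(Sample_Name = c("S1", "S2"), Sex = c(1, 0)),
    metricsData = list(beta = matrix(0.5, 3, 2), m = matrix(0, 3, 2)),
    logFile = NULL
  ),
  class = "toy_result"
)

class(toy_result)
result_summary <- summariseResult(toy_result)
result_summary
```

The data frame check comes before the list check because a data frame is also a list in R.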

The next vignette shows how to use saveOutputs = TRUE and the Makefile workflow. The file-based vignette also describes the regression and mixed-effects model structures used downstream, including how phenotype-specific PRS terms are added through prsMap.

9 Basics

Date the vignette was generated.

#> [1] "2026-04-22 21:02:06 EDT"

Wallclock time spent generating the vignette.

#> Time difference of 1.358 mins

R session information.

#> R version 4.6.0 RC (2026-04-17 r89917)
#> Platform: x86_64-pc-linux-gnu
#> Running under: Ubuntu 24.04.4 LTS
#> 
#> Matrix products: default
#> BLAS:   /home/biocbuild/bbs-3.23-bioc/R/lib/libRblas.so 
#> LAPACK: /usr/lib/x86_64-linux-gnu/lapack/liblapack.so.3.12.0  LAPACK version 3.12.0
#> 
#> locale:
#>  [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C              
#>  [3] LC_TIME=en_GB              LC_COLLATE=C              
#>  [5] LC_MONETARY=en_US.UTF-8    LC_MESSAGES=en_US.UTF-8   
#>  [7] LC_PAPER=en_US.UTF-8       LC_NAME=C                 
#>  [9] LC_ADDRESS=C               LC_TELEPHONE=C            
#> [11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C       
#> 
#> time zone: America/New_York
#> tzcode source: system (glibc)
#> 
#> attached base packages:
#> [1] parallel  stats4    stats     graphics  grDevices utils     datasets 
#> [8] methods   base     
#> 
#> other attached packages:
#>  [1] IlluminaHumanMethylation450kanno.ilmn12.hg19_0.6.1
#>  [2] IlluminaHumanMethylation450kmanifest_0.4.0        
#>  [3] minfi_1.57.0                                      
#>  [4] bumphunter_1.53.0                                 
#>  [5] locfit_1.5-9.12                                   
#>  [6] iterators_1.0.14                                  
#>  [7] foreach_1.5.2                                     
#>  [8] Biostrings_2.79.5                                 
#>  [9] XVector_0.51.0                                    
#> [10] SummarizedExperiment_1.41.1                       
#> [11] Biobase_2.71.0                                    
#> [12] MatrixGenerics_1.23.0                             
#> [13] matrixStats_1.5.0                                 
#> [14] GenomicRanges_1.63.2                              
#> [15] Seqinfo_1.1.0                                     
#> [16] IRanges_2.45.0                                    
#> [17] S4Vectors_0.49.2                                  
#> [18] BiocGenerics_0.57.1                               
#> [19] generics_0.1.4                                    
#> [20] dnaEPICO_0.99.14                                  
#> [21] BiocStyle_2.39.0                                  
#> 
#> loaded via a namespace (and not attached):
#>   [1] RColorBrewer_1.1-3        jsonlite_2.0.0           
#>   [3] magrittr_2.0.5            GenomicFeatures_1.63.2   
#>   [5] rmarkdown_2.31            BiocIO_1.21.0            
#>   [7] vctrs_0.7.3               multtest_2.67.0          
#>   [9] memoise_2.0.1             Rsamtools_2.27.2         
#>  [11] DelayedMatrixStats_1.33.0 RCurl_1.98-1.18          
#>  [13] ENmix_1.47.3              askpass_1.2.1            
#>  [15] htmltools_0.5.9           S4Arrays_1.11.1          
#>  [17] AnnotationHub_4.1.0       dynamicTreeCut_1.63-1    
#>  [19] curl_7.1.0                Rhdf5lib_1.99.6          
#>  [21] RPMM_1.25                 SparseArray_1.11.13      
#>  [23] rhdf5_2.55.16             sass_0.4.10              
#>  [25] KernSmooth_2.23-26        nor1mix_1.3-3            
#>  [27] bslib_0.10.0              httr2_1.2.2              
#>  [29] plyr_1.8.9                impute_1.85.0            
#>  [31] cachem_1.1.0              GenomicAlignments_1.47.0 
#>  [33] lifecycle_1.0.5           pkgconfig_2.0.3          
#>  [35] Matrix_1.7-5              R6_2.6.1                 
#>  [37] fastmap_1.2.0             digest_0.6.39            
#>  [39] siggenes_1.85.0           reshape_0.8.10           
#>  [41] minfiData_0.57.0          AnnotationDbi_1.73.1     
#>  [43] irlba_2.3.7               ExperimentHub_3.1.0      
#>  [45] geneplotter_1.89.0        RSQLite_2.4.6            
#>  [47] base64_2.0.2              filelock_1.0.3           
#>  [49] httr_1.4.8                abind_1.4-8              
#>  [51] compiler_4.6.0            beanplot_1.3.1           
#>  [53] rngtools_1.5.2            bit64_4.8.0              
#>  [55] doParallel_1.0.17         BiocParallel_1.45.0      
#>  [57] DBI_1.3.0                 gplots_3.3.0             
#>  [59] HDF5Array_1.39.1          MASS_7.3-65              
#>  [61] openssl_2.4.0             rappdirs_0.3.4           
#>  [63] DelayedArray_0.37.1       rjson_0.2.23             
#>  [65] caTools_1.18.3            gtools_3.9.5             
#>  [67] tools_4.6.0               otel_0.2.0               
#>  [69] rentrez_1.2.4             glue_1.8.1               
#>  [71] quadprog_1.5-8            h5mread_1.3.3            
#>  [73] restfulr_0.0.16           nlme_3.1-169             
#>  [75] rhdf5filters_1.23.3       grid_4.6.0               
#>  [77] cluster_2.1.8.2           tzdb_0.5.0               
#>  [79] preprocessCore_1.73.0     tidyr_1.3.2              
#>  [81] data.table_1.18.2.1       hms_1.1.4                
#>  [83] xml2_1.5.2                BiocVersion_3.23.1       
#>  [85] pillar_1.11.1             limma_3.67.3             
#>  [87] genefilter_1.93.0         splines_4.6.0            
#>  [89] dplyr_1.2.1               BiocFileCache_3.1.0      
#>  [91] lattice_0.22-9            survival_3.8-6           
#>  [93] rtracklayer_1.71.3        bit_4.6.0                
#>  [95] GEOquery_2.79.0           annotate_1.89.0          
#>  [97] tidyselect_1.2.1          knitr_1.51               
#>  [99] bookdown_0.46             xfun_0.57                
#> [101] scrime_1.3.7              statmod_1.5.1            
#> [103] yaml_2.3.12               evaluate_1.0.5           
#> [105] codetools_0.2-20          cigarillo_1.1.0          
#> [107] tibble_3.3.1              BiocManager_1.30.27      
#> [109] cli_3.6.6                 xtable_1.8-8             
#> [111] jquerylib_0.1.4           Rcpp_1.1.1-1             
#> [113] dbplyr_2.5.2              png_0.1-9                
#> [115] XML_3.99-0.23             readr_2.2.0              
#> [117] blob_1.3.0                mclust_6.1.2             
#> [119] doRNG_1.8.6.3             sparseMatrixStats_1.23.0 
#> [121] bitops_1.0-9              illuminaio_0.53.0        
#> [123] purrr_1.2.2               crayon_1.5.3             
#> [125] rlang_1.2.0               KEGGREST_1.51.1

9.1 Asking for help

As package developers, we try to explain clearly how to use our packages and in which order to use the functions. But R and Bioconductor have a steep learning curve, so it is critical to learn where to ask for help. We would like to highlight the Bioconductor support site as the main resource for getting help. Please remember to use the dnaEPICO tag and check the older posts. If you want to receive help, please provide a small reproducible example and your session information so the source of the problem can be tracked efficiently.

References

Aryee, Martin J, Andrew E Jaffe, Hector Corrada Bravo, Christine Ladd-Acosta, Andrew P Feinberg, Kasper D Hansen, and Rafael A Irizarry. 2014. “Minfi: a flexible and comprehensive Bioconductor package for the analysis of Infinium DNA methylation microarrays.” Bioinformatics 30 (10): 1363–9. https://doi.org/10.1093/bioinformatics/btu049.

Fortin, Jean-Philippe, Aurélie Labbe, Mathieu Lemire, Brent W Zanke, Thomas J Hudson, Elana J Fertig, Celia MT Greenwood, and Kasper D Hansen. 2014. “Functional normalization of 450k methylation array data improves replication in large cancer studies.” Genome Biology 15 (11): 503. https://doi.org/10.1186/s13059-014-0503-2.

Fortin, Jean-Philippe, Timothy Triche Jr., and Kasper D Hansen. 2017. “Preprocessing, Normalization and Integration of the Illumina HumanMethylationEPIC Array with Minfi.” Bioinformatics 33 (4): 558–60. https://doi.org/10.1093/bioinformatics/btw691.

Maksimovic, Jovana, Lavinia Gordon, and Alicia Oshlack. 2012. “SWAN: Subset quantile Within-Array Normalization for Illumina Infinium HumanMethylation450 BeadChips.” Genome Biology 13 (6): R44. https://doi.org/10.1186/gb-2012-13-6-r44.

Murat, Kubra, Björn Grüning, Pawel W Poterlowicz, Gareth Westgate, Desmond J Tobin, and Krzysztof Poterlowicz. 2020. “Ewastools: Infinium Human Methylation BeadChip pipeline for population epigenetics integrated into Galaxy.” GigaScience 9 (5): giaa049. https://doi.org/10.1093/gigascience/giaa049.

Pidsley, Ruth, Ching Ching Y Wong, Matteo Volta, Katie Lunnon, Jonathan Mill, and Leonard C Schalkwyk. 2013. “A data-driven approach to preprocessing Illumina 450K methylation array data.” BMC Genomics 14: 293. https://doi.org/10.1186/1471-2164-14-293.

Touleimat, Nizar, and Jörg Tost. 2012. “Complete Pipeline for Infinium® Human Methylation 450K BeadChip Data Processing Using Subset Quantile Normalization for Accurate DNA Methylation Estimation.” Epigenomics 4 (3): 325–41. https://doi.org/10.2217/epi.12.21.

Triche, Timothy J, Daniel J Weisenberger, David Van Den Berg, Peter W Laird, and Kimberly D Siegmund. 2013. “Low-level processing of Illumina Infinium DNA Methylation BeadArrays.” Nucleic Acids Research 41 (7): e90. https://doi.org/10.1093/nar/gkt090.

Xu, Zongli, Li Niu, and Jack A Taylor. 2021. “The ENmix DNA methylation analysis pipeline for Illumina BeadChip and comparisons with seven other preprocessing pipelines.” Clinical Epigenetics 13 (1): 216. https://doi.org/10.1186/s13148-021-01207-1.