\name{Agi4x44PreProcess-package}
\alias{Agi4x44PreProcess-package}
\alias{Agi4x44PreProcess}
\docType{package}
\title{
PreProcessing of Agilent 4x44 array data
}
\description{
Agi4x44PreProcess Package Overview
}
\details{
The package preprocesses Agilent 4x44 array data produced by the
Agilent Feature Extraction (AFE) image analysis software. The AFE
extracts foreground and background signals, as well as some quality flags.
All the extracted information is assembled into the components of an
'RGList' object (see the 'limma' package).
The preprocessing includes background correction, normalization and
filtering of probes according to the quality flags produced by the AFE.

A 'targets' file and the corresponding data files produced by the AFE
image analysis software are required as inputs.

The preprocessing steps are the following:

- Reading the targets file

- Reading the array data samples obtained with AFE

- Background correction

- Normalization between samples

- Filtering probes by their quality flag

- Summarizing replicated probes

- Creating an ExpressionSet object with the processed data

The package also contains two functions that let users explore the
architecture of the chip in terms of probe replication and gene replication.
In the first case, it identifies non-control replicated probes (Probe Sets)
that are spread over the chip, for the purpose of evaluating the
reproducibility of the chip. In the second case, it picks those genes
(according to the ACCNUM code obtained from the corresponding Bioconductor
annotation package) that are interrogated by different probes at different
locations. These groups of genes are termed 'Gene Sets'.

The package also contains standard graphical microarray utilities that
let users evaluate the quality of the data. These graphics also help
to decide which of the foreground and background signals provided by the
AFE are going to be used in the analysis.
A graphical inspection of the data may also help to decide which background
correction and between-sample normalization methods are most suitable.

There are also utility functions that write files at different stages of
the processing protocol. These files include the probe list, with information
such as the quality flag, the normalized intensity and the corresponding
information obtained from the annotation package.
}
\author{
Pedro Lopez-Romero \email{plopez@cnic.es}
}
\references{
Agilent Feature Extraction Reference Guide
\url{http://www.Agilent.com}

Gordon K. Smyth, M. Ritchie, N. Thorne, J. Wettenhall (2007).
limma: Linear Models for Microarray Data User's Guide.

Bolstad, B. M. (2001). Probe level quantile normalization
of high density oligonucleotide array data. Unpublished Manuscript:
\url{http://bmbolstad.com/stuff/qnorm.pdf}

Bolstad, B. M., Irizarry, R. A., Astrand, M., and Speed, T. P. (2003).
A comparison of normalization methods for high density oligonucleotide
array data based on bias and variance. Bioinformatics 19, 185-193.

Smyth, G. K. (2005). Limma: linear models for microarray data.
In: 'Bioinformatics and Computational Biology Solutions Using R
and Bioconductor'. R. Gentleman, V. Carey, S. Dudoit, R. Irizarry, W.
Huber (eds), Springer, New York, pages 397-420.
}
\examples{
\dontrun{
# reading target file and Agilent Feature Extraction data files
targets=read.targets(infile="targets.txt")
dd=read.AgilentFE(targets,makePLOT=TRUE)
}
\dontrun{
data(dd)
data(targets)
}
\dontrun{Non-Control replicated Probes}
\dontrun{
CV.rep.probes(dd,"hgug4112a.db",
              foreground="MeanSignal",raw.data=TRUE,writeR=TRUE,targets)
}
\dontrun{genes replicated - ensembl}
\dontrun{
genes.rpt.agi(dd,annotation.package="hgug4112a.db",raw.data=TRUE,
              WRITE.html=TRUE,REPORT=TRUE)
}
\dontrun{NORMALIZATION (here the foreground and background are chosen)}
\dontrun{
ddNORM=BGandNorm(dd,BGmethod='half',NORMmethod='quantile',
                 foreground='MeanSignal',background='BGMedianSignal',
                 offset=50,makePLOTpre=TRUE,makePLOTpost=TRUE)
}
\dontrun{FILTERING PROBES}
\dontrun{
ddFILT=filter.probes(ddNORM,
                     control=TRUE,
                     wellaboveBG=TRUE,
                     isfound=TRUE,
                     wellaboveNEG=TRUE,
                     sat=TRUE,
                     PopnOL=TRUE,
                     NonUnifOL=TRUE,
                     nas=TRUE,
                     limWellAbove=75,
                     limISF=75,
                     limNEG=75,
                     limSAT=75,
                     limPopnOL=75,
                     limNonUnifOL=75,
                     limNAS=100,
                     makePLOT=TRUE,annotation.package="hgug4112a.db",
                     flag.counts=TRUE,targets)
}
\dontrun{SUMMARIZING PROBES}
\dontrun{
ddPROC=summarize.probe(ddFILT,makePLOT=TRUE,targets)
}
\dontrun{CREATING EXPRESSIONSET OBJECT}
\dontrun{
esetPROC=build.eset(ddPROC,targets,makePLOT=TRUE,
                    annotation.package="hgug4112a.db")
dim(esetPROC)
}
\dontrun{WRITING EXPRESSIONSET OBJECT: ProcessedData.txt}
\dontrun{
write.eset(esetPROC,ddPROC,"hgug4112a.db",targets)
}
\dontrun{MAPPING VARIABLE}
\dontrun{
mappings=build.mappings(esetPROC,annotation.package="hgug4112a.db")
names(mappings)
}
\dontrun{Gene Set Enrichment Analysis at: http://www.broad.mit.edu/gsea}
\dontrun{
gsea.files(esetPROC,targets,annotation.package="hgug4112a.db")
}
}
\keyword{package}