\name{GOHyperG}
\alias{GOHyperG}
\title{(DEPRECATED) Hypergeometric Tests for GO}
\description{
  Use \code{hyperGTest} instead.

  Given a set of unique Entrez Gene identifiers, a microarray annotation
  data package name, and the GO category of interest, this function
  computes Hypergeometric p-values for overrepresentation of each GO
  term in the specified category among the GO annotations for the
  interesting genes (as indicated by the Entrez Gene ids).
}
\usage{
GOHyperG(x, lib, what="MF", universe=NULL)
}
\arguments{
  \item{x}{A character vector of unique Entrez Gene identifiers.}
  \item{lib}{The name of the annotation data package for the chip that
    was used, or \code{"YEAST"}; see details for more information.}
  \item{what}{One of \code{"MF"}, \code{"BP"}, or \code{"CC"},
    indicating which of the GO categories to use for the computation.
    In \code{GOKEGGHyperG}, \code{what} can also be \code{"KEGG"}.}
  \item{universe}{A character vector of unique Entrez Gene identifiers
    or \code{NULL}. This is the population (the urn) of the
    Hypergeometric test. When \code{NULL} (the default), the population
    is all Entrez Gene ids in the annotation package that have a GO term
    annotation in the specified GO category (see details).}
}
\details{
  The Entrez Gene ids given in \code{x} define the selected set of
  genes. The universe of Entrez Gene ids is determined by the chip
  annotation data package (\code{lib}) or specified by the
  \code{universe} argument, which must be a subset of the Entrez Gene
  ids represented on the chip. Both the selected genes and the universe
  are reduced by removing Entrez Gene ids that do not have any
  annotations in the specified GO category.

  For each GO term in the specified category that has at least one
  annotation in the selected gene set (\code{x}), we determine how many
  of its Entrez Gene annotations are in the universe set and how many
  are in the selected set. With these counts we perform a
  Hypergeometric test using \code{phyper}. This is equivalent to using
  Fisher's exact test.
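
  As a rough illustration of that equivalence, the sketch below compares
  an upper-tail \code{phyper} call with a one-sided \code{fisher.test}
  for a single GO term. The counts are hypothetical and chosen only for
  illustration; the variable names echo the components of the returned
  list, and the call shown is an assumption about the form of the test,
  not necessarily the exact call made internally.

  \preformatted{
## Hypothetical counts, for illustration only
numLL    <- 5000  # universe: Entrez Gene ids annotated in the category
numInt   <- 25    # selected genes annotated in the category
goCount  <- 200   # universe genes annotated at this GO term
intCount <- 5     # selected genes annotated at this GO term

## Upper tail P(X >= intCount) of the Hypergeometric distribution
pHyper <- phyper(intCount - 1, goCount, numLL - goCount, numInt,
                 lower.tail = FALSE)

## The same question as a one-sided Fisher's exact test on a 2 x 2 table
## (rows: selected / not selected; columns: at term / not at term)
tab <- matrix(c(intCount, goCount - intCount,
                numInt - intCount,
                numLL - goCount - (numInt - intCount)),
              nrow = 2)
pFisher <- fisher.test(tab, alternative = "greater")$p.value

## The two p-values should agree up to numerical tolerance
all.equal(pHyper, pFisher)
  }
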
  It is important that the correct chip annotation data package be
  identified, as it determines the GO term to Entrez Gene id mapping as
  well as the universe of Entrez Gene ids when the \code{universe}
  argument is omitted.

  For S. cerevisiae, if the \code{lib} argument is set to
  \code{"YEAST"}, then comparisons and statistics are computed using
  common names and are with respect to all genes annotated in the
  S. cerevisiae genome, not with respect to any microarray chip. This
  will \bold{not} be the right thing to do if you are working with a
  yeast microarray.
}
\note{
  Typically, one has a set of interesting genes or probes obtained from
  a microarray experiment and wants to determine whether these genes
  are overrepresented at particular GO terms. \code{GOHyperG} carries
  out simple Hypergeometric tests to assess the overrepresentation of
  GO terms.

  Two substantial issues arise. First, it is not clear how to do any
  form of p-value correction. The tests are not independent, and the
  underlying structure of the GO graph presents certain problems that
  need to be addressed.

  The second substantial issue is that not all probes on a microarray
  map to a unique Entrez Gene identifier. In \code{GOHyperG}, every
  attempt has been made to correct appropriately for non-uniqueness of
  mappings.
}
\value{
  The returned value is a list with components:
  \item{pvalues}{The ordered p-values.}
  \item{goCounts}{The vector of counts of Entrez Gene ids from the
    universe at each node.}
  \item{intCounts}{The vector of counts of the supplied Entrez Gene ids
    annotated at each GO term.}
  \item{numLL}{The number of unique Entrez Gene ids in the universe
    that are mapped to some term in the specified GO category.}
  \item{numInt}{The number of unique Entrez Gene ids in the selected
    gene set, \code{x}, that are mapped to some term in the specified GO
    category.}
  \item{chip}{A string identifying the chip annotation data package
    used.}
  \item{intLLs}{The input vector \code{x}.}
  \item{go2Affy}{A list with one element for each GO term tested,
    containing the Affymetrix identifiers associated with that node,
    for the whole chip (not just the interesting genes). This is the
    same as extracting the tested GO ids from the annotation package's
    GO2ALLPROBES environment.}
}
\author{R. Gentleman}
\seealso{
  \code{\link{hyperGTest}},
  \code{\link[Category]{geneKeggHyperGeoTest}},
  %-does not exist \code{\link[Category]{geneCategoryHyperGeoTest}}
  \code{\link[stats:Hypergeometric]{phyper}}
}
\examples{
\dontrun{
library("hgu95av2.db")
library("GO.db")

## all unique Entrez Gene ids on the chip
w1 <- as.list(hgu95av2ENTREZID)
w2 <- unique(unlist(w1))

set.seed(123)
## pick 25 interesting genes
myLL <- sample(w2, 25)

xx <- GOHyperG(myLL, lib="hgu95av2.db", what="CC")
xx$numLL
xx$numInt
sum(xx$pvalues < 0.01)
}
}
\keyword{htest}