\name{diseq}
\alias{diseq}
\alias{diseq.ci}
\alias{diseq.inner}
\title{Estimate or Compute Confidence Interval for the Single-Marker Disequilibrium}
\description{
  Estimate single-marker (Hardy-Weinberg) disequilibrium, or compute a
  bootstrap confidence interval for the estimate.
}
\usage{
diseq.ci(object, marker, R = 1000, conf = 0.95, correct = TRUE, na.rm =
TRUE, ...)
diseq.inner(object, marker, ...)
}
\arguments{
  \item{object}{ geneSet object}
  \item{marker}{ marker names }
  \item{R}{ Number of bootstrap iterations to use when computing the
    confidence interval. Defaults to 1000.} 
  \item{conf}{ Confidence level to use when computing the confidence
    interval for D-hat.  Defaults to 0.95; should be in (0,1).}
  \item{correct}{See details.}
  \item{na.rm}{ logical. Should missing values be removed?}
  \item{\dots}{ optional additional parameters passed}
}
\details{
  For a single-gene marker, \code{diseq} computes the Hardy-Weinberg
  (dis)equilibrium statistics D, D', r (the correlation coefficient), and
  \eqn{r^2} for each pair of allele values, as well as an overall
  summary value for each measure across all alleles.  \code{print.diseq}
  displays the contents of a \code{diseq} object. \code{diseq.ci}
  computes a bootstrap confidence interval for this estimate.

  For consistency, I have applied the standard definitions for D, D',
  and r from the Linkage Disequilibrium case, replacing all marker 
  probabilities with the appropriate allele probabilities.
  
  Thus, for each allele pair,
  \describe{
    \item{D}{ is defined as half the raw difference between the
      observed and the expected frequency of heterozygotes:
      
      \deqn{%
 	D = \frac{1}{2} ( p_{ij} + p_{ji} ) - p_i p_j  %
      }{%
 	D = 1/2 * ( p(ij) + p(ji) ) - p(i)*p(j)   %
      }
      
    }
    \item{D'}{ rescales D to span the range [-1,1] 
      
      
      \deqn{D' = \frac{D}{D_{max} } }{D' = D / Dmax}
      
      where, if D > 0:
      \deqn{%
  	D_{max} = \min( p_i p_j, p_j p_i ) = p_i p_j   %
      }{%
  	Dmax = min(p(i)p(j), p(j)p(i)) =  p(i)p(j)   %
      } 
      or if D < 0:
      \deqn{%
  	D_{max} = \min( p_i (1 - p_j), p_j (1 - p_i) ) %
      }{%
  	Dmax = min( p(i) * (1 - p(j)), p(j) * (1 - p(i)) )
      }
    }
    
    \item{r}{ is the correlation coefficient between two alleles,
      %ignoring all other alleles,
      and can be computed by
      
      \deqn{%
 	r = \frac{-D}{\sqrt{ p_i (1-p_i) p_j (1-p_j) }} %
      }{%
 	r = -D / sqrt( p(i)*(1-p(i)) * p(j)*(1-p(j)) ) %
      }
    }
    
  }
  
  where
  \itemize{
    \item{-}{ \eqn{p_i}{p(i)} defined as the observed probability of
      allele 'i', }
    \item{-}{\eqn{p_j}{p(j)} defined as the observed probability of
      allele 'j', and }
    \item{-}{\eqn{p_{ij}}{p(ij)} defined as the observed probability of
      the allele pair 'ij'. }
  }
  
  When there are more than two alleles, the summary value for each of
  these statistics is obtained by computing a weighted average of the
  absolute value of the statistic over all allele pairs, where the
  weight is determined by the expected frequency. For example:

   \deqn{%
     D_{overall} = \sum_{i \ne j}  |D_{ij}| * p_{ij} % 
   }{%
     D.overall = sum |D(ij)| * p(ij) %
   }

  Bootstrapping is used to generate confidence intervals in order to
  avoid reliance on parametric assumptions, which will not hold for
  alleles with low frequencies (e.g. \eqn{D'} following a Chi-square
  distribution).

  See the function \code{HWE} from the "genetics" package for testing
  Hardy-Weinberg equilibrium, \eqn{D=0}.
  
}
\author{Gregory R. Warnes \email{warnes@bst.rochester.edu} and Nitin Jain \email{nitin.jain@pfizer.com}}
\examples{
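## Illustrative sketch only: this does not call diseq() or diseq.ci().
## It recomputes D, D' and r for a single allele pair directly from the
## formulas in the Details section, in base R.  The genotype counts and
## all object names below are hypothetical, chosen for illustration.
counts <- c(AA = 30, AB = 40, BB = 30)    # hypothetical genotype counts
n <- sum(counts)

## Observed allele probabilities p(i) and p(j)
p.A <- unname((2 * counts["AA"] + counts["AB"]) / (2 * n))
p.B <- 1 - p.A

## p(ij) + p(ji): observed probability of the unordered pair A/B
p.AB <- unname(counts["AB"] / n)

## D, Dmax, D' and r as defined in Details
D <- 0.5 * p.AB - p.A * p.B
Dmax <- if (D > 0) p.A * p.B else min(p.A * (1 - p.B), p.B * (1 - p.A))
Dprime <- D / Dmax
r <- -D / sqrt(p.A * (1 - p.A) * p.B * (1 - p.B))
c(D = D, Dprime = Dprime, r = r)

## A percentile bootstrap interval for D, resampling individuals with
## replacement.  This sketches the general idea of a bootstrap interval;
## it is an assumption that diseq.ci works along similar lines, and its
## details (for example the 'correct' argument) may differ.
set.seed(1)
geno <- rep(names(counts), times = counts)  # one genotype per individual
D.hat <- function(g) {
  pA <- (2 * sum(g == "AA") + sum(g == "AB")) / (2 * length(g))
  0.5 * mean(g == "AB") - pA * (1 - pA)
}
boot.D <- replicate(1000, D.hat(sample(geno, replace = TRUE)))
quantile(boot.D, c(0.025, 0.975))           # 95 percent interval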
}
\keyword{ misc}