coalescent with epistasis

Fearnhead P 2003 Ancestral processes for non-neutral models of complex diseases. Theor Popul Biol 63:115-130.

  • if we assume a multiplicative model of fitnesses at these loci (see Risch, 1990), then there is independence of the genealogies (and also the gene-frequencies) at different loci
  • for linked, non-neutral loci, the genealogical history of these loci is described by a further Coalescent-type process, the ancestral influence graph (AIG) of Donnelly and Kurtz (1999)
  • we consider an extension of the AIG, which we call the complex selection graph (CSG), which describes the genealogical processes at unlinked, non-neutral loci
  • the derivation of the CSG is based on considering the limit as the recombination rate in the AIG tends to infinity, whilst keeping the selection rate fixed
  • the proof that the CSG represents this limiting process is given in the appendix
  • while the gene-interactions do not produce linkage disequilibrium, they can produce large dependencies between the allele frequencies at different loci
  • an AIG (Donnelly and Kurtz, 1999) is a supragenealogy for a sample at linked non-neutral loci
  • it can be viewed as an extension of the ancestral selection graph (ASG; Krone and Neuhauser, 1997; Neuhauser and Krone, 1997) to include recombination
  • or an extension of the ancestral recombination graph (Griffiths and Marjoram, 1996b) to include selection
  • let n_k(l) denote the number of chromosomes in the sample that have allele A_k at locus l
  • let n(l) = (n_1(l), ..., n_K(l))
  • all ordered samples consistent with (n(1), ..., n(L)) (that is, with n_k(l) chromosomes carrying allele A_k at locus l, for l = 1, ..., L and k = 1, ..., K) have identical probability
  • one consequence of this is that there is linkage equilibrium across loci
  • conditional on the allele frequencies (n(1), ..., n(L)), the alleles at different loci on the same chromosome are independent
  • whilst selection can create linkage disequilibrium (Hartl and Clark, 1997), recombination, which acts at a much quicker rate, removes it
  • this does not mean that the loci are independent
  • there is dependence in the gene-frequencies at different loci
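
The linkage-equilibrium statement in the notes above (every ordered sample consistent with the allele counts has the same probability) can be checked by brute force on a small case. The following is only an illustrative sketch, not code from the paper: the function names `ordered_samples` and `joint_allele_prob` are hypothetical, alleles are labelled 0, ..., K-1, and the uniform distribution over consistent ordered samples is taken as given from the notes. It enumerates all consistent ordered samples at two loci and confirms that, conditional on the counts, the alleles carried by one chromosome at the two loci are independent.

```python
from fractions import Fraction
from itertools import permutations


def ordered_samples(counts):
    """Return every distinct ordering of alleles at one locus.

    counts[k] is the number of chromosomes carrying allele k at this locus.
    """
    alleles = [k for k, c in enumerate(counts) for _ in range(c)]
    return sorted(set(permutations(alleles)))


def joint_allele_prob(counts1, counts2, a1, a2, chrom=0):
    """P(chromosome `chrom` has allele a1 at locus 1 and a2 at locus 2).

    Assumes (as stated in the notes) that all ordered samples consistent
    with the allele counts at the two loci are equally likely.
    """
    s1 = ordered_samples(counts1)
    s2 = ordered_samples(counts2)
    hits = sum(1 for x in s1 for y in s2 if x[chrom] == a1 and y[chrom] == a2)
    return Fraction(hits, len(s1) * len(s2))


if __name__ == "__main__":
    n1, n2 = (2, 2), (1, 3)  # allele counts at two loci, n = 4 chromosomes
    n = sum(n1)
    for a1 in range(len(n1)):
        for a2 in range(len(n2)):
            joint = joint_allele_prob(n1, n2, a1, a2)
            product = Fraction(n1[a1], n) * Fraction(n2[a2], n)
            # Independence: the joint probability factorises into the
            # product of the per-locus sample frequencies.
            assert joint == product
            print(a1, a2, joint)
```

The check only concerns alleles within a sample given its counts. It says nothing about the joint distribution of the count vectors n(1), ..., n(L) themselves, which is where the dependence between loci noted in the last two bullets enters.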