fixation time

Charlesworth B 2020 How long does it take to fix a favorable mutation, and why should we care? Am Nat 195:753-771.

  • R. A. Fisher independently described a deterministic model of selection at an autosomal locus with discrete generations in an article that is the ultimate source of modern population genetics
  • he provided an exact formula for the allele frequency of a favorable allele after n generations of selection, when the fitness of the heterozygote at a biallelic locus is the geometric mean of the fitnesses of the two homozygotes
  • Fisher also introduced the method of branching processes for studying the probability of survival of a new favorable mutation in a very large population
  • for the diffusion equation approach, Fisher assumed binomial sampling from the parental generation to generate a sample of 2N independent alleles in the offspring generation
  • this is now commonly referred to as the Wright-Fisher model, for reasons that are somewhat obscure
  • Fisher proposed the model in 1922 and Sewall Wright initiated his own work on the problem only after meeting Fisher in 1924 (Provine 1986, p. 239)
  • according to Provine (1986, p. 237), Wright had independently derived many of Haldane's results in the manuscript that became his classic article "Evolution in Mendelian Populations" (Wright 1931) but omitted this material when he learned of Haldane's work
  • after corresponding with Wright, Fisher (1930) corrected an error in the use of the diffusion equation in his 1922 article, which arose from his assumption that the expected change under drift was zero for his arc cosine transformation of the allele frequency
  • he used the allele frequency change expression for a semidominant mutation under weak selection to obtain the well-known formula for the fixation probability of a new favorable or deleterious semidominant mutation in a finite population (Fisher 1930, pp. 215-216), as well as the probability distribution of allele frequencies under irreversible mutation
  • Wright (1931) used a somewhat less elegant approach to obtain the same results
  • A2 behaves at first like a neutral variant with initial frequency q0 (assumed to be close to zero)
  • its expected frequency at the end of the initial stochastic phase S1 (including cases when it is lost from the population) is therefore q0
  • its probability of surviving this phase is Q1
  • its expected frequency conditional on survival is therefore q0 / Q1, which can be equated to q1
  • use of Haldane's (1927b) expression for the fixation probability under a Poisson distribution of offspring gives Q1 = s
  • with q0 = 1 / (2N) for a single new mutation, this gives q1 = 1 / (2Ns) in a Wright-Fisher population
  • the mean time to fixation of a favorable mutation (Tf) is not very sensitive to its level of dominance for a given value of the scaled selection coefficient γ = 2N_e s (N_e being the effective population size), provided γ ≫ 1
  • a recessive autosomal mutation in a randomly mating population takes much longer than a semidominant mutation to spread to intermediate frequencies from a low initial frequency
  • a dominant mutation takes much longer to approach fixation from a high frequency
  • the difference between the two classes of results arises from the important role of the stochastic phases in controlling the time taken for the initial spread of a new favorable mutation and the time taken for it to become fixed once it has reached a frequency close to 1, as was first pointed out by Ewens (1968, p. 62) in the context of the fixation of a dominant mutation
  • these phases cover the extreme points of allele frequencies, which is where the effects of dominance are most marked and where the durations of S1 and S2 (T1 and T2) are affected in nearly opposite ways by the level of dominance
  • this buffers the effect of the value of h on the net time to fixation, which decreases as γ becomes larger
  • T1 is always somewhat less than T2
  • favorable completely or partially dominant alleles always take longer on average to fix than completely or partially recessive alleles, as was noted previously by Teshima and Przeworski (2006) and Ewing et al. (2011)
  • the former are more heavily affected by the duration of S2 and the latter by the duration of S1
  • this does not contradict the often-quoted "Haldane's sieve", which favors dominant over recessive alleles in randomly mating populations
  • Haldane's sieve reflects the higher probabilities of fixation of favorable alleles with higher levels of dominance, not their times to fixation conditional on fixation
  • the fixation time itself shows considerable stochastic variation, even in large populations
  • the frequently made statement that the effects of drift relative to selection become insignificant when γ ≫ 1 (e.g., p. 240 of Charlesworth and Charlesworth 2010) is not completely accurate when applied to fixation time
  • to a high level of accuracy, little sensitivity of Tf to N was detected
  • the only sizeable deviations from the predictions of the approximate formulas for Tf were with γ = 2,500 and N = 1,000, for which s = 1.25, corresponding to a fitness of 2.25 for homozygotes for the favorable mutation relative to wild-type homozygotes
  • the results derived here are therefore insensitive to population size when selection is sufficiently weak (s ≈ 0.2), vindicating the use of the diffusion equations on which they rely, even when N is as small as 1,000
  • this raises the question of whether serious inaccuracies are introduced by considering only the deterministic phase when calculating the effects of a sweep with partial or complete dominance or recessivity
  • except with very high levels of dominance or recessivity in combination with random mating, γ values that are greater than 80 should guarantee that most of the time that is relevant to recombination is contributed by the deterministic phase
  • except for relatively small γ values, which are expected to leave relatively small signatures of sweeps (other than at very closely linked sites), the use of the deterministic phase alone should yield accurate results
  • under random mating, however, the expected time during the deterministic phase with h < 0.5 that is spent with q < 0.5 is greater than the time spent with q > 0.5, and vice versa with h > 0.5
  • partially recessive autosomal mutations or partially recessive X-linked mutations that are not male limited in their fitness effects should cause smaller overall sweep effects on variability than semidominant or partially dominant mutations despite their slightly shorter times to fixation, as was found to be the case by Teshima and Przeworski (2006), Ewing et al. (2011), and Hartfield and Bataillon (2020) for autosomal loci
  • sweep duration is thus not the only determinant of variability
  • one final question that arises from the results described here is whether fixations of deleterious variants could contribute significantly to the observed signatures of selective sweeps
  • as has been known since the work of Maruyama and Kimura (1974) and is illustrated in figure 4, the time course of a deleterious mutation that is destined to be fixed by drift can be very similar to that of a favorable mutation subject to the same intensity of selection
  • as pointed out by Gillespie (1994), under a model of constant selection on alternative variants at a site, one favored and the other disfavored by selection, and with reversible mutation between them, the equilibrium state has as many substitutions from good to bad as from bad to good, just as under the standard Li-Bulmer model of selection on codon usage
  • the question of whether substitutions of deleterious mutations can significantly affect patterns of variability at closely linked sites needs investigation
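
Two of the formulas noted above can be checked numerically with a short sketch. The first part illustrates Fisher's exact result for the case where heterozygote fitness is the geometric mean of the homozygote fitnesses: the odds q / (1 - q) of the favorable allele are multiplied by a constant factor each generation. The second part reproduces the step from q0 and Q1 to q1 = 1 / (2Ns). The parameterization is an assumption made for illustration, not taken from the article: fitnesses are written as 1, 1 + t, and (1 + t)^2, and Q1 = s is applied as stated in the notes. All function names are hypothetical.

```python
def multiplicative_frequency(q0, t, n):
    """Closed-form allele frequency after n generations.

    Assumes fitnesses 1, 1 + t, (1 + t)**2 (heterozygote = geometric mean),
    so the odds q / (1 - q) grow by a factor (1 + t) per generation.
    """
    odds = q0 / (1.0 - q0) * (1.0 + t) ** n
    return odds / (1.0 + odds)


def iterate_selection(q0, t, n):
    """Same model, iterated generation by generation as a cross-check."""
    q = q0
    w11, w12, w22 = 1.0, 1.0 + t, (1.0 + t) ** 2
    for _ in range(n):
        p = 1.0 - q
        mean_w = p * p * w11 + 2.0 * p * q * w12 + q * q * w22
        # Standard discrete-generation recursion under random mating.
        q = (q * q * w22 + p * q * w12) / mean_w
    return q


def conditional_start_frequency(N, s):
    """q1 = q0 / Q1 with q0 = 1 / (2N) and Q1 = s, as stated in the notes."""
    q0 = 1.0 / (2 * N)
    Q1 = s  # survival probability of the new mutation (assumed value)
    return q0 / Q1


if __name__ == "__main__":
    print(multiplicative_frequency(0.01, 0.05, 100))
    print(iterate_selection(0.01, 0.05, 100))
    print(conditional_start_frequency(1000, 0.05))
```

The closed form and the iterated recursion should agree up to floating-point error, and the last function simply evaluates 1 / (2Ns) for the chosen N and s.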