population bottleneck

Wilson BA, Petrov DA & Messer PW 2014 Soft selective sweeps in complex demographic scenarios. Genetics 198:669-684.

  • population bottlenecks can lead to the removal of all but one adaptive lineage from an initially soft selective sweep
  • “hardening” of soft selective sweeps
  • even in the global human population, adaptation has produced soft selective sweeps, as evidenced by the parallel evolution of lactase persistence in Eurasia and Africa through recurrent mutations in the lactase enhancer
  • some of these sweeps arose from standing genetic variation while others involved recurrent de novo mutation
  • we focus on the latter scenario of adaptation arising from de novo mutation
  • our current understanding of the likelihood of soft sweeps relies on the assumption of a Wright-Fisher model with fixed population size, where Θ remains constant over time
  • this assumption is clearly violated in many species
  • strong selection is often more likely to produce soft sweeps than weak selection when population size fluctuates
  • population bottlenecks were simulated through a single-generation downsampling to size N2 (without selection) every ΔT generations
  • empirical probabilities of observing a soft sweep in a given simulation run were obtained by calculating the expected probability that two randomly drawn adaptive lineages are not identical by descent, based on the population frequencies of all adaptive lineages in the population at the time of sampling
  • we ignore back mutations and consider the dynamics of the two alleles at this locus in isolation
  • there is no interaction with other alleles elsewhere in the genome
  • the distinction between a hard and a soft sweep is based on the genealogy of adaptive alleles in a population sample
  • it is therefore possible that the same adaptive event yields a soft sweep in one sample but remains hard in another, depending on which individuals are sampled
  • Θ = 2NUA is the population-scale mutation rate—twice the number of adaptive alleles that enter the population per generation
  • under the constant-size Wright-Fisher model, the probability of a soft sweep is primarily determined by Θ and is nearly independent of the strength of selection
  • when Θ ≪ 1, adaptive mutations are not readily available in the population and adaptation is impeded by the waiting time until the first successful adaptive mutation arises
  • this regime is referred to as the mutation-limited regime
  • adaptation from de novo mutation typically produces hard sweeps in this case
  • when Θ ≥ 1, by contrast, adaptive mutations arise at least once per generation on average
  • in this non-mutation-limited regime, soft sweeps predominate
  • mutation and selection operate only during the phases when the population is large
  • the two alleles, a and A, are neutral with respect to each other and no new mutations occur during a bottleneck
  • this assumption is justified for severe bottlenecks with N2 ≪ N1 and when bottlenecks are neutral demographic events
  • many effects of a population bottleneck depend primarily on the ratio of its duration over its severity
  • most of the results we derive below should therefore be readily applicable to more complex bottleneck scenarios by mapping the real bottleneck onto an effective single-generation bottleneck, provided that the real bottleneck is not long enough that beneficial mutations appear during the bottleneck
  • adaptive mutations establish during the large phases at an approximate rate Θs
  • successfully establishing mutations reach their establishment frequency fast compared to the timescale ΔT between bottlenecks
  • establishment can be effectively modeled by a Poisson process
  • this assumption is reasonable when selection is strong and the establishment frequency low
  • those adaptive mutations that do reach establishment frequency typically achieve this quickly in ~γ/s generations, where γ ≈ 0.577 is the Euler–Mascheroni constant
  • τ'est = 1 / (Θs) + log(N1s / N2) / s ... (2)
  • hardening should occur whenever Θ ≥ 1 and at the same time ΔT < τ'est ... (3)
  • in the scenario where Θ = 2, N1/N2 = 10⁴, and ΔT = 100 generations, an adaptive allele with s = 0.056 almost always (90%) produced a hard sweep in our simulations, whereas an allele with s = 0.1 mostly (57%) produced a soft sweep
  • even in the same demographic scenario, the probability of observing soft sweeps can differ substantially for weakly and strongly selected alleles
  • the stronger the selective sweep, the higher the chance that it will be soft in a population that fluctuates in size
  • Otto and Whitlock (1997) showed that the fixation process of an adaptive allele depends on the timescale of the fixation itself
  • only short-term demographic changes encountered during the fixation event matter for strongly selected alleles
  • slower changes affect only weakly selected alleles
  • Otto and Whitlock (1997) therefore concluded that “there is no single effective population size that can be used to determine the probability of fixation for all new beneficial mutations in a population of changing size” (p. 728)
  • our own species has likely experienced population-size changes over more than three orders of magnitude within the past 1000 generations (Gazave et al. 2014)
  • the fitness advantage during the evolution of drug resistance in pathogens or pesticide resistance in insects can be on the order of 10% or larger
  • let us consider another example, motivated by the proposed recent demographic history of the European human population (Coventry et al. 2010; Nelson et al. 2012; Tennessen et al. 2012; Gazave et al. 2014)
  • we assume demographic parameters similar to those estimated by Gazave et al. (2014), i.e., an ancestral population size of Nanc = 10⁴, followed by exponential growth over a period of 113 generations, reaching a current size of Ncur ≈ 520,000 individuals
  • we further assume that exponential growth halts at present and that population size remains constant thereafter
  • this scenario is qualitatively different from the previously discussed models in that population-size changes are nonrecurring
  • for determining whether a given selective sweep will likely be hard or soft in this model, its starting time becomes of crucial importance
  • we assume an adaptive mutation rate of UA = 5 × 10⁻⁷ for this example to illustrate the transition between mutation-limited behavior in the ancestral population, where Θanc = 4NancUA ≈ 0.02, and non-mutation-limited behavior in the current population, Θcur = 4NcurUA ≈ 1.0
  • this adaptive mutation rate is higher than the single nucleotide mutation rate in humans
  • it may be appropriate for describing adaptations that have larger mutational target size, such as loss-of-function mutations or changes in the expression level of a gene
  • all sweeps that start prior to the expansion are hard in a sample of size 10, as expected for adaptation by de novo mutation in a mutation-limited scenario
  • sweeps starting in the current, non-mutation-limited regime are almost entirely soft, regardless of the strength of selection
  • sweeps starting during the expansion phase show an interesting crossover behavior between hard and soft sweeps
  • the strength of selection becomes important in this case
  • sweeps that start during the expansion have a higher probability of producing soft sweeps when they are driven by weaker selection than when they are driven by stronger selection
  • in a growing population, a weaker sweep will experience larger population sizes during its course than a stronger sweep starting at the same time, increasing its probability of becoming soft
  • the effective Θ determining the probability of soft sweeps is not the same for different loci across the genome
  • mutational target sizes and thus adaptive mutation rates vary at different loci
  • no single value of Θ will be appropriate for describing the entire adaptive dynamics of a population
  • estimators based on the levels of neutral diversity in a population, such as Θπ and Watterson's ΘW (Ewens 2004), can be strongly biased downward by ancient bottlenecks and recurrent linked selection
  • if adaptation is limited by mutational input, then most adaptive mutations should arise during the population booms, biasing us toward seeing more soft sweeps
  • it is also possible—maybe even more probable—that adaptation will be common during periods of population decline
  • if adaptation is more common during population busts, this should lead us to observe more hard sweeps
  • we have considered only scenarios in which population size and selection coefficients are independent of each other
  • models that consider population size and fitness in a unified framework will be necessary to fully understand signatures that adaptation leaves in populations of variable size
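
The hardening condition in Equations 2 and 3 of these notes can be checked numerically against the worked scenario (Θ = 2, N1/N2 = 10⁴, ΔT = 100). The following is a minimal sketch, not the authors' code: the function names are hypothetical, and it assumes that "log" in Equation 2 is the natural logarithm.

```python
import math


def tau_est_prime(theta, s, n1_over_n2):
    """Equation 2 of the notes: 1/(theta*s) + log(N1*s/N2)/s.

    Assumption: 'log' is the natural logarithm.
    """
    return 1.0 / (theta * s) + math.log(n1_over_n2 * s) / s


def hardening_expected(theta, s, n1_over_n2, delta_t):
    """Equation 3 of the notes: hardening when theta >= 1 and delta_t < tau'_est."""
    return theta >= 1 and delta_t < tau_est_prime(theta, s, n1_over_n2)


if __name__ == "__main__":
    # Scenario quoted in the notes: theta = 2, N1/N2 = 1e4, delta_t = 100.
    for s in (0.056, 0.1):
        tau = tau_est_prime(2.0, s, 1e4)
        verdict = "hardening expected" if hardening_expected(2.0, s, 1e4, 100) else "no hardening expected"
        print(f"s = {s}: tau'_est = {tau:.1f} generations, {verdict}")
```

With these inputs the condition predicts hardening for s = 0.056 but not for s = 0.1, which agrees in direction with the simulation outcomes quoted above (mostly hard versus mostly soft).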
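
The empirical soft-sweep probability described in the notes (two randomly drawn adaptive lineages not identical by descent) can be sketched as one minus the sum of squared lineage frequencies. This is an illustrative sketch under stated assumptions, not the paper's implementation: the function name is hypothetical, each independent mutational origin is assumed to be tracked as its own lineage, frequencies are renormalized among adaptive lineages only, and the two draws are treated as sampling with replacement (a reasonable approximation in a large population).

```python
def prob_two_lineages_differ(lineage_freqs):
    """Probability that two random adaptive lineages have different origins.

    lineage_freqs: population frequencies of each adaptive lineage
    (one entry per independent mutational origin). They need not sum
    to one; they are renormalized among adaptive lineages here.
    Assumption: draws are with replacement (large-population approximation).
    """
    total = sum(lineage_freqs)
    if total <= 0:
        raise ValueError("at least one adaptive lineage must be present")
    return 1.0 - sum((f / total) ** 2 for f in lineage_freqs)


if __name__ == "__main__":
    print(prob_two_lineages_differ([1.0]))        # single origin: hard sweep
    print(prob_two_lineages_differ([0.5, 0.5]))   # two equally common origins
    print(prob_two_lineages_differ([0.6, 0.2]))   # sweep not yet fixed
```

A single lineage gives probability 0 (a hard sweep in any sample), and the value rises as the adaptive alleles are spread more evenly across independent origins.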
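
For the human expansion example, the transition from the mutation-limited to the non-mutation-limited regime can be traced by evaluating Θ = 4·N(t)·UA along the growth trajectory. The sketch below uses the parameter values quoted in the notes. The continuous exponential growth curve, the function names, and the constant names are assumptions made for illustration.

```python
import math

N_ANC = 1e4          # ancestral population size (from the notes)
N_CUR = 520_000      # current population size (from the notes)
T_GROWTH = 113       # generations of exponential growth (from the notes)
U_A = 5e-7           # adaptive mutation rate (from the notes)

# Assumption: smooth exponential growth N(t) = N_ANC * exp(r * t).
GROWTH_RATE = math.log(N_CUR / N_ANC) / T_GROWTH


def population_size(t):
    """Population size t generations after the expansion starts (0 <= t <= T_GROWTH)."""
    return N_ANC * math.exp(GROWTH_RATE * t)


def theta(t):
    """Theta = 4 * N(t) * U_A, the scaling used in the human example of the notes."""
    return 4.0 * population_size(t) * U_A


if __name__ == "__main__":
    for t in (0, 30, 60, 90, T_GROWTH):
        print(f"t = {t:3d}: N = {population_size(t):10.0f}, theta = {theta(t):.3f}")
```

Because growth is exponential, most of the increase in Θ falls in the last few tens of generations, which is one way to see why the starting time of a sweep matters so much in this scenario.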