population bottleneck

Pavlidis P, Jensen JD & Stephan W 2010 Searching for footprints of positive selection in whole-genome SNP data from nonequilibrium populations. Genetics 185:907-922.

  • in natural populations, positive selection may occur simultaneously with demographic changes
  • selective sweeps in nonequilibrium populations may result in a loss of high-frequency-derived variants and violate the assumptions of SweepFinder and the ω-statistic
  • the combined action of selective sweeps and bottlenecks results in SFS that differ considerably from those generated by selective sweeps in equilibrium populations
  • the joint effect of selection and population contraction increases the probability of coalescences
  • the most recent common ancestor is located within the bottleneck phase
  • the frequency of the n − 1 class vanishes in the present-day sample
  • the part of the genealogy that is older than the selective sweep/bottleneck phase is eliminated
  • the vast majority of the present-day polymorphisms are younger than the selective sweep
  • SHH vs. neutrality in nonequilibrium populations
  • we focus on two bottleneck scenarios
  • the first one describes a deep and short-lasting bottleneck (model A)
  • the second scenario describes a shallow and long-lasting bottleneck (model B)
  • in both cases the severity (i.e., the product depth × length) is the same (=0.375 in units of 4N)
  • in the deep bottleneck scenario (model A), the depth (present population size / bottlenecked population size) is 500 and the length is 0.00075
  • in the shallow bottleneck scenario (model B), the depth is 20 and the length is 0.01875
  • we fix the number of polymorphic sites (=50) by employing broad uniform priors on θ and accepting only those instances that result in 50 segregating sites
  • when the sweep is either recent or old, the discrimination between neutral and selective models becomes problematic
  • recent or old selection in populations that have experienced deep bottlenecks cannot be discriminated from neutrality
  • when selection has occurred within the bottleneck phase, the false positive rate decreases to 20%, and the true positive rate is 73% for the SVM and about 10% lower for SweepFinder
  • higher discrimination performance is achieved when the sweep completes within the bottleneck
  • this requires unrealistically high values of s
  • in model B (shallow bottleneck), the discrimination performance is slightly better than that of model A
  • again the most challenging scenarios are either recent or old sweeps
  • the performance increases when the sweep occurs within the bottleneck phase
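
The parameters quoted above for the two bottleneck models can be checked with a few lines of arithmetic. This is only a sketch: the `Bottleneck` class and its field names are made up for illustration and do not come from the paper; the depth and length values are the ones listed in the notes.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Bottleneck:
    """Hypothetical container for the bottleneck parameters in the notes."""
    name: str
    depth: float   # present population size / bottlenecked population size
    length: float  # duration of the bottleneck, in units of 4N

    @property
    def severity(self) -> float:
        # Severity as defined in the notes: the product depth x length.
        return self.depth * self.length


MODEL_A = Bottleneck("A (deep, short-lasting)", depth=500, length=0.00075)
MODEL_B = Bottleneck("B (shallow, long-lasting)", depth=20, length=0.01875)

for model in (MODEL_A, MODEL_B):
    print(f"model {model.name}: severity = {model.severity:.3f}")
```

Both models give the same product, which is why the notes treat them as equally severe even though the depth differs by a factor of 25.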
  • distinguishing RHH from neutrality in equilibrium populations
  • recurrent selected substitutions occur randomly along a chromosome according to a time-homogeneous Poisson process at a rate ν per generation
  • well-known patterns of SHH models are modified under RHH
  • the SFS is skewed toward the rare variants
  • the excess of high-frequency-derived alleles decreases
  • Jensen et al. (2007) have shown that it is difficult to separate RHH models from neutrality on the basis of ωMAX-values or site frequency spectrum statistics
  • the ω-statistic and SweepFinder are based on the assumption that a single selective sweep has just been completed
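
The RHH assumption above, that selected substitutions arrive as a time-homogeneous Poisson process, can be sketched as follows. The function name `simulate_sweeps` and its parameters are hypothetical. Drawing positions uniformly along the chromosome is an assumption made here to stand in for "randomly along a chromosome". The sketch generates only event times and positions, not genealogies or polymorphism data.

```python
import random


def simulate_sweeps(rate, generations, chrom_length, seed=None):
    """Draw (time, position) pairs for recurrent selected substitutions.

    rate         -- expected number of substitutions per generation
    generations  -- length of the simulated time window
    chrom_length -- chromosome length (arbitrary units)

    In a time-homogeneous Poisson process, waiting times between
    successive events are exponential with mean 1 / rate.
    Positions are drawn uniformly (an assumption of this sketch).
    """
    rng = random.Random(seed)
    events = []
    t = rng.expovariate(rate)
    while t < generations:
        events.append((t, rng.uniform(0.0, chrom_length)))
        t += rng.expovariate(rate)
    return events


if __name__ == "__main__":
    sweeps = simulate_sweeps(rate=0.01, generations=10_000,
                             chrom_length=1.0, seed=1)
    print(f"{len(sweeps)} sweeps simulated (expected about 100)")
```

Because each waiting time is drawn independently, nothing stops two events from falling close together in time. The successive, nonoverlapping sweeps of the RHH model therefore correspond to the low-rate regime of such a process.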
  • overlapping selective sweeps
  • the RHH model we have used describes successive and nonoverlapping selective events
  • the most extreme scenario, in which beneficial mutations appear at the same site, is described as a "soft" sweep
  • soft sweeps may emerge during the evolution of organisms (e.g., Plasmodium) with high mutation rates
  • conversely, they may be of limited importance in the evolution of D. melanogaster or Homo sapiens, for instance
  • SFS-based approaches may not work under overlapping selective sweeps
  • the class of intermediate-frequency polymorphisms may be quite large
  • LD-based statistics can be useful
  • a multitude of extended haplotypes may exist on the left and right sides of the selected region