population bottleneck
Pavlidis P, Jensen JD & Stephan W 2010 Searching for footprints of positive selection in whole-genome SNP data from nonequilibrium populations. Genetics 185:907-922.
- in natural populations, positive selection may occur simultaneously with demographic changes
- selective sweeps in nonequilibrium populations may result in a loss of high-frequency-derived variants and violate the assumptions of SweepFinder and the ω-statistic
- the combined action of selective sweeps and bottlenecks results in SFS that differ considerably from those generated by selective sweeps in equilibrium populations
- the joint effect of selection and population contraction increases the probability of coalescences
- the most recent common ancestor is located within the bottleneck phase
- the frequency of the n − 1 class vanishes in the present-day sample
- the part of the genealogy that is older than the selective sweep/bottleneck phase is eliminated
- the vast majority of the present-day polymorphisms are younger than the selective sweep
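The "n − 1 class" above refers to sites where the derived allele is carried by n − 1 of the n sampled sequences. As a minimal sketch of how the unfolded SFS is tabulated, the Python below assumes a 0/1 coding (0 = ancestral, 1 = derived). The function name `unfolded_sfs` and the toy sample are invented for illustration and are not taken from the paper.

```python
def unfolded_sfs(haplotypes):
    """Return a list whose entry i-1 is the number of sites at which the
    derived allele is carried by exactly i sequences, for i = 1..n-1."""
    n = len(haplotypes)
    n_sites = len(haplotypes[0])
    counts = [0] * (n + 1)
    for site in range(n_sites):
        derived = sum(h[site] for h in haplotypes)
        counts[derived] += 1
    # Drop classes 0 and n: those sites are not polymorphic in the sample.
    return counts[1:n]


# Toy sample (made up): 4 sequences, 5 sites, 0 = ancestral, 1 = derived.
toy_sample = [
    [1, 0, 1, 0, 1],
    [1, 0, 0, 0, 1],
    [1, 1, 0, 0, 1],
    [0, 0, 0, 0, 1],
]

sfs = unfolded_sfs(toy_sample)
print(sfs)       # [2, 0, 1]
print(sfs[-1])   # size of the n - 1 class: 1
```

In this toy sample the last entry is the n − 1 class. According to the notes, this is the class that vanishes when a sweep coincides with a bottleneck.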
- SHH (single hitchhiking) vs. neutrality in nonequilibrium populations
- we focus on two bottleneck scenarios
- the first one describes a deep and short-lasting bottleneck (model A)
- the second scenario describes a shallow and long-lasting bottleneck (model B)
- in both cases the severity (i.e., the product depth × length) is the same (=0.375 in units of 4N)
- in the deep bottleneck scenario, the depth (present population size / bottlenecked population size) is 500 and the length is 0.00075
- in the shallow bottleneck scenario, the depth equals 20 and the length 0.01875
- we fix the number of polymorphic sites (=50) by employing broad uniform priors on θ and accepting only those instances that result in 50 segregating sites
- when the sweep is either recent or old, the discrimination between neutral and selective models becomes problematic
- recent or old selection in populations that have experienced deep bottlenecks cannot be discriminated from neutrality
- when selection has occurred within the bottleneck phase, the false positive rate decreases to 20%, and the true positive rate is 73% for the SVM and about 10% lower for SweepFinder
- higher discrimination performance is achieved when the sweep completes within the bottleneck
- this requires unrealistically high values of s
- in model B (shallow bottleneck), the discrimination performance is slightly better than that of model A
- again the most challenging scenarios are either recent or old sweeps
- the performance increases when the sweep occurs within the bottleneck phase
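The claim that models A and B have the same severity can be checked with a little arithmetic. The Python sketch below recomputes depth × length for both scenarios using the values given in the notes. The names `Bottleneck` and `severity` are illustrative only and do not come from the paper.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Bottleneck:
    depth: float   # present population size / bottlenecked population size
    length: float  # duration of the bottleneck, in units of 4N

    def severity(self) -> float:
        # Severity as defined in the notes: depth x length.
        return self.depth * self.length


model_a = Bottleneck(depth=500, length=0.00075)  # deep, short-lasting
model_b = Bottleneck(depth=20, length=0.01875)   # shallow, long-lasting

for name, model in (("A", model_a), ("B", model_b)):
    print(f"model {name}: severity = {model.severity():.3f}")
```

Both lines print 0.375, so any difference in discrimination performance between the two models comes from how the severity is split between depth and length, not from its total.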
- distinguishing RHH (recurrent hitchhiking) from neutrality in equilibrium populations
- recurrent selected substitutions occur randomly along a chromosome according to a time-homogeneous Poisson process at a rate v per generation
- well-known patterns of SHH models are modified under RHH
- the SFS is skewed toward the rare variants
- the excess of high-frequency-derived alleles decreases
- Jensen et al. (2007) have shown that it is difficult to separate RHH models from neutrality on the basis of ωMAX-values or site frequency spectrum statistics
- the ω-statistic and SweepFinder are based on the assumption that a single selective sweep has just been completed
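The RHH arrival process described above can be sketched in a few lines. The Python below draws sweep times from a time-homogeneous Poisson process by summing exponential waiting times. The rate and time span are arbitrary illustrative values. Drawing each sweep position uniformly on a chromosome scaled to [0, 1) is an assumption made here to represent "randomly along a chromosome". The function name `recurrent_sweeps` is hypothetical.

```python
import random


def recurrent_sweeps(rate_per_generation, n_generations, rng):
    """Return (time, position) pairs for sweeps arriving as a
    time-homogeneous Poisson process on [0, n_generations)."""
    events = []
    # Waiting times between events of a Poisson process are exponential.
    t = rng.expovariate(rate_per_generation)
    while t < n_generations:
        # Assumption: positions are uniform on a chromosome scaled to [0, 1).
        events.append((t, rng.random()))
        t += rng.expovariate(rate_per_generation)
    return events


rng = random.Random(1)  # fixed seed so that repeated runs agree
events = recurrent_sweeps(rate_per_generation=1e-4, n_generations=100_000, rng=rng)
print(len(events), "sweeps drawn; the expected number is", 1e-4 * 100_000)
```

Because the arrival times are random, consecutive sweeps can fall close together. That matters because the ω-statistic and SweepFinder assume a single, just-completed sweep.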
- overlapping selective sweeps
- the RHH model we have used describes successive and nonoverlapping selective events
- the most extreme scenario, in which beneficial mutations appear at the same site, is described as a "soft" sweep
- soft sweeps may emerge during the evolution of organisms (e.g., Plasmodium) with high mutation rates
- conversely, they may be of limited importance in the evolution of D. melanogaster or Homo sapiens, for instance
- SFS-based approaches may not work under overlapping selective sweeps
- the class of intermediate-frequency polymorphisms may be quite large
- LD-based statistics can be useful
- a multitude of extended haplotypes may exist on the left and right sides of the selected region
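LD-based statistics are built from pairwise associations between SNPs. As a minimal sketch, the Python below computes the standard r² measure of linkage disequilibrium for two biallelic sites coded 0/1 across the same sampled haplotypes. This is only the pairwise building block, not the ω-statistic itself. The function name `r_squared` and the toy vectors are illustrative assumptions.

```python
def r_squared(site_a, site_b):
    """Squared correlation (r^2) between two biallelic sites, each given as
    a list of 0/1 alleles in the same haplotype order."""
    n = len(site_a)
    p_a = sum(site_a) / n
    p_b = sum(site_b) / n
    # Frequency of haplotypes carrying allele 1 at both sites.
    p_ab = sum(a & b for a, b in zip(site_a, site_b)) / n
    d = p_ab - p_a * p_b  # linkage disequilibrium coefficient D
    denom = p_a * (1 - p_a) * p_b * (1 - p_b)
    if denom == 0:
        raise ValueError("both sites must be polymorphic in the sample")
    return d * d / denom


# Toy vectors (made up): perfectly associated sites, then independent sites.
print(r_squared([1, 1, 0, 0], [1, 1, 0, 0]))  # 1.0
print(r_squared([1, 0, 1, 0], [1, 1, 0, 0]))  # 0.0
```

Extended haplotypes on either side of a selected region would show up as high r² among SNP pairs on the same side.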