incomplete sweep

Booker TR, Jackson BC & Keightley PD 2017 Detecting positive selection in the genome. BMC Biol 15:98.

  • Fig. 1d: incomplete/partial sweeps
  • if an advantageous allele increases in frequency, but does not reach fixation, there will still be some loss of linked neutral diversity
  • we use the term incomplete sweeps to describe sweeps that are polymorphic at the time of sampling, but may (or may not) eventually reach fixation
  • the term partial sweep describes the situation wherein a sweeping allele becomes effectively neutral at a certain frequency in its trajectory
  • the magnitude of both processes’ effects on linked neutral diversity depends on the frequency reached by the sweeping allele when selection is ‘turned off’ or on the time of sampling [33]
  • partial sweeps may be common in cases of adaptation involving selection on quantitative traits
  • a notable conclusion from Galtier's study is that average α exceeds 50%, implying that most amino acid substitutions are adaptive in many species
  • primates, notably hominids, are an exception, tending to have lower α, presumably because of their small effective population sizes, leading to the accumulation of slightly deleterious amino acid mutations
  • Sattath et al. [43] [...] found that amino acid substitutions are driven by relatively strongly adaptive mutations
  • s ~ 0.5% and s ~ 0.01%
  • their estimates of the selection strength are therefore in broad agreement with the estimate of s ~ 1% obtained by Macpherson et al. [40]
  • Elyashiv et al. [53] developed a method that fits a model of hard sweeps and background selection to genome-wide variation in nucleotide diversity and divergence
  • for nonsynonymous sites, they found that α = 4.1% for strongly selected mutations (s ≥ 0.03%) and α = 36.3% for weakly selected mutations (s ~ 0.0003%)
  • the strength of selection on the weakly selected class of beneficial mutations in Elyashiv et al.'s study may be too weak
  • assuming Ne = 10^6 for D. melanogaster, Nes ~ 3
  • the fixation probability of a newly arising advantageous mutation is very similar to that of a neutral allele
  • such weak selection in D. melanogaster may not necessarily limit the frequency of hard sweeps
  • adaptation in D. melanogaster may be limited by current census population size rather than long-term Ne
  • the approach of Elyashiv et al. [53] does not incorporate gene conversion, which may have a substantial impact on the effects of sweeps within genes
  • the signatures present in the haplotype structure (for example a skew towards a small number of high frequency haplotypes) generated by positive selection persist for only ~ 0.01 Ne generations, which is an order of magnitude shorter than the persistence time of signatures in the site frequency spectrum
  • haplotype-based tests outperform diversity and site frequency spectrum-based tests at detecting soft sweeps
  • under the soft sweep model, several haplotypes may be carried to high frequency, resulting in characteristic signatures in a population's haplotype structure, while leaving polymorphism less affected
  • soft sweeps arising from multiple de novo mutations require high beneficial mutation rates
  • in the case of soft sweeps from standing variation, even if alleles are segregating at appreciable frequencies in the population before the onset of selection, they may still be more likely to result in a hard sweep than a soft one
  • the signatures of both incomplete and partial selective sweeps left in polymorphism data are less clear than for hard sweeps
  • if polygenic traits are the target of selection, partial sweeps may be common
  • selection can bring about rapid evolution by acting on standing variation at multiple loci, affecting levels of diversity at linked neutral sites
  • a haplotype-based statistic introduced by Field et al. [63] called the singleton density score ('SDS') is able to detect very recent selection, including selection operating on polygenic traits
  • it quantifies the extent to which selection has distorted the genealogy of sampled haplotypes, as measured by the distribution of singleton mutations around ancestral and derived alleles at a focal locus
  • Field et al. provide evidence of selection on multiple polygenic traits, including height, in the ancestors of British people within the last 3000 years
  • recent theoretical work by Jain and Stephan [72] suggests that the allele frequency shifts resulting from polygenic adaptation may be too subtle to be detected using common approaches
  • this depends on the number of loci underlying quantitative traits
  • quantitative traits can respond to selection when loci underlying the trait have Nes < 1
  • biologically grounded simulations using realistic trait architectures and selection regimes are likely necessary to determine how readily polygenic adaptation can be detected using population genomic data
  • hard sweeps produce distinctive patterns of LD
  • this information adds little for detecting hard sweeps when information from diversity and the site frequency spectrum is available
  • it may be useful for distinguishing selection from demographic effects
  • haplotype information is useful, however, when selection is ongoing and/or it does not proceed according to the hard sweep model
  • one drawback of haplotype-based statistics is that they are often descriptive
  • they do not provide a direct means for parameter estimation
  • adaptive evolution is frequent across a variety of species
  • it appears to be driven by strongly selected mutations
  • the application of recently developed tests and models to data from non-model organisms remains a challenge
  • they variously require a population sample of very many individuals, a high quality reference genome and annotations, a genetic map and genome sequences of suitable outgroup species
  • the recent findings of Field et al. [63], Garud et al. [62] and Garud and Petrov [69] all suggest that both partial and soft sweeps may occur frequently
  • a key parameter in the partial sweep model is the frequency that a beneficial mutation reaches before selection is 'switched off'
  • as the critical frequency decreases, the inferred rate of sweeps increases over multiple orders of magnitude [33]
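
To make the distinction between incomplete and partial sweeps in the notes above concrete, here is a minimal Python sketch. It is not taken from Booker et al.: it assumes a simple deterministic model of genic selection with no drift, and the names sweep_trajectory, scaled_selection and switch_off_freq are hypothetical, chosen only for illustration. An incomplete sweep is read off as the frequency at a sampling time before fixation. A partial sweep is modelled by setting the selection coefficient to zero once the allele reaches a chosen critical frequency. The last helper reproduces the Nes ~ 3 arithmetic from the notes (Ne = 10^6, s ~ 0.0003%).

```python
def sweep_trajectory(p0, s, generations, switch_off_freq=None):
    """Deterministic frequency trajectory of a beneficial allele.

    Assumes genic selection and ignores drift (an illustrative simplification).
    If switch_off_freq is given, the allele is treated as neutral once its
    frequency reaches that value, which mimics a partial sweep.
    """
    freqs = [p0]
    p = p0
    for _ in range(generations):
        # Selection is 'switched off' at or above the critical frequency.
        if switch_off_freq is not None and p >= switch_off_freq:
            s_eff = 0.0
        else:
            s_eff = s
        # Standard one-generation update under genic selection.
        p = p * (1.0 + s_eff) / (1.0 + s_eff * p)
        freqs.append(p)
    return freqs


def scaled_selection(ne, s):
    """Return the product Ne * s (the scaled selection coefficient)."""
    return ne * s


if __name__ == "__main__":
    # Sweep that is allowed to run towards fixation.
    full = sweep_trajectory(p0=1e-4, s=0.01, generations=2000)
    # Incomplete sweep: the same trajectory sampled while still polymorphic.
    print("frequency at generation 800:", full[800])
    # Partial sweep: selection stops once the allele reaches 50%.
    partial = sweep_trajectory(p0=1e-4, s=0.01, generations=2000,
                               switch_off_freq=0.5)
    print("final frequency of partial sweep:", partial[-1])
    # Weakly selected class: s ~ 0.0003% expressed as a proportion.
    print("Ne * s:", scaled_selection(1e6, 0.0003 / 100))
```

Because drift is ignored, the partial sweep stays at the frequency it had when selection stopped. A stochastic (e.g. Wright-Fisher) simulation would be needed to study the loss of linked diversity itself.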