2021-04-21

hard selection

Szép E, Sachdeva H & Barton NH 2021 Polygenic local adaptation in metapopulations: a stochastic eco-evolutionary model. Evolution, in press.
doi:10.1111/evo.14210

we assume hard selection
population sizes are stochastic but influenced by mean fitness on the island plus local density-dependent regulation
the size n*_i on island i, after selection and regulation, is a Poisson random variable with mean n_i W_i e^{r_{i,0}(1 − n_i/K_i)}
r_{i,0} is the baseline rate of growth
K_i the baseline carrying capacity
n_i the population size prior to selection
W_i the mean genetic fitness on island i
the n*_i offspring are formed by randomly sampling 2n*_i parents (with replacement) from the n_i individuals in proportion to individual fitness, and then creating offspring via free recombination of each pair of parent genotypes
selection is density independent
relative fitness of genotypes is independent of population size
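
sketch (mine, not code from the paper) of one generation on a single island under the scheme above. Assumptions: individuals are haploid tuples of 0/1 alleles, `fitness` is any user-supplied function returning a non-negative number, and the helper names `poisson` and `next_generation` are hypothetical.

```python
import math
import random


def poisson(mean, rng):
    """Poisson draw by Knuth's method, in chunks so exp(-chunk) never underflows."""
    total, remaining = 0, float(mean)
    while remaining > 0:
        chunk = min(remaining, 30.0)
        threshold = math.exp(-chunk)
        prod = rng.random()
        while prod > threshold:
            total += 1
            prod *= rng.random()
        remaining -= chunk
    return total


def next_generation(pop, fitness, r0, K, rng):
    """One island, one generation; pop is a list of haploid 0/1 tuples (assumed encoding)."""
    n = len(pop)
    if n == 0:
        return []
    w = [fitness(g) for g in pop]
    mean_w = sum(w) / n
    # hard selection: expected size scales with mean fitness and density regulation
    n_star = poisson(n * mean_w * math.exp(r0 * (1 - n / K)), rng)
    if n_star == 0:
        return []
    # 2 n* parents, sampled with replacement in proportion to individual fitness
    parents = rng.choices(pop, weights=w, k=2 * n_star)
    # free recombination: each locus comes from either parent with probability 1/2
    return [tuple(rng.choice(pair) for pair in zip(a, b))
            for a, b in zip(parents[0::2], parents[1::2])]
```

because the offspring number depends on mean fitness, a maladapted island shrinks even though the relative fitnesses used in parent sampling do not depend on n_i.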

2020-08-18

polygenic adaptation

Stephan W & John S 2020 Polygenic adaptation in a population of finite size. Entropy 22:907.

in recent years, due to the advance of genome-wide association studies (GWAS), polygenic selection was studied using quantitative genetics models that are formulated in terms of allele frequency changes in a large number of loci across the whole genome
selection acts on a phenotypic trait
a genotype-phenotype map is assumed to bridge the gap to population genetics
polygenic adaptation driven by a large number of weakly selected loci is not nearly as well studied as the case of strong positive selection leading to selective sweeps
reviews by Pritchard et al. (2010) [17] and Pritchard and Di Rienzo (2010) [18] drew the attention of population geneticists to this type of selection
these papers predicted that allele frequencies change by small amounts when a large number of genetic loci of minor effect sizes control a phenotypic trait
it is not obvious whether polygenic adaptation can be as fast as suggested by the increasing number of cases reported in the recent literature
De Vladar and Barton (2014) [29] and Jain and Stephan (2015) [30] used a deterministic model to analyze the dynamics of adaptation after a sudden environmental shift of the fitness optimum of a phenotypic trait in the absence of genetic drift
the equilibrium allele frequencies of the deterministic model do not agree with the frequencies typically observed in GWAS
Simons et al. (2018) [35] proposed a model of selection that simultaneously acts on multiple traits (pleiotropy)
Stetter et al. (2018) [36] and Thornton (2019) [37] used extensive forward simulations to analyze a model (though with relatively few selected loci) that also includes neutral loci linked to selected ones
the model of Höllinger et al. (2019) [33] is different from ours in that the loci controlling a trait are not explicitly given, but instead a genome-wide mutation rate is used as a proxy
we first review the work of John and Stephan (2020) [34] in which we described a stochastic treatment of the equilibrium phase before the shift of the fitness optimum

Equation (19) predicts that after an environmental change the allele frequencies shift coherently in the same direction
this is an important property of polygenic selection
it may help to detect this type of selection, although the frequency shifts at individual loci are in general small
including genetic drift, however, leads to a more complex picture of polygenic adaptation
we find a good agreement between Equation (13) and the simulation for the deviation Δc₁ of the population mean from the optimum within the short-term phase
for the allele frequencies, however, we get reasonable agreement between the deterministic prediction of Equation (19) and simulations only when the effect sizes are sufficiently large and allele frequencies at the time of the environmental shift are intermediate
the reason is that genetic drift slows down the increase of the allele frequencies and hence reduces the expected differences between the allele frequencies at the end of the short-term phase and those at t = 0
as a consequence, while trait-increasing alleles with intermediately high equilibrium frequencies contribute positively to changes of the trait mean (i.e., are aligned with the direction of the optimum shift), alleles with low or high frequencies may not stay aligned with the optimum shift

alleles with very low or high frequencies are subject to stronger drift and thus may not stay aligned with the direction of the optimum shift

do selective sweeps occur in polygenic adaptation?
selective sweeps may arise in restricted parameter ranges, but only when most alleles have large effects
strong selective fixations have been observed in simulations in the initial rapid phase
fixations driven by relatively weak selection may occur in the prolonged equilibration period, but these would not lead to sweeps
in highly polygenic models large-effect alleles almost never sweep to fixation, while alleles of moderate effects may go to fixation
in general, in quantitative genetics models, selective sweeps are rare
this does not contradict Thornton’s (2019) [37] observation of sweeps in cases in which the trait is not highly polygenic
this was also found by Jain and Stephan (2017a) [31] when the number of loci controlling a trait was not large

2020-08-17

polygenic adaptation

John S & Stephan W 2020 Important role of genetic drift in rapid polygenic adaptation. Ecol Evol 10:1278-1287.

it is not clear if adaptation can occur rapidly via such subtle changes in the allele frequencies

we follow this direction here to understand the evolutionary dynamics of quantitative traits from the standpoint of population genetics

we have found two distinctly different modes of rapid adaptation:
(a) through strong directional selection at a few loci when the effect sizes of the alleles at these loci are large relative to a scaled mutation rate
(b) through weak selection at many individual loci (with small effect sizes) leading to subtle allele frequency shifts in the case of polygenic adaptation
we examine to what extent these deterministic results may be generalized to populations of finite size

dp_i/dt = −s γ_i p_i q_i Δc₁ − (s γ_i²/2) p_i q_i (q_i − p_i) − μ p_i + ν q_i,  i = 1, ..., l    (5)
we calculate the allele frequency changes in each locus independently based on the effect size and the allele frequency of that locus
we do binomial sampling with mutation based on allele frequency p_i(t)
we apply selection by drawing a random number from a binomial distribution whose mean is the modulus of the sum of the two selection terms in Equation (5)
this random number is added to or subtracted from the + allele frequency obtained by stochastic sampling (dependent on the sign of the sum of the selection and mutation terms in Equation (5)) to obtain the + allele frequency at locus i in the next generation
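
sketch of the deterministic part only: an Euler integration of Equation (5), leaving out the binomial sampling steps described above. Assumption not stated in these notes: the trait mean is c₁ = Σ γ_i (2p_i − 1), so Δc₁ = c₁ − z_opt. The function names, `z_opt`, and `dt` are hypothetical.

```python
def delta_c1(p, gamma, z_opt):
    """Deviation of the trait mean from the optimum, assuming c1 = sum_i gamma_i (2 p_i - 1)."""
    return sum(g * (2 * x - 1) for g, x in zip(gamma, p)) - z_opt


def euler_step(p, gamma, s, mu, nu, z_opt, dt=1.0):
    """Advance all allele frequencies by one Euler step of Equation (5)."""
    d = delta_c1(p, gamma, z_opt)
    new_p = []
    for g, x in zip(gamma, p):
        q = 1 - x
        dp = (-s * g * x * q * d                       # directional selection toward the optimum
              - 0.5 * s * g ** 2 * x * q * (q - x)     # second selection term of Equation (5)
              - mu * x + nu * q)                       # mutation in both directions
        new_p.append(min(1.0, max(0.0, x + dt * dp)))  # keep frequencies in [0, 1]
    return new_p


# usage: 50 equal-effect loci at p = 0.5, optimum shifted to z_opt = 1
gamma = [0.1] * 50
p = [0.5] * 50
for _ in range(200):
    p = euler_step(p, gamma, s=0.1, mu=1e-4, nu=1e-4, z_opt=1.0)
```

in this toy run every + allele moves upward by a modest amount while the trait mean closes most of the gap to the new optimum.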

in the deterministic system (polygenic case) the trait mean may change much faster after a perturbation than the allele frequencies
after the system is pushed away from the stationary state the trait mean may quickly respond, while the allele frequencies reach the stationary state only very slowly

Δc₁ is a fast variable on the time scale of the allele frequencies p_i
Δc₁ approaches its equilibrium value [...] quickly
the allele frequencies need much longer to reach equilibrium
we obtain [the equilibrium value of Δc₁] by setting the left‐hand side of Equation (6) to zero
we may neglect the skewness term as we focus on loci with small effect sizes γ_i and c₃ is proportional to γ_i³

for large mutation rates, the stationary variance converges to lγ²
this result was also obtained for the deterministic model, for which the equilibrium allele frequencies are 0.5
for small mutation rates, such that 4β≪1, the stationary genetic variance approaches 4βlγ², a value that is much smaller than lγ²
this has important consequences for the speed of polygenic adaptation

theories with very different assumptions about mutation ([...]), all predict that the stationary distribution of the mean deviation from the optimum should have variance 1/(2Ns)
this is a quite generic property of stochastic processes best known for the Ornstein–Uhlenbeck process
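
a sketch of why the value 1/(2Ns) does not depend on the mutation model. Assumptions of mine, not taken from the paper: the deviation x of the trait mean from the optimum follows an Ornstein–Uhlenbeck process whose restoring rate is sV and whose drift variance per generation is V/N, with V the genetic variance.

```latex
\begin{align*}
dx &= -\theta\,x\,dt + \sigma\,dW_t,
  \qquad \operatorname{Var}(x_\infty) = \frac{\sigma^2}{2\theta},\\
\theta &= sV,\quad \sigma^2 = \frac{V}{N}
  \;\Longrightarrow\;
  \operatorname{Var}(x_\infty) = \frac{V/N}{2sV} = \frac{1}{2Ns}.
\end{align*}
```

the genetic variance V cancels, so under these assumptions whatever determines V (including the mutation model) drops out of the stationary variance.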

the allele frequency shift at a locus depends strongly on the compound parameter γ_ip_i(0)q_i(0)
it increases with the effect size and is greatest for initial frequencies around 0.5
after an environmental change the allele frequencies are expected to shift coherently in the same direction
this appears to be an important property of polygenic selection because it may help to detect this type of selection

2020-08-16

polygenic adaptation

Mathieson I 2020 Human adaptation over the past 40,000 years. Curr Opin Genet Dev 62:97-104.

despite strong evidence of the polygenicity of most human traits, evidence for polygenic selection is weak
its importance in recent human evolution remains unclear

the GWAS on which previous analyses had relied had not fully corrected for the effect of population stratification, leading to overestimation of the effect of selection
how can we trust evidence for other traits, which surely suffer from similar problems?
in 2020, the question of the contribution of polygenic adaptation to human evolution is largely back to where it was in 2010

given that we expect polygenic selection to be common, why is it so hard to find?
widespread pleiotropy might mean that adaptation is driven by shifts in the frequency of a relatively small proportion of loci
adaptation on polygenic traits may be more oligogenic than polygenic

2020-08-05

genealogy

Wakeley J 2020 Developments in coalescent theory from single loci to chromosomes. Theor Popul Biol 133:56-64.

it took some time to understand the temporal structure behind Ewens' formula
working at first outside population genetics, Kingman (1975) introduced the Poisson-Dirichlet distribution, which also applies to allele frequencies in a population under infinite-alleles mutation when the frequencies are ordered largest to smallest

just prior to the introduction of coalescent theory, a closely related forward-time theory of lines of descent was developed by Griffiths (1980)

the comprehensive synthetic work of Tavaré (1984) placed the theories of coalescence, lines of descent and ages of alleles within a single framework
properties of the ancestral process of gene genealogies are obtained from allelic models in the limit as θ tends to zero
this highlighted the earlier work of Felsenstein (1971) which established a recursive equation for sampling probabilities of numbers of alleles at two different time points in the absence of mutation
Felsenstein (1971) considered the probability that i alleles present in the population now will all still be present at some future time
Kimura (1955) had shown previously using diffusion theory that the rate of decay of this probability is equal to i(i − 1)/2 on the diffusion time scale
Felsenstein (1971) showed that a genealogical approach based on G gives the same answer
i(i − 1)/2 is also the rate of decay of the probability that i alleles are present in a sample of size i

following Kimura (1955), Felsenstein (1971) considered these results in relation to the rate of loss of i alleles at some distant future time
it is remarkable how close this came to the backward-time coalescent process without mutation, in which i(i − 1)/2 is the total rate of coalescence when there are i ancestral lineages
Felsenstein (1971) did not consider what would now be called the branching structure of the gene genealogy

Watterson (1975) was the first to present gene genealogies and their backward-time construction, through a series of n − 1 independent intervals and with the familiar random scattering of neutral mutations on the branches of the gene genealogy, in a way that unambiguously captures our modern notion of coalescent theory
key aspects of the theory which are missing in Watterson (1975) compared to Kingman (1982a,b,c) are the description of the detailed relationships among n labeled samples, that is the state space of gene genealogies, and the proof of convergence to the coalescent process
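
small illustration of the n − 1 intervals and the rate i(i − 1)/2 mentioned above, in time units where a single pair of lineages coalesces at rate 1. The function names are hypothetical; the closed form 2(1 − 1/n) for the expected time to the most recent common ancestor is the standard result for the neutral coalescent.

```python
def coalescence_rate(i):
    """Total rate of coalescence while there are i ancestral lineages."""
    return i * (i - 1) / 2


def expected_interval_lengths(n):
    """Expected lengths of the n - 1 intervals, from i = n lineages down to i = 2."""
    return [1 / coalescence_rate(i) for i in range(n, 1, -1)]


def expected_tmrca(n):
    """Expected time to the most recent common ancestor; equals 2 (1 - 1/n)."""
    return sum(expected_interval_lengths(n))
```

the last interval (two lineages) has expected length 1, so it accounts for more than half of the expected tree height for any sample size.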

2020-08-04

multiple loci

Bürger R 2020 Multilocus population-genetic theory. Theor Popul Biol 133:40-48.

commentary
even in linkage equilibrium (LE), allele frequencies at different loci do not change independently
the direction and intensity of selection on a particular locus generally depend on the allele-frequency distribution of its genetic background

I focus on deterministic multilocus models
the first two-locus model with selection and explicit recombination was designed by Kimura (1956), who analyzed the conjecture by Fisher (1930) that selection can lead to tighter linkage
the general two-locus two-allele (TLTA) model for the dynamics under selection and recombination was derived by Lewontin and Kojima (1960)
the most important ingredients, such as linkage disequilibrium (LD), LE, and epistasis, are also defined for the TLTA case

in the absence of selection, global convergence to LE was proved much earlier by Geiringer (1944)
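
two-locus illustration of convergence to LE without selection (a special case only; Geiringer's result is for the general multilocus setting). Assumptions: random mating, discrete generations, haplotype frequencies ordered as (AB, Ab, aB, ab); the function names are hypothetical. Under these assumptions D shrinks by a factor (1 − r) each generation.

```python
def ld(x):
    """Linkage disequilibrium D = x_AB * x_ab - x_Ab * x_aB."""
    x_AB, x_Ab, x_aB, x_ab = x
    return x_AB * x_ab - x_Ab * x_aB


def recombine(x, r):
    """One generation of random mating with recombination rate r and no selection."""
    x_AB, x_Ab, x_aB, x_ab = x
    D = ld(x)
    # recombination moves mass r * D from coupling to repulsion haplotypes
    return (x_AB - r * D, x_Ab + r * D, x_aB + r * D, x_ab - r * D)


# usage: start in complete LD with freely recombining loci
x = (0.5, 0.0, 0.0, 0.5)
for _ in range(10):
    x = recombine(x, r=0.5)
```

allele frequencies stay fixed throughout; only the association between loci decays.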

under the assumption of LE, i.e., D(t) ≡ 0, which yields the gradient-like selection dynamics for two independent loci, coupled only by mean fitness, one equilibrium configuration (in the sense of topological equivalence) may admit several topologically nonequivalent flows

an important dynamical property of single-locus systems is that mean fitness is nondecreasing along trajectories and constant only at equilibrium
this is a special case of the classical interpretation of Fisher's (1930) Fundamental Theorem of Natural Selection
for multilocus systems, mean fitness increases if loci contribute additively to fitness
with epistasis mean fitness may decrease
for weak selection or weak epistasis, LD decays rapidly to small values as time proceeds
once this so-called state of quasi-linkage equilibrium has been reached, mean fitness may increase for a long time
all linkage disequilibria decay rapidly to order s
their change per generation is only O(s²)
the per-generation change ΔW of mean fitness is, to leading order in s, W⁻¹V_g
V_g is the additive genetic variance in fitness
Fisher strongly opposed the interpretation of his Fundamental Theorem of Natural Selection (Fisher, 1930) that selection acts to maximize mean fitness
an interpretation that has been prevalent for a long time
Ewens and Lessard (2015) discuss interpretations of the Fundamental Theorem that indeed are very general (by admitting multiple loci, epistasis, strong selection, and also nonrandom mating), but that concern certain partial changes in mean fitness
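
numerical check of the simplest case behind the statement that mean fitness is nondecreasing. Assumption: a single haploid locus with two alleles and constant fitnesses w1 and w2 (not the multilocus quasi-linkage-equilibrium setting of the commentary). In that case the one-generation change in mean fitness equals Var(w)/W exactly. The function name is hypothetical.

```python
def mean_fitness_change(p, w1, w2):
    """Return (actual change in mean fitness, Var(w) / mean fitness) for a haploid locus."""
    q = 1 - p
    w_bar = p * w1 + q * w2
    var_w = p * (w1 - w_bar) ** 2 + q * (w2 - w_bar) ** 2
    p_next = p * w1 / w_bar                      # selection recursion for allele 1
    w_bar_next = p_next * w1 + (1 - p_next) * w2
    return w_bar_next - w_bar, var_w / w_bar
```

both returned values agree up to rounding, and the change is zero only when the variance in fitness is zero.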

Turelli (1984) refuted Lande's (1975) claim that multilocus mutation-selection balance may provide a universal explanation for a large fraction of the empirically observed levels of quantitative genetic variation
this initiated a controversy about the relevance of mutation-selection balance as an explanation for genetic variation in quantitative traits

my personal entry into population genetics was through Feldman and Karlin (1971)
the first population genetics paper I studied

among the most interesting phenomena, both for empiricists and for theoreticians, are clines in phenotype or genotype frequencies
two- or multilocus studies, initiated by Slatkin (1975) for spatially varying selection and by Barton (1983) for hybrid zones, are quite rare, presumably because of the severe technical difficulties that arise
the multilocus work on clines has been applied successfully to advance our understanding of natural hybrid zones and the underlying genomics

a proper treatment of these topics as well as of the adaptation of quantitative traits, including the detection of the resulting polygenic footprints, must ultimately be based on stochastic multilocus models
Höllinger et al. (2019) derived the stochastic patterns of allele-frequency change at individual loci in a finite population
this work initiates a synthesis between the population-genetic and the quantitative-genetic views of adaptation
multilocus models including the effects of drift, especially on neutral markers, are increasingly often combined with genomic data

2020-08-01

evolutionary rescue

Osmond MM, Otto SP & Martin G 2020 Genetic paths to evolutionary rescue and the distribution of fitness effects along them. Genetics 214:493-510.

in multi-step rescue, intermediate genotypes that themselves go extinct provide a "springboard" to rescue genotypes
our approach allows us to quantify how a race between evolution and extinction leads to a genetic basis of adaptation that is composed of fewer loci of larger effect
we hope this work brings awareness to the impact of demography on the genetic basis of adaptation

a recurrent observation, especially in experimental evolution with asexual microbes, is that the more novel the environment and the stronger the selection pressure, the more likely it is that adaptation primarily proceeds by fewer mutations of larger effect
i.e., that adaptation is oligogenic sensu Bell 2009
an extreme case is the evolution of drug resistance, which is often achieved by just one or two mutations

we therefore largely lack a theoretical framework for the genetic basis of evolutionary rescue that captures the arguably more realistic situation where an intermediate number of mutations are at play
the existence of a more complete framework could therefore provide valuable information for those investigating the genetic basis of drug resistance (e.g., the expected number and effect sizes of mutations) and would extend our understanding of the genetic basis of adaptation to cases of nonequilibrial demography (i.e., rapid evolution and "eco-evo" dynamics)

resistance often appears to arise by a single mutation
but not always
the fitness effect of rescue genotypes is more often large than small, creating a hump-shaped distribution of selection coefficients

we use Fisher's geometric model to describe adaptation following an abrupt environmental change that instigates population decline
here (1) the dynamics of each genotype depends on its absolute fitness (instead of only on its relative fitness)
(2) multiple mutations can segregate simultaneously (instead of assuming only sequential fixation), allowing multiple mutations to fix—and in our case, rescue—the population together as a single haplotype
i.e., stochastic tunnelling, Iwasa et al. 2004b
variation in absolute fitness, which allows population size to vary, can create feedbacks between demography and evolution
we also explore the possibility of rescue by mutant haplotypes containing more than one mutation
we ask:
(1) how many mutational steps is evolutionary rescue likely to take?
(2) what is the expected distribution of fitness effects of the surviving genotypes and their component mutations?

we ignore environmental effects
the phenotype is the breeding value
we use the isotropic version of Fisher's geometric model
mutations (in addition to selection) are assumed to be uncorrelated across the scaled traits
each mutation affects all scaled phenotypes
we use the "classic" form of Fisher's geometric model (Harmand et al. 2017)
the probability density function of a mutant phenotype is multivariate normal, centered on the current phenotype, with variance λ in each dimension and no covariance
using a probability density function of mutant phenotypes implies a continuum-of-alleles (Kimura 1965)
phenotype is continuous and each mutation is unique
mutations are assumed to be additive in phenotype, which induces epistasis in fitness (as well as dominance under diploid selection), as fitness is a nonlinear function of phenotype
we assume asexual reproduction, i.e., no recombination, which is appropriate for many cases of antimicrobial drug resistance and experimental evolution, while recognizing the value of expanding this work to sexual populations
whether anisotropy can be reduced to isotropy with fewer dimensions in the case of evolutionary rescue, where the tails are essential, is unknown
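
sketch of how additivity in phenotype induces epistasis in fitness. Assumptions of mine: the optimum sits at the origin and fitness on the log scale is quadratic in the distance to it; the notes do not give the fitness function. All function names and the parameter `m_max` are hypothetical.

```python
import random


def log_fitness(z, m_max=0.0):
    """Assumed quadratic log-fitness around an optimum at the origin."""
    return m_max - 0.5 * sum(x * x for x in z)


def mutate(z, lam, rng):
    """Mutant phenotype: normal around z with variance lam per dimension, no covariance."""
    sd = lam ** 0.5
    return tuple(x + rng.gauss(0.0, sd) for x in z)


def _add(a, b):
    return tuple(x + y for x, y in zip(a, b))


def epistasis(z, d1, d2):
    """Effect of the double mutant minus the sum of the two single-mutant effects."""
    base = log_fitness(z)
    e1 = log_fitness(_add(z, d1)) - base
    e2 = log_fitness(_add(z, d2)) - base
    e12 = log_fitness(_add(_add(z, d1), d2)) - base
    return e12 - e1 - e2
```

with this quadratic form the epistasis equals minus the dot product of the two phenotypic displacements, so two mutations pointing the same way toward the optimum combine to less than the sum of their individual effects.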

when the mutation rate, U, is substantially less than a critical value, U_C = λn²/4, we are in a "strong selection, weak mutation" regime
essentially all mutations arise on a wild-type background (Martin and Roques 2016), consistent with the House of Cards approximation (Turelli 1984, 1985)
in this regime, rescue tends to occur by a single mutation of large effect
when U ≫ U_C, we are in a "weak selection, strong mutation" regime
many cosegregating mutations are present within each genome, creating a multivariate normal phenotypic distribution (Martin and Roques 2016), consistent with the Gaussian approximation (Kimura 1965; Lande 1980)
in this regime, rescue tends to occur by many mutations of small effect
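
small helper tying the two regimes to U_C = λn²/4. Assumptions: n is read here as the number of trait dimensions in Fisher's geometric model, and `factor` is a hypothetical cutoff standing in for "substantially less" and "≫", which the notes do not quantify.

```python
def critical_mutation_rate(lam, n):
    """U_C = lam * n**2 / 4, with n taken as the number of trait dimensions (assumption)."""
    return lam * n ** 2 / 4


def regime(U, lam, n, factor=10.0):
    """Crude regime label; `factor` is an arbitrary illustrative threshold."""
    U_c = critical_mutation_rate(lam, n)
    if U * factor <= U_c:
        return "strong selection, weak mutation"
    if U >= factor * U_c:
        return "weak selection, strong mutation"
    return "intermediate"
```

the intermediate band is where neither the House of Cards nor the Gaussian approximation is expected to hold cleanly.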

our prediction, that rescue by more de novo mutations can be more likely than rescue by fewer, is novel
the general conclusion has been that, since the probability of rescue scales with U^k (where U is the mutation rate and k is the minimum number of mutations required for rescue), the probability of rescue declines with the number of mutations
when the probability of a beneficial mutation arising declines with its selective advantage, the probability of sampling once from the extreme tail of the DFE can be lower than sampling multiple mutations closer to the bulk of the DFE
rescue via multiple mutations can become the dominant path
rescue by multiple mutations may also be more likely with standing genetic variation, as small-effect intermediate mutations may segregate at higher frequencies than large-effect rescue mutations before the environmental change
this is especially true with recombination, where rescue genotypes can arise from segregating intermediate mutations without mutation (Uecker and Hermisson 2016)

we have investigated the genetic basis of evolutionary rescue in an asexual population that is initially genetically uniform
extending this work to allow for recombination and standing genetic variation at the time of environmental change—as expected for many natural populations—would be valuable
the effect of standing genetic variance on one-step rescue might be incorporated by a simple rescaling of N₀, to account for the additional mutants present in the standing variation
allowing these standing genetic variants to be springboards to multi-step rescue will help clarify the role of standing genetic variation on the genetic basis of rescue more generally
recombination can help combine such springboard mutations into rescue genotypes but will also break these combinations apart, as demonstrated in a two-locus two-allele model of rescue (Uecker and Hermisson 2016)
also left unexplored is the effect of density-dependent fitness
combining density-dependence and standing genetic variance is known to create complex dynamics in a one-locus two-allele model of rescue (Uecker et al. 2014)