population expansion

Coventry A, Bull-Otterson LM, Liu X, Clark AG, Maxwell TJ, Crosby J, Hixson JE, Rea TJ, Muzny DM, Lewis LR, Wheeler DA, Sabo A, Lusk C, Weiss KG, Akbar H, Cree A, Hawes AC, Newsham I, Varghese RT, Villasana D, Gross S, Joshi V, Santibanez J, Morgan M, Chang K, Hale W IV, Templeton AR, Boerwinkle E, Gibbs R & Sing CF 2010 Deep resequencing reveals excess rare recent variants consistent with explosive population growth. Nat Commun 1:131.

  • an inherent problem with such a deep resequencing effort is distinguishing actual rare genetic variants from stochastic sequencing errors, which will occur at almost every site if enough individuals are resequenced
  • for large-scale studies, there are so many rare single-nucleotide polymorphism (SNP) calls that regenotyping all of them becomes cost prohibitive
  • to catalogue rare variants in a thorough and cost-effective manner, in this study we assign probabilities to genotype calls, explicitly estimating our uncertainty for each call
  • we selected genes KCNJ11 and HHEX for resequencing in 13,715 individuals
  • such a large sample made some unique population-genetic calculations possible, such as a model of the growth rate of the European population over the last few thousand years
  • by estimating the distribution of times at which a variant of a given contemporary frequency might have plausibly arisen in the ancestral population, we have been able to compare our growth-rate estimate with earlier estimates
  • we were also able to separately estimate mutation rate and demographic parameters, which are normally confounded in equilibrium population genetics
  • we fit a model of exponential growth to the SFS (see Fig. 3) of our European-American sample
  • the excess of rare variants in HHEX and KCNJ11 fits well with this model (Fig. 3), giving a mean posterior growth rate of 1.094 (that is, an increase of 9.4%) per generation
  • the variance in this estimate is high
  • our growth-rate estimate gives a clear genetic signal that over the last few millennia, the rate of population expansion has accelerated substantially
  • previous genomic studies of human population samples have been based on either resequencing a small group of individuals or on HapMap SNPs ascertained with a bias towards common variation
  • these have only captured the distribution of common variants (relative minor allele frequency ~0.05)
  • most of the variants in this part of the frequency spectrum arose about 100–3,000 generations ago, or about 2,500–75,000 years ago
  • in the exceptionally large sample resequenced here, singletons correspond to mutations that arose during the last ~100 generations (Fig. 4b), and thus carry information about the demographics of Europe after its widespread adoption of agriculture
  • earlier studies (ref. 10) also found a good fit to an exponential growth model, but with the substantially lower modal growth rate of 1.004 per generation
  • our posterior distribution implies that the growth rate is bounded below by 1.015
  • Europe's population growth rate accelerated substantially over the last 2,000 years
  • the growth rate in Europe since 1600 has been ~11.5% per generation
  • in future, even deeper resequencing efforts will reveal an SFS with even greater proportions of rare and missense variants with potential consequences for human health
  • a simple calculation assuming a conservative mutation rate of 1 × 10^-9 still implies that the human genome of ~3 × 10^9 sites is saturated with mutations arising just in the current human generation of 6.7 × 10^9 people
  • this supports the concerns raised by Lynch (ref. 21) regarding burgeoning human 'mutational load' and bears on the 'missing heritability' still unexplained by genome-wide association studies
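
The saturation claim above can be checked with a few lines of arithmetic. The sketch below is only a back-of-the-envelope illustration: the notes do not state the units of the mutation rate, so it is assumed here to be per site per generation, and each person is treated as carrying a single copy of the ~3 × 10^9-site genome. The variable names are illustrative.

```python
# Back-of-the-envelope check of the "saturated with mutations" claim.
# ASSUMPTION: the mutation rate is per site per generation, and each
# person is counted as one copy of the genome (not stated in the notes).
mutation_rate = 1e-9   # conservative rate quoted in the notes
genome_sites = 3e9     # approximate number of sites in the human genome
population = 6.7e9     # size of the current human generation

# Expected new mutations carried by one person.
new_mutations_per_person = mutation_rate * genome_sites

# Expected new mutations across the whole generation.
total_new_mutations = new_mutations_per_person * population

# Expected number of new mutations landing on any given site.
# Algebraically this is just mutation_rate * population.
new_mutations_per_site = total_new_mutations / genome_sites

print(f"per person: {new_mutations_per_person:.2f}")
print(f"whole generation: {total_new_mutations:.3g}")
print(f"per site: {new_mutations_per_site:.2f}")
```

Under these assumptions the expected count per site is greater than one, which is the sense in which the genome is "saturated": on average every site is hit by at least one new mutation somewhere in the current generation.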
  • methods
  • population genetics calculations
  • to minimize the complications of admixture, we restricted this analysis to the European-American sample
  • our model thus has three parameters:
  • the mutation rate μ
  • the estimated population size at the start of ARIC, N
  • the growth rate during the exponential phase r
  • we did a grid search over these parameters
  • because we were comparing with the growth-rate estimate in Gutenkunst et al. (ref. 10) of 1.004 per generation, we also separately computed the following likelihoods for r in the vicinity of that, and found them to be negligible
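
The grid search over (μ, N, r) described above might look roughly like the following sketch. It is not the authors' code: the notes do not give their likelihood function or grid values, so `toy_log_likelihood` is a made-up placeholder with an arbitrary peak, and the grids are hypothetical. Only the search-and-summarize pattern is the point: evaluate every grid point, take the best one, and compute a posterior mean for r under an assumed flat prior over the grid.

```python
import itertools
import math


def grid_search(log_likelihood, mu_grid, n_grid, r_grid):
    """Evaluate log_likelihood at every (mu, n, r) grid point.

    Returns the best grid point and the posterior mean of r,
    assuming a flat prior over the grid points.
    """
    points = list(itertools.product(mu_grid, n_grid, r_grid))
    logliks = [log_likelihood(mu, n, r) for mu, n, r in points]
    best_ll = max(logliks)
    best_point = points[logliks.index(best_ll)]
    # Subtract the maximum before exponentiating for numerical stability.
    weights = [math.exp(ll - best_ll) for ll in logliks]
    total = sum(weights)
    mean_r = sum(w * p[2] for w, p in zip(weights, points)) / total
    return best_point, mean_r


def toy_log_likelihood(mu, n, r):
    # PLACEHOLDER ONLY: an arbitrary smooth bump, not an SFS likelihood.
    # Its peak (mu=1e-8, n=1e6, r=1.05) is invented for illustration.
    return (-(math.log10(mu) + 8.0) ** 2
            - (math.log10(n) - 6.0) ** 2
            - ((r - 1.05) / 0.02) ** 2)


# Hypothetical grids; the notes do not list the values actually searched.
mu_grid = [1e-9, 1e-8, 1e-7]
n_grid = [1e5, 1e6, 1e7]
r_grid = [1.00, 1.025, 1.05, 1.075, 1.10]

best, mean_r = grid_search(toy_log_likelihood, mu_grid, n_grid, r_grid)
print("best grid point:", best)
print("posterior mean of r:", round(mean_r, 4))
```

In a real analysis the placeholder would be replaced by the likelihood of the observed SFS under the demographic model, and extra grid points could be added near a value of interest (such as r close to 1.004) to confirm that the likelihood there is negligible.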
  • supplementary methods
  • we assumed the ancient demographic parameters estimated for the European lineage by Schaffner et al. (ref. 36), followed by exponential growth
  • the specific Schaffner parameters we used were an initial ancestral effective population size of 12,500, followed 17,000 generations ago by growth in Africa to an effective population size of 24,000, then migration to Europe 3,500 generations ago, for an effective population size of 7,700
  • the following estimates are not very sensitive to uncertainties in our ancient demographic assumptions
  • we assumed that at some point the population began to grow exponentially until the time of the ARIC study's inception
  • we calculated the onset of the exponential growth phase by working backwards from the assumed contemporary population size and growth rate to the Schaffner size of 7,700
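
Working backwards to the onset of exponential growth amounts to solving 7,700 × r^g = N for the number of generations g, that is, g = ln(N / 7,700) / ln(r). The sketch below illustrates this under stated assumptions: the function name is invented, the contemporary size passed in is a hypothetical placeholder (the notes do not report the fitted N), and 1.094 is the mean posterior growth rate quoted earlier in these notes.

```python
import math

SCHAFFNER_EUROPEAN_SIZE = 7700  # effective size quoted in the notes


def generations_of_growth(current_size, growth_rate,
                          starting_size=SCHAFFNER_EUROPEAN_SIZE):
    """Generations g such that starting_size * growth_rate**g == current_size."""
    if growth_rate <= 1.0:
        raise ValueError("growth_rate must exceed 1 for a growing population")
    if current_size <= starting_size:
        raise ValueError("current_size must exceed starting_size")
    return math.log(current_size / starting_size) / math.log(growth_rate)


# HYPOTHETICAL contemporary size, chosen only to exercise the function.
hypothetical_n = 1_000_000
g = generations_of_growth(hypothetical_n, 1.094)
print(f"growth began about {g:.1f} generations before sampling")
```

Because g depends on N only through a logarithm, even large changes in the assumed contemporary size shift the inferred onset by relatively few generations.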