population expansion

Nelson MR, Wegmann D, Ehm MG, Kessner D, St Jean P, Verzilli C, Shen J, Tang Z, Bacanu S-A, Fraser D, Warren L, Aponte J, Zawistowski M, Liu X, Zhang H, Zhang Y, Li J, Li Y, Li L, Woollard P, Topp S, Hall MD, Nangle K, Wang J, Abecasis G, Cardon LR, Zöllner S, Whittaker JC, Chissoe SL, Novembre J & Mooser V 2012 An abundance of rare functional variants in 202 drug target genes sequenced in 14,002 people. Science 337:100-104.

  • because of rapid population growth and weak purifying selection, human populations harbor an abundance of rare variants, many of which are deleterious and have relevance to understanding disease risk
  • these patterns are at odds with notions that human genetic diversity can be summarized by use of an effective population size (Ne) of 10,000 individuals
  • an Ne of 10,000 individuals is predictive of the average pairwise differences between human sequences (Table 1, θπ) and is reflective of our emergence from a small population in Africa
  • the excess of rare variants observed here (θW >> θπ) is a signature of the rapid growth and large population sizes that typify more recent human demographic history
  • we fit a demographic model to the fourfold degenerate synonymous (S) variants in Europeans
  • we obtained a maximum-likelihood estimate for a recent growth rate of 1.7% [95% confidence interval (CI) = 1.2 to 2.3%] and a recent European effective population size of 4.0 million (95% CI = 2.5 million to 5.0 million)
  • the genes in this study are under stronger purifying selection, which is consistent with their choice as drug targets and importance to human health
  • our results cannot be simply extrapolated to the whole exome
  • our inference of demographic parameters and mutation rates ignores the effects of background selection on synonymous variants
  • supplementary materials
  • demographic history and mutation rate inference
  • we followed the basic approach of Coventry et al. (8)
  • this approach extends the demographic model of Schaffner et al. (69) to include a period of exponential growth in European population size that is parameterized by the current effective size of Europeans, N; the recent growth rate in the European population, r; and gene-specific mutation rates, μ
  • the European expansion time is determined by solving for the time at which the ancestral European population of size 7,700 (from the Schaffner model) would need to start growing at rate r to reach a current size of N
  • the posterior mean estimates were dependent on the choice of parameter grid points
  • the Coventry method assigns a uniform prior over the parameter grid points, which will give different posterior distributions when, for example, the grid range is extended, or when the grid points for a parameter are placed on a logarithmic scale
  • we chose to follow a strict maximum likelihood approach (retaining the Monte Carlo approximation to the likelihood of Coventry et al.), which gives estimates of N, r, and per-gene mutation rates that are more robust to the choice of grid points
  • the maximum likelihood estimate for N was 4.0 million with a 2 log-likelihood profile likelihood confidence interval of (2.5 × 10^6, 5.0 × 10^6) and for r was 1.017 (CI = 1.012, 1.023)
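
The note above on the European expansion time says it is found by solving for when a population of 7,700 must start growing at rate r to reach size N. A minimal sketch of that calculation follows. It assumes r is a per-generation multiplicative factor (r = 1.017 meaning 1.7% growth per generation), so that 7,700 × r^t = N; the notes do not say whether growth is discrete or continuous, and the function name and arguments are hypothetical, not taken from the paper.

```python
import math


def expansion_start_generations(current_size, growth_factor, ancestral_size=7700.0):
    """Generations before present at which growth must begin.

    Solves ancestral_size * growth_factor**t = current_size for t.
    Assumption: growth_factor is a per-generation multiplier (e.g. 1.017).
    """
    if growth_factor <= 1.0:
        raise ValueError("growth_factor must exceed 1 for an expanding population")
    if current_size <= ancestral_size:
        raise ValueError("current_size must exceed ancestral_size")
    return math.log(current_size / ancestral_size) / math.log(growth_factor)


# Point estimates quoted in the notes: N = 4.0 million, r = 1.017.
t_mle = expansion_start_generations(4.0e6, 1.017)
print(f"expansion began about {t_mle:.0f} generations ago")
```

Under these assumptions the point estimates imply an expansion starting roughly 370 generations before the present. Because t depends on both N and r, the two confidence intervals cannot simply be combined to give an interval for t.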