geographic structure

Lascoux M & Petit RJ 2010 The 'New Wave' in plant demographic inference: more loci and more individuals. Mol Ecol 19:1075-1078.

  • Instruct is an extension of Structure that eliminates the assumption of Hardy-Weinberg equilibrium within clusters
  • Instruct uses a different statistical approach to estimate the most likely number of clusters, K
  • estimating K is a notoriously difficult problem (Guillot et al. 2009)
  • with Structure, inference of the number of clusters relies on an estimate of Pr(X | K), the probability of the data given K, where X denotes the vector of observed genotypes
  • this estimate is an ad hoc and computationally convenient approximation
  • Instruct infers the optimal number of clusters via the Deviance Information Criterion
  • Pritchard et al. (2009) stress that the biological interpretation of K is not always straightforward and that 'it usually makes sense to focus on values of K that capture most of the structure in the data and that seem biologically sensible'
  • using Structure, eight populations are inferred with the ad hoc method of Pritchard et al. (2000) and two with the method of Evanno et al. (2005)
  • using Instruct and the corresponding criterion to select the most likely number of clusters, three populations are inferred
  • the latter solution (K = 3) was the one preferred by the authors
  • why?
  • Instruct uses a more formal decision criterion
  • this division in three populations fits well with the results of a multivariate analysis based on the SNP data
  • the solution K = 3 'captures most of the biologically relevant information of the data'
  • the latter might seem a rather subjective and weak argument
  • but probably reflects the attitude most would adopt implicitly when faced with such a situation
  • it illustrates, if needed, the strengths and limitations of Bayesian clustering methods
  • especially in the case of more or less continuous populations
  • since the notion of subpopulation is a theoretical construct that only imperfectly reflects reality, it is clear that the problem of estimating the number of subpopulations will never be satisfactorily resolved
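
The three ways of choosing K mentioned in these notes can be compared with a minimal Python sketch. It is not the code of Structure or Instruct, and all function names and input layouts are hypothetical. It assumes that the ad hoc Structure estimate is the mean of the sampled log-likelihoods minus half their variance, that the Evanno et al. (2005) statistic is computed from the mean L(K) over replicate runs (one common reading of the method), and that the Deviance Information Criterion takes its generic form (mean deviance plus an effective number of parameters). Whether Instruct uses exactly this form of the criterion is not stated in these notes.

```python
from statistics import mean, stdev, variance


def structure_ln_prob(log_liks):
    """Ad hoc estimate of ln Pr(X | K) from MCMC samples of the
    log-likelihood: mean(lnL) - var(lnL) / 2 (assumed form)."""
    return mean(log_liks) - variance(log_liks) / 2


def evanno_delta_k(runs):
    """Delta K from replicate runs.

    runs: dict mapping K -> list of L(K) values, one per replicate run
    (hypothetical layout). Uses the mean L(K) per K for the second-order
    difference and the standard deviation of L(K) across replicates.
    Only defined for K with both neighbours present; a zero standard
    deviation would raise ZeroDivisionError.
    """
    avg = {k: mean(v) for k, v in runs.items()}
    out = {}
    for k in sorted(runs):
        if k - 1 in runs and k + 1 in runs:
            second_diff = abs(avg[k + 1] - 2 * avg[k] + avg[k - 1])
            out[k] = second_diff / stdev(runs[k])
    return out


def dic(deviances, deviance_at_mean=None):
    """Generic DIC = mean deviance + pD (lower is better).

    If the deviance at the posterior mean of the parameters is supplied,
    pD = mean deviance - deviance_at_mean; otherwise the variance-based
    alternative pD = var(deviance) / 2 is used (an assumption here).
    """
    d_bar = mean(deviances)
    if deviance_at_mean is None:
        p_d = variance(deviances) / 2
    else:
        p_d = d_bar - deviance_at_mean
    return d_bar + p_d


if __name__ == "__main__":
    # Made-up numbers, for illustration only.
    toy_runs = {1: [-1000, -1002], 2: [-900, -904],
                3: [-890, -894], 4: [-888, -892]}
    delta_k = evanno_delta_k(toy_runs)
    print("Delta K:", delta_k)
    print("K with largest Delta K:", max(delta_k, key=delta_k.get))
    print("ad hoc ln Pr(X|K):", structure_ln_prob([-100, -102, -98]))
    print("DIC:", dic([200, 204, 196]))
```

With the ad hoc estimate one looks for the K with the highest value, with Delta K for the largest peak, and with DIC for the lowest value. The three criteria need not agree on the same data, as the eight, two and three populations reported above illustrate.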