zbMATH — the first resource for mathematics

Demographic inference using genetic data from a single individual: separating population size variation from population structure. (English) Zbl 1342.91033
Summary: The rapid development of sequencing technologies represents new opportunities for population genetics research. It is expected that genomic data will increase our ability to reconstruct the history of populations. While this increase in genetic information will likely help biologists and anthropologists to reconstruct the demographic history of populations, it also represents new challenges. Recent work has shown that structured populations generate signals of population size change. As a consequence, it is often difficult to determine whether demographic events such as expansions or contractions (bottlenecks) inferred from genetic data are real or due to the fact that populations are structured in nature. Given that few inferential methods allow us to account for that structure, and that genomic data will necessarily increase the precision of parameter estimates, it is important to develop new approaches. In the present study we analyze two demographic models. The first is a model of instantaneous population size change, whereas the second is the classical symmetric island model. We (i) re-derive the distribution of coalescence times under the two models for a sample of size two, (ii) use a maximum likelihood approach to estimate the parameters of these models, (iii) validate this estimation procedure under a wide array of parameter combinations, (iv) implement and validate a model rejection procedure by using a Kolmogorov-Smirnov test, and a model choice procedure based on the AIC, and (v) derive the explicit distribution for the number of differences between two non-recombining sequences. Altogether we show that it is possible to estimate parameters under several models and perform efficient model choice using genetic data from a single diploid individual.
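
The ingredients named in the summary (a coalescence-time distribution for a sample of size two, a likelihood, a Kolmogorov-Smirnov statistic and the AIC) can be illustrated for the instantaneous size-change model with a minimal Python sketch. The parameterization is an assumption made here and is not taken from the summary: time is measured in units of twice the present population size, the size changes by a factor `alpha` at time `T` in the past, so the pairwise coalescence rate is 1 before `T` and `1/alpha` after it. All function names are hypothetical, and the symmetric island model is not covered.

```python
import math
import random


def size_change_log_pdf(t, alpha, T):
    """Log-density of the pairwise coalescence time (assumed parameterization).

    Coalescence rate is 1 for t < T and 1/alpha for t >= T (time runs backward).
    """
    if t < T:
        return -t
    return -T - (t - T) / alpha - math.log(alpha)


def size_change_pdf(t, alpha, T):
    """Density corresponding to size_change_log_pdf (zero for negative times)."""
    return math.exp(size_change_log_pdf(t, alpha, T)) if t >= 0 else 0.0


def size_change_cdf(t, alpha, T):
    """Cumulative distribution function of the pairwise coalescence time."""
    if t <= 0:
        return 0.0
    if t < T:
        return 1.0 - math.exp(-t)
    return 1.0 - math.exp(-T - (t - T) / alpha)


def sample_t2(alpha, T, rng):
    """Draw one coalescence time by inverting the cumulative hazard."""
    e = -math.log(1.0 - rng.random())  # standard exponential variate
    return e if e < T else T + alpha * (e - T)


def log_likelihood(times, alpha, T):
    """Log-likelihood of independent coalescence times under the model."""
    return sum(size_change_log_pdf(t, alpha, T) for t in times)


def aic(loglik, n_params):
    """Akaike information criterion: 2k - 2 log L (smaller is better)."""
    return 2 * n_params - 2 * loglik


def ks_statistic(times, cdf):
    """One-sample Kolmogorov-Smirnov distance between the data and a model CDF."""
    xs = sorted(times)
    n = len(xs)
    d = 0.0
    for i, x in enumerate(xs):
        f = cdf(x)
        d = max(d, (i + 1) / n - f, f - i / n)
    return d


if __name__ == "__main__":
    rng = random.Random(0)
    times = [sample_t2(10.0, 0.5, rng) for _ in range(5000)]
    print("KS distance:", ks_statistic(times, lambda t: size_change_cdf(t, 10.0, 0.5)))
    print("AIC (size change):", aic(log_likelihood(times, 10.0, 0.5), 2))
```

In practice the parameters would be estimated by maximizing `log_likelihood` numerically; the reference list includes the Nelder-Mead simplex method and SciPy, but the optimizer is left out of this sketch.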

91D20 Mathematical geography and demography
92D10 Genetics and epigenetics
[1] Akaike, H., A new look at the statistical model identification, IEEE Trans. Automat. Control, 19, 6, 716-723, (1974) · Zbl 0314.62039
[2] Beaumont, M. A., Detecting population expansion and decline using microsatellites, Genetics, 153, 4, 2013-2029, (1999)
[3] Beaumont, M. A., Approximate Bayesian computation in evolution and ecology, Annu. Rev. Ecol. Evol. Syst., 41, 379-406, (2010)
[4] Beaumont, M. A.; Zhang, W.; Balding, D. J., Approximate Bayesian computation in population genetics, Genetics, 162, 4, 2025-2035, (2002)
[5] Broquet, T.; Angelone, S.; Jaquiery, J.; Joly, P.; Lena, J.-P.; Lengagne, T.; Plenet, S.; Luquet, E.; Perrin, N., Genetic bottlenecks driven by population disconnection, Conserv. Biol., 24, 6, 1596-1605, (2010)
[6] Chen, C.; Durand, E.; Forbes, F.; François, O., Bayesian clustering algorithms ascertaining spatial population structure: a new computer program and a comparison study, Mol. Ecol. Notes, 7, 5, 747-756, (2007)
[7] Chikhi, L.; Sousa, V. C.; Luisi, P.; Goossens, B.; Beaumont, M. A., The confounding effects of population structure, genetic diversity and the sampling scheme on the detection and quantification of population size changes, Genetics, 186, 3, 983-995, (2010)
[8] Corander, J.; Waldmann, P.; Marttinen, P.; Sillanpää, M. J., BAPS 2: enhanced possibilities for the analysis of genetic population structure, Bioinformatics, 20, 15, 2363-2369, (2004)
[9] Cornuet, J.-M.; Santos, F.; Beaumont, M. A.; Robert, C. P.; Marin, J.-M.; Balding, D. J.; Guillemaud, T.; Estoup, A., Inferring population history with DIY ABC: a user-friendly approach to approximate Bayesian computation, Bioinformatics, 24, 23, 2713-2719, (2008)
[10] Currat, M.; Excoffier, L.; Maddison, W.; Otto, S. P.; Ray, N.; Whitlock, M. C.; Yeaman, S., Comment on “ongoing adaptive evolution of ASPM, a brain size determinant in Homo sapiens” and “microcephalin, a gene regulating brain size, continues to evolve adaptively in humans”, Science, 313, 5784, 172, (2006)
[11] Donnelly, P.; Tavaré, S., Coalescents and genealogical structure under neutrality, Annu. Rev. Genet., 29, 1, 401-421, (1995)
[12] Durrett, R., Probability models for DNA sequence evolution, (2008), Springer · Zbl 1311.92007
[13] Girod, C.; Vitalis, R.; Leblois, R.; Fréville, H., Inferring population decline and expansion from microsatellite data: A simulation-based evaluation of the msvar method, Genetics, 188, 1, 165-179, (2011), URL: http://www.genetics.org/content/188/1/165.abstract
[14] Goossens, B.; Chikhi, L.; Ancrenaz, M.; Lackman-Ancrenaz, I.; Andau, P.; Bruford, M. W., Genetic signature of anthropogenic population collapse in orang-utans, PLoS Biol., 4, 2, e25, (2006)
[15] Griffiths, R., The number of heterozygous loci between two randomly chosen completely linked sequences of loci in two subdivided population models, J. Math. Biol., 12, 2, 251-261, (1981) · Zbl 0455.92008
[16] Guillot, G.; Mortier, F.; Estoup, A., GENELAND: a computer package for landscape genetics, Mol. Ecol. Notes, 5, 3, 712-715, (2005)
[17] Hallatschek, O.; Fisher, D. S., Acceleration of evolutionary spread by long-range dispersal, Proc. Natl. Acad. Sci., 111, 46, E4911-E4919, (2014)
[18] Heller, R.; Chikhi, L.; Siegismund, H. R., The confounding effect of population structure on Bayesian skyline plot inferences of demographic history, PLoS One, 8, 5, e62992, (2013)
[19] Herbots, H. M. J. D., Stochastic models in population genetics: genealogy and genetic differentiation in structured populations, (1994), (Ph.D. thesis)
[20] Hudson, R. R., Generating samples under a Wright-Fisher neutral model of genetic variation, Bioinformatics, 18, 2, 337-338, (2002), URL: http://bioinformatics.oxfordjournals.org/content/18/2/337.abstract
[21] Hudson, R. R., Gene genealogies and the coalescent process, Oxf. Surv. Evol. Biol., 7, 1, 44, (1990)
[22] Jones, E.; Oliphant, T.; Peterson, P.; et al., SciPy: Open source scientific tools for Python, (2001), URL: http://www.scipy.org/ [Online; accessed 18.11.14]
[23] Kimura, M.; Weiss, G. H., The stepping stone model of population structure and the decrease of genetic correlation with distance, Genetics, 49, 4, 561-576, (1964)
[24] Leblois, R.; Estoup, A.; Streiff, R., Genetics of recent habitat contraction and reduction in population size: does isolation by distance matter?, Mol. Ecol., 15, 12, 3601-3615, (2006)
[25] Leblois, R.; Estoup, A.; Streiff, R., Genetics of recent habitat contraction and reduction in population size: does isolation by distance matter?, Mol. Ecol., 15, 12, 3601-3615, (2006)
[26] Li, H.; Durbin, R., Inference of human population history from individual whole-genome sequences, Nature, 475, 7357, 493-496, (2011)
[27] McManus, K. F.; Kelley, J. L.; Song, S.; Veeramah, K.; Woerner, A. E.; Stevison, L. S.; Ryder, O. A.; Kidd, J. M.; Wall, J. D.; Bustamante, C. D.; Hammer, M., Inference of gorilla demographic and selective history from whole genome sequence data, Mol. Biol. Evol., 600-612, (2015)
[28] Nei, M.; Takahata, N., Effective population size, genetic diversity, and coalescence time in subdivided populations, J. Mol. Evol., 37, 3, 240-244, (1993)
[29] Nelder, J. A.; Mead, R., A simplex method for function minimization, Comput. J., 7, 4, 308-313, (1965) · Zbl 0229.65053
[30] Nielsen, R.; Beaumont, M. A., Statistical inferences in phylogeography, Mol. Ecol., 18, 6, 1034-1047, (2009)
[31] Olivieri, G. L.; Sousa, V.; Chikhi, L.; Radespiel, U., From genetic diversity and structure to conservation: genetic signature of recent population declines in three mouse lemur species (Microcebus spp.), Biol. Conserv., 141, 5, 1257-1271, (2008)
[32] Paz-Vinas, I.; Quéméré, E.; Chikhi, L.; Loot, G.; Blanchet, S., The demographic history of populations experiencing asymmetric gene flow: combining simulated and empirical data, Mol. Ecol., 22, 12, 3279-3291, (2013)
[33] Peter, B. M.; Wegmann, D.; Excoffier, L., Distinguishing between population bottleneck and population subdivision by a Bayesian model choice procedure, Mol. Ecol., 19, 21, 4648-4660, (2010)
[34] Pritchard, J. K.; Stephens, M.; Donnelly, P., Inference of population structure using multilocus genotype data, Genetics, 155, 2, 945-959, (2000)
[35] Quéméré, E.; Amelot, X.; Pierson, J.; Crouau-Roy, B.; Chikhi, L., Genetic data suggest a natural prehuman origin of open habitats in northern Madagascar and question the deforestation narrative in this region, Proc. Natl. Acad. Sci., 109, 32, 13028-13033, (2012)
[36] Rogers, A. R.; Harpending, H., Population growth makes waves in the distribution of pairwise genetic differences, Mol. Biol. Evol., 9, 3, 552-569, (1992)
[37] Salmona, J.; Salamolard, M.; Fouillot, D.; Ghestemme, T.; Larose, J.; Centon, J.-F.; Sousa, V.; Dawson, D. A.; Thebaud, C.; Chikhi, L., Signature of a pre-human population decline in the critically endangered Reunion Island endemic forest bird Coracina newtoni, PLoS One, 7, 8, e43524, (2012)
[38] Schiffels, S.; Durbin, R., Inferring human population size and separation history from multiple genome sequences, Nature Genet., (2014)
[39] Sousa, V. C.; Beaumont, M. A.; Fernandes, P.; Coelho, M. M.; Chikhi, L., Population divergence with or without admixture: selecting models using an ABC approach, Heredity, 108, 5, 521-530, (2012), URL: http://dx.doi.org/10.1038/hdy.2011.116
[40] Städler, T.; Haubold, B.; Merino, C.; Stephan, W.; Pfaffelhuber, P., The impact of sampling schemes on the site frequency spectrum in nonequilibrium subdivided populations, Genetics, 182, 1, 205-216, (2009)
[41] Tavaré, S., Part I: ancestral inference in population genetics, (Lectures on Probability Theory and Statistics, (2004), Springer), 1-188 · Zbl 1062.92046
[42] Vitti, J. J.; Grossman, S. R.; Sabeti, P. C., Detecting natural selection in genomic data, Annu. Rev. Genet., 47, 97-120, (2013)
[43] Wakeley, J., Nonequilibrium migration in human history, Genetics, 153, 4, 1863-1871, (1999)
[44] Wright, S., Evolution in Mendelian populations, Genetics, 16, 2, 97, (1931)
This reference list is based on information provided by the publisher or from digital mathematics libraries. Its items are heuristically matched to zbMATH identifiers and may contain data conversion errors. It attempts to reflect the references listed in the original paper as accurately as possible without claiming the completeness or perfect precision of the matching.