Steinrücken, Matthias; Bhaskar, Anand; Song, Yun S.
A novel spectral method for inferring general diploid selection from time series genetic data. (English) Zbl 1454.62405
Ann. Appl. Stat. 8, No. 4, 2203-2222 (2014).
Summary: The increased availability of time series genetic variation data from experimental evolution studies and ancient DNA samples has created new opportunities to identify genomic regions under selective pressure and to estimate their associated fitness parameters. However, it is a challenging problem to compute the likelihood of nonneutral models for the population allele frequency dynamics, given the observed temporal DNA data. Here, we develop a novel spectral algorithm to analytically and efficiently integrate over all possible frequency trajectories between consecutive time points. This advance circumvents the limitations of existing methods which require fine-tuning the discretization of the population allele frequency space when numerically approximating requisite integrals. Furthermore, our method is flexible enough to handle general diploid models of selection where the heterozygote and homozygote fitness parameters can take any values, while previous methods focused on only a few restricted models of selection. We demonstrate the utility of our method on simulated data and also apply it to analyze ancient DNA data from genetic loci associated with coat coloration in horses. In contrast to previous studies, our exploration of the full fitness parameter space reveals that a heterozygote advantage form of balancing selection may have been acting on these loci.
Cited in 7 Documents
MSC: 62P10 Applications of statistics to biology and medical sciences; meta analysis 62M10 Time series, auto-correlation, regression, etc.
in statistics (GARCH) 92D10 Genetics and epigenetics
Keywords: population genetics; spectral method; transition density function; hidden Markov model