×

Discrete coalescent trees. (English) Zbl 1475.92116

Summary: In many phylogenetic applications, such as cancer and virus evolution, time trees, evolutionary histories where speciation events are timed, are inferred. Of particular interest are clock-like trees, where all leaves are sampled at the same time and have equal distance to the root. One popular approach to model clock-like trees is coalescent theory, which is used in various tree inference software packages. Methodologically, phylogenetic inference methods require a tree space over which the inference is performed, and the geometry of this space plays an important role in statistical and computational aspects of tree inference algorithms. It has recently been shown that coalescent tree spaces possess a unique geometry, different from that of classical phylogenetic tree spaces. Here we introduce and study a space of discrete coalescent trees. They assume that time is discrete, which is natural in many computational applications. This tree space is a generalisation of the previously studied ranked nearest neighbour interchange space, and is built upon tree-rearrangement operations. We generalise existing results about ranked trees, including an algorithm for computing distances in polynomial time, and in particular provide new results for both the space of discrete coalescent trees and the space of ranked trees. We establish several geometrical properties of these spaces and show how these properties impact various algorithms used in phylogenetic analyses. Our tree space is a discretisation of a previously introduced time tree space, called \(t\)-space, and hence our results can be used to approximate solutions to various open problems in \(t\)-space.

MSC:

92D15 Problems related to evolution
05C85 Graph algorithms (graph-theoretic aspects)
68Q25 Analysis of algorithms and problem complexity
PDFBibTeX XMLCite
Full Text: DOI

References:

[1] Baroni, M.; Semple, C.; Steel, M., Hybrids in real time, Syst Biol, 55, 1, 46-56 (2006) · doi:10.1080/10635150500431197
[2] Billera, LJ; Holmes, SP; Vogtmann, K., Geometry of the space of phylogenetic trees, Adv Appl Math, 27, 4, 733-767 (2001) · Zbl 0995.92035 · doi:10.1006/aama.2001.0759
[3] Bordewich, M.; Semple, C., On the computational complexity of the rooted subtree prune and Regraft distance, Ann Comb, 8, 4, 409-423 (2005) · Zbl 1059.05035 · doi:10.1007/s00026-004-0229-z
[4] Bouckaert, R., BEAST 2: a software platform for Bayesian evolutionary analysis, PLoS Comput Biol, 10, 4, e1003537 (2014) · doi:10.1371/journal.pcbi.1003537
[5] Chan TM, Pătraşcu M (2010) Counting inversions, offline orthogonal range counting, and related problems, pp 161-173 · Zbl 1288.68105
[6] Collienne, L.; Gavryushkin, A., Computing nearest neighbour interchange distances between ranked phylogenetic trees, J Math Biol, 82, 1, 8 (2021) · Zbl 1511.92043 · doi:10.1007/s00285-021-01567-5
[7] Cueto, MA; Matsen, FA, Polyhedral geometry of phylogenetic rogue taxa, Bull Math Biol, 73, 6, 1202-1226 (2011) · Zbl 1215.92047 · doi:10.1007/s11538-010-9556-x
[8] Dasgupta B et al (2000) On computing the nearest neighbor interchange distance. In: Discrete mathematical problems with medical applications: DIMACS workshop discrete mathematical problems with medical applications, December 8-10, 1999, vol. 55. DIMACS Center, American Mathematical Soc., p 19 · Zbl 1133.92347
[9] Drummond, AJ, Bayesian coalescent inference of past population dynamics from molecular sequences, Mol Biol Evol, 22, 5, 1185-1192 (2005) · doi:10.1093/molbev/msi103
[10] Gavryushkin, A.; Drummond, AJ, The space of ultrametric phylogenetic trees, J Theor Biol, 403, 197-208 (2016) · Zbl 1343.92343 · doi:10.1016/j.jtbi.2016.05.001
[11] Gavryushkin, A.; Whidden, C.; Matsen, FA, The combinatorics of discrete time-trees: theory and open problems, J Math Biol, 76, 5, 1101-1121 (2018) · Zbl 1391.92030 · doi:10.1007/s00285-017-1167-9
[12] Hudson, RR, Gene genealogies and the coalescent process, Oxf Surv Evol Biol, 7, 1, 44 (1990)
[13] Kawahara, J.; Saitoh, T.; Yoshinaka, R., The time complexity of the token swapping problem and its parallel variants. Algorithms and computation, 448-459 (2017), Berlin: Springer, Berlin · Zbl 1485.68108
[14] Kingman, JFC, The coalescent, Stochastic Process Appl, 13, 3, 235-248 (1982) · Zbl 0491.60076 · doi:10.1016/0304-4149(82)90011-4
[15] Kozlov, AM, RAxML-NG: a fast, scalable and user-friendly tool for maximum likelihood phylogenetic inference, Bioinformatics, 35, 21, 4453-4455 (2019) · doi:10.1093/bioinformatics/btz305
[16] Kuhner, MK; Yamato, J.; Felsenstein, J., Maximum likelihood estimation of population growth rates based on the coalescent, Genetics, 149, 1, 429-434 (1998) · doi:10.1093/genetics/149.1.429
[17] Kuhner, MK, Coalescent genealogy samplers: windows into population history, Trends Ecol Evol, 24, 2, 86-93 (2009) · doi:10.1016/j.tree.2008.09.007
[18] Kumar, S.; Hedges, SB, Advances in time estimation methods for molecular data, Mol Biol Evol, 33, 4, 863-869 (2016) · doi:10.1093/molbev/msw026
[19] Li, M.; Tromp, J.; Zhang, L., Some notes on the nearest neighbour interchange distance. Computing and combinatorics, 343-351 (1996), Berlin: Springer, Berlin
[20] Miller, E.; Owen, M.; Provan, JS, Polyhedral computational geometry for averaging metric phylogenetic trees, Adv Appl Math, 68, 51-91 (2015) · Zbl 1329.68266 · doi:10.1016/j.aam.2015.04.002
[21] Nguyen, LT, IQ-TREE: a fast and effective stochastic algorithm for estimating maximum-likelihood phylogenies, Mol Biol Evol, 32, 1, 268-274 (2015) · doi:10.1093/molbev/msu300
[22] Ohtsuki H, Innan H (2017) Forward and backward evolutionary processes and allele frequency spectrum in a cancer cell population. Theor Popul Biol · Zbl 1393.92035
[23] Posada, D., Cell coal: coalescent simulation of single-cell sequencing samples, Mol Biol Evol, 37, 5, 1535-1542 (2020) · doi:10.1093/molbev/msaa025
[24] Ronquist, F.; Huelsenbeck, JP, MrBayes 3: bayesian phylogenetic inference under mixed models, Bioinformatics, 19, 12, 1572-1574 (2003) · doi:10.1093/bioinformatics/btg180
[25] Semple, C.; Steel, M., Phylogenetics (2003), Oxford: Oxford University Press, Oxford · Zbl 1043.92026
[26] Suchard, MA, Bayesian phylogenetic and phylodynamic data integration using BEAST 110, Virus Evol, 4, 1, 016 (2018) · doi:10.1093/ve/vey016
[27] Tamura, K., MEGA5: molecular evolutionary genetics analysis using maximum likelihood, evolutionary distance, and maximum parsimony methods, Mol Biol Evol, 28, 10, 2731-2739 (2011) · doi:10.1093/molbev/msr121
[28] Zuckerkandl E, Pauling L (1965) Evolutionary divergence and convergence in proteins, pp 97-166
This reference list is based on information provided by the publisher or from digital mathematics libraries. Its items are heuristically matched to zbMATH identifiers and may contain data conversion errors. In some cases that data have been complemented/enhanced by data from zbMATH Open. This attempts to reflect the references listed in the original paper as accurately as possible without claiming completeness or a perfect matching.