×

Phase transitions in phylogeny. (English) Zbl 1041.92018

The paper studies a model describing evolution of species. The model is defined on a tree \(T = (V,E)\) rooted at \(\rho \in V\), each vertex of which \(v \in V\) bears a color \(\sigma_v\) chosen from a finite set \(\mathcal{A}\). The root color is chosen in accordance with some initial distribution \(\text{ Prob}(\sigma_\rho = i) = \pi_i\). The mutation of colors along the edge \(e\in E\) is described by a stochastic matrix \(M^e\). These elements determine a Markov field on \(T\). A vertex \(v\in V\) is called a leaf if it has degree one in \(T\); furthermore, \(\partial T\) stands for the set of all such leaves. The vertices which are not the leaves are called internal. It is assumed that all the internal vertices are of degree at least three and that \(\rho\) is an internal vertex. The topology of \(T\) is defined to be the set of pairwise graph-metric distances between the leaves. Let \(\sigma_{\partial T}^t\), \(t = 1, \dots , k\), \(k\in \mathbb{N}\), be independent samples of the configurations born by the leaves.
The paper addresses the problem of ‘reconstructing’ the topology of \(T\) from \(k\) samples with probability \(1-\delta\) for given \(n \in \mathbb{N}\) and \(\delta \in (0, 1)\). This problem is solved for a particular case of the model – the so called balanced Cavender-Farris-Neyman (CFN) model, for which \(| \mathcal{A}| =2\), all the leaves are at the same distance from the root, all the internal vertices are of the same degree \(b\geq 3\), and \(M^e = (M^e_{ij})_{2\times 2}\), \(M^e_{11} = M^{e}_{22} = 1 - \epsilon (e)\), \(M^e_{12} = M^{e}_{21} = \epsilon (e)\), where \(\epsilon(e)\in (0, 1/2)\) for all \(e\in E\). This model may be associated with the Ising model on \(T\) with the order parameter \(\theta (e) = 1 - 2\epsilon(e)\). It is shown that if \(\theta (e) = \theta\) and \(b \theta^2 >1\) (the latter corresponds to the ordered phase of the Ising model), the number of samples \(k\) should satisfy the estimate \(k\geq c_\theta (\log n - \log \delta)\) with a certain \(c_\theta >0\). In the unordered case, this number is shown to have a polynomial lower bound. More general cases of the CFN model are also studied. Reading the paper is rather a hard task.

MSC:

92D15 Problems related to evolution
60K35 Interacting random processes; statistical mechanics type models; percolation theory
60J85 Applications of branching processes
82B26 Phase transitions (general) in equilibrium statistical mechanics
05C05 Trees
05C90 Applications of graph theory
PDFBibTeX XMLCite
Full Text: DOI arXiv

References:

[1] Noga Alon and Joel H. Spencer, The probabilistic method, 2nd ed., Wiley-Interscience Series in Discrete Mathematics and Optimization, Wiley-Interscience [John Wiley & Sons], New York, 2000. With an appendix on the life and work of Paul Erdős. · Zbl 0996.05001
[2] A. Ambainis, R. Depser, M. Farach-Colton and S. Kannan (1999) Tight bounds on learnability of evolution, preprint.
[3] S. L. Bezrukov, Isoperimetric problems in discrete spaces, Extremal problems for finite sets (Visegrád, 1991) Bolyai Soc. Math. Stud., vol. 3, János Bolyai Math. Soc., Budapest, 1994, pp. 59 – 91. · Zbl 0818.05061
[4] P. M. Bleher, J. Ruiz, and V. A. Zagrebnov, On the purity of the limiting Gibbs state for the Ising model on the Bethe lattice, J. Statist. Phys. 79 (1995), no. 1-2, 473 – 482. · Zbl 1081.82515 · doi:10.1007/BF02179399
[5] James A. Cavender, Taxonomy with confidence, Math. Biosci. 40 (1978), no. 3-4, 271 – 280. · Zbl 0391.92015 · doi:10.1016/0025-5564(78)90089-5
[6] Thomas M. Cover and Joy A. Thomas, Elements of information theory, Wiley Series in Telecommunications, John Wiley & Sons, Inc., New York, 1991. A Wiley-Interscience Publication. · Zbl 0762.94001
[7] Mary Cryan, Leslie Ann Goldberg, and Paul W. Goldberg, Evolutionary trees can be learned in polynomial time in the two-state general Markov model, SIAM J. Comput. 31 (2001), no. 2, 375 – 397. · Zbl 1052.68061 · doi:10.1137/S0097539798342496
[8] Péter L. Erdős, Michael A. Steel, László A. Székely, and Tandy J. Warnow, A few logs suffice to build (almost) all trees. I, Random Structures Algorithms 14 (1999), no. 2, 153 – 184. , https://doi.org/10.1002/(SICI)1098-2418(199903)14:23.3.CO;2-I · Zbl 0945.60004
[9] Péter L. Erdős, Michael A. Steel, László A. Székely, and Tandy J. Warnow, A few logs suffice to build (almost) all trees. II, Theoret. Comput. Sci. 221 (1999), no. 1-2, 77 – 118. ICALP ’97 (Bologna). · Zbl 0933.68100 · doi:10.1016/S0304-3975(99)00028-6
[10] William Evans, Claire Kenyon, Yuval Peres, and Leonard J. Schulman, Broadcasting on trees and the Ising model, Ann. Appl. Probab. 10 (2000), no. 2, 410 – 433. · Zbl 1052.60076 · doi:10.1214/aoap/1019487349
[11] J. S. Farris (1973) A probability model for inferring evolutionary trees, Syst. Zool., 22, 250-256.
[12] P. Frankl and Z. Füredi, A short proof for a theorem of Harper about Hamming-spheres, Discrete Math. 34 (1981), no. 3, 311 – 313. · Zbl 0482.05002 · doi:10.1016/0012-365X(81)90009-1
[13] Hans-Otto Georgii, Gibbs measures and phase transitions, De Gruyter Studies in Mathematics, vol. 9, Walter de Gruyter & Co., Berlin, 1988. · Zbl 0657.60122
[14] Geoffrey Grimmett, Percolation, 2nd ed., Grundlehren der Mathematischen Wissenschaften [Fundamental Principles of Mathematical Sciences], vol. 321, Springer-Verlag, Berlin, 1999. · Zbl 0926.60004
[15] L. H. Harper, Optimal numberings and isoperimetric problems on graphs, J. Combinatorial Theory 1 (1966), 385 – 393. · Zbl 0158.20802
[16] Yasunari Higuchi, Remarks on the limiting Gibbs states on a (\?+1)-tree, Publ. Res. Inst. Math. Sci. 13 (1977/78), no. 2, 335 – 348. · Zbl 0376.60096 · doi:10.2977/prims/1195189812
[17] Dmitry Ioffe, On the extremality of the disordered state for the Ising model on the Bethe lattice, Lett. Math. Phys. 37 (1996), no. 2, 137 – 143. · Zbl 0848.60090 · doi:10.1007/BF00416016
[18] S. Janson and E. Mossel (2003). Robust reconstruction on trees is determined by the second eigenvalue, submitted. · Zbl 1061.60105
[19] 42nd IEEE Symposium on Foundations of Computer Science, IEEE Computer Society, Los Alamitos, CA, 2001.
[20] H. Kesten and B. P. Stigum, Additional limit theorems for indecomposable multidimensional Galton-Watson processes, Ann. Math. Statist. 37 (1966), 1463 – 1481. · Zbl 0203.17402 · doi:10.1214/aoms/1177699139
[21] Thomas M. Liggett, Interacting particle systems, Grundlehren der Mathematischen Wissenschaften [Fundamental Principles of Mathematical Sciences], vol. 276, Springer-Verlag, New York, 1985. · Zbl 0559.60078
[22] Russell Lyons, The Ising model and percolation on trees and tree-like graphs, Comm. Math. Phys. 125 (1989), no. 2, 337 – 353. · Zbl 0679.60101
[23] Elchanan Mossel, Recursive reconstruction on periodic trees, Random Structures Algorithms 13 (1998), no. 1, 81 – 97. , https://doi.org/10.1002/(SICI)1098-2418(199808)13:13.3.CO;2-K · Zbl 0959.05112
[24] Elchanan Mossel, Reconstruction on trees: beating the second eigenvalue, Ann. Appl. Probab. 11 (2001), no. 1, 285 – 300. · Zbl 1021.90008 · doi:10.1214/aoap/998926994
[25] E. Mossel (2003) On the impossibility of reconstructing ancestral data and phylogenetic trees, Jour. Comput. Biol., to appear.
[26] E. Mossel and Y. Peres (2003) Information flow on trees, Ann. Appl. Probab., to appear. · Zbl 1050.60082
[27] E. Mossel and M. A. Steel (2003) A phase transition for a random cluster model on phylogenetic trees, submitted. · Zbl 1047.92032
[28] Jerzy Neyman, Molecular studies of evolution: A source of novel statistical problems, Statistical decision theory and related topics (Proc. Sympos., Purdue Univ., Lafayette, Ind., 1970) Academic Press, New York, 1971, pp. 1 – 27.
[29] Yuval Peres, Probability on trees: an introductory climb, Lectures on probability theory and statistics (Saint-Flour, 1997) Lecture Notes in Math., vol. 1717, Springer, Berlin, 1999, pp. 193 – 280. · Zbl 0957.60001 · doi:10.1007/978-3-540-48115-7_3
[30] Frank Spitzer, Markov random fields on an infinite tree, Ann. Probability 3 (1975), no. 3, 387 – 398. · Zbl 0313.60072
[31] M. A. Steel (1994) Recovering a tree from the leaf colourations it generates under a Markov model, App. Math. Lett., 7(2), 19-24. · Zbl 0794.60071
[32] M.A. Steel, L.A. Székely and M. D. Hendy (1994). Reconstructing trees when sequence sites evolve at variable rates, Jour. Comput. Biol., 1, 153-163.
[33] M. A. Steel (2001), My favorite conjecture, http://www.math.canterbury.ac.nz/\(\sim\)mathmas/ conjecture.pdf.
This reference list is based on information provided by the publisher or from digital mathematics libraries. Its items are heuristically matched to zbMATH identifiers and may contain data conversion errors. In some cases that data have been complemented/enhanced by data from zbMATH Open. This attempts to reflect the references listed in the original paper as accurately as possible without claiming completeness or a perfect matching.