Bayesian phylogenetic inference via Markov chain Monte Carlo methods. (English) Zbl 1059.62675

Summary: We derive a Markov chain to sample from the posterior distribution for a phylogenetic tree given sequence information from the corresponding set of organisms, a stochastic model for these data, and a prior distribution on the space of trees. A transformation of the tree into a canonical cophenetic matrix form suggests a simple and effective proposal distribution for selecting candidate trees close to the current tree in the chain. We illustrate the algorithm with restriction site data on 9 plant species, then extend to DNA sequences from 32 species of fish. The algorithm mixes well in both examples from random starting trees, generating reproducible estimates and credible sets for the path of evolution.
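
To make the cophenetic matrix form mentioned in the summary concrete, here is a minimal sketch that is not taken from the paper. It assumes a rooted binary tree stored as nested tuples `(height, left, right)` with leaf names as strings, and it takes the cophenetic value of two leaves to be the height of their most recent common ancestor. The tuple encoding, the function names `leaves` and `cophenetic`, and the example tree are all hypothetical choices made for illustration.

```python
from itertools import product


def leaves(node):
    """Return the leaf names below a node, in left-to-right order."""
    if isinstance(node, str):  # a leaf is just its name
        return [node]
    _, left, right = node
    return leaves(left) + leaves(right)


def cophenetic(node, matrix=None):
    """Fill a dict mapping leaf pairs to the height of their common ancestor.

    Assumed encoding: an internal node is (height, left, right).
    """
    if matrix is None:
        matrix = {}
    if isinstance(node, str):  # leaves contribute no pairs
        return matrix
    height, left, right = node
    # Every pair split across this node has this node as its common ancestor.
    for a, b in product(leaves(left), leaves(right)):
        matrix[(a, b)] = matrix[(b, a)] = height
    cophenetic(left, matrix)
    cophenetic(right, matrix)
    return matrix


# Hypothetical four-taxon tree: ((A, B) at height 1, (C, D) at height 2), root at 3.
tree = (3.0, (1.0, "A", "B"), (2.0, "C", "D"))
coph = cophenetic(tree)
```

In this representation, a candidate tree close to the current one could be obtained by slightly perturbing the stored heights, which is the general idea the summary alludes to; the actual proposal distribution used by the authors is not specified here.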

MSC:

62P10 Applications of statistics to biology and medical sciences; meta analysis
62F15 Bayesian inference
92D15 Problems related to evolution
65C40 Numerical analysis or methods applied to Markov chains
