zbMATH — the first resource for mathematics

The shape of the one-dimensional phylogenetic likelihood function. (English) Zbl 1370.05039
Summary: By fixing all parameters in a phylogenetic likelihood model except for one branch length, one obtains a one-dimensional likelihood function. In this work, we introduce a mathematical framework to characterize the shapes of such one-dimensional phylogenetic likelihood functions. This framework is based on analyses of algebraic structures on the space of all frequency patterns with respect to a polynomial representation of the likelihood functions. Using this framework, we provide conditions under which the one-dimensional phylogenetic likelihood functions are guaranteed to have at most one stationary point, and this point is the maximum likelihood branch length. These conditions are satisfied by common simple models including all binary models, the Jukes-Cantor model [T. H. Jukes and C. R. Cantor, “Evolution of protein molecules”, Mammalian Protein Metabolism 3, 21–132 (1969)] and the Felsenstein model [J. Felsenstein, “Evolutionary trees from DNA sequences: a maximum likelihood approach”, J. Mol. Evol. 17, 368–376 (1981)].
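As an illustration of the single-stationary-point behaviour described above, here is a minimal Python sketch of the simplest possible case: a tree with two leaves joined by one branch under the Jukes-Cantor model. This is not the paper's framework. The site counts (`n_sites`, `n_diff`), the grid, and the helper names are hypothetical, and branch length is assumed to be measured in expected substitutions per site.

```python
import math


def jc69_log_likelihood(t, n_sites, n_diff):
    """Log-likelihood of branch length t for two aligned sequences under JC69.

    Assumes uniform base frequencies and t in expected substitutions per site.
    """
    e = math.exp(-4.0 * t / 3.0)
    p_same = 0.25 + 0.75 * e   # probability that a site keeps its base
    p_diff = 0.25 - 0.25 * e   # probability of one specific different base
    return ((n_sites - n_diff) * math.log(0.25 * p_same)
            + n_diff * math.log(0.25 * p_diff))


def count_turning_points(values):
    """Count how often consecutive differences change sign along a grid."""
    diffs = [b - a for a, b in zip(values, values[1:])]
    signs = [d > 0 for d in diffs if d != 0]
    return sum(1 for a, b in zip(signs, signs[1:]) if a != b)


# Hypothetical data: 100 aligned sites, 20 of which differ.
n_sites, n_diff = 100, 20

# Evaluate the one-dimensional likelihood on a grid of branch lengths.
grid = [0.001 * i for i in range(1, 3001)]
values = [jc69_log_likelihood(t, n_sites, n_diff) for t in grid]

t_grid = grid[values.index(max(values))]
# Closed-form JC69 estimate, valid when n_diff / n_sites < 3/4.
t_closed = -0.75 * math.log(1.0 - (4.0 / 3.0) * n_diff / n_sites)
turning_points = count_turning_points(values)

print(t_grid, t_closed, turning_points)
```

On this grid the function rises to a single peak and then decreases, and the grid maximizer agrees with the closed-form estimate up to the grid spacing. A grid scan of this kind can only suggest the shape; the guarantee itself comes from the conditions stated in the summary.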
We then prove that for the simplest model that does not satisfy our conditions, namely, the Kimura 2-parameter model, the one-dimensional likelihood functions may have multiple stationary points. As a proof of concept, we construct a nondegenerate example in which the phylogenetic likelihood function has two local maxima and a local minimum. To construct such examples, we derive a general method of constructing a tree and sequence data with a specified frequency pattern at the root. We then extend the result to prove that the space of all rescaled and translated one-dimensional phylogenetic likelihood functions under the Kimura 2-parameter model is dense in the space of all nonnegative continuous functions on $[0,\infty)$ with finite limits. These results indicate that one-dimensional likelihood functions under advanced evolutionary models can be more complex than is typically assumed by phylogenetic inference algorithms; however, these complexities can be effectively captured by the Kimura 2-parameter model.
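For orientation, the following Python sketch writes out the standard Kimura 2-parameter transition probabilities. It does not reproduce the paper's multimodal example or its tree construction. The parameter names `alpha` (transition rate) and `beta` (rate of each of the two transversions) and the numerical values are assumptions made for illustration. The point to notice is that two distinct exponential terms appear, whereas the Jukes-Cantor model has only one.

```python
import math


def k2p_probabilities(t, alpha, beta):
    """Transition probabilities along a branch of length t under K2P.

    alpha: transition rate; beta: rate of each transversion (assumed names).
    Returns (P[no change], P[transition], P[one specific transversion]).
    """
    e1 = math.exp(-4.0 * beta * t)
    e2 = math.exp(-2.0 * (alpha + beta) * t)
    p_same = 0.25 + 0.25 * e1 + 0.5 * e2
    p_transition = 0.25 + 0.25 * e1 - 0.5 * e2
    p_transversion = 0.25 - 0.25 * e1
    return p_same, p_transition, p_transversion


def jc69_p_same(t):
    """JC69 probability of no change, with total substitution rate 1."""
    return 0.25 + 0.75 * math.exp(-4.0 * t / 3.0)


# Hypothetical rates: transitions four times as fast as each transversion.
p_same, p_ts, p_tv = k2p_probabilities(0.5, alpha=2.0, beta=0.5)
row_sum = p_same + p_ts + 2.0 * p_tv  # one transition, two transversions

# With alpha == beta the two exponentials coincide and K2P reduces to JC69.
jc_like = k2p_probabilities(0.5, alpha=1.0 / 3.0, beta=1.0 / 3.0)

print(row_sum, jc_like[0], jc69_p_same(0.5))
```

When `alpha` equals `beta` the two decay rates collapse into one and the Jukes-Cantor case is recovered, which is consistent with the summary's description of the Kimura 2-parameter model as the simplest model outside the single-stationary-point conditions.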
MSC:
 05C05 Trees
 05C90 Applications of graph theory
 92B10 Taxonomy, cladistics, statistics in mathematical biology
 05C25 Graphs and abstract algebra (groups, rings, fields, etc.)
 92D15 Problems related to evolution