## Counting, generating and sampling tree alignments. (English) Zbl 1346.92048

Botón-Fernández, María (ed.) et al., Algorithms for computational biology. Third international conference, AlCoB 2016, Trujillo, Spain, June 21–22, 2016. Proceedings. Cham: Springer (ISBN 978-3-319-38826-7/pbk; 978-3-319-38827-4/ebook). Lecture Notes in Computer Science 9702. Lecture Notes in Bioinformatics, 53-64 (2016).
Summary: Pairwise ordered tree alignments are combinatorial objects that appear in RNA secondary structure comparison. However, the usual representation of tree alignments as supertrees is ambiguous, i.e., two distinct supertrees may induce identical sets of matches between identical pairs of trees. This ambiguity is uninformative, and detrimental to any probabilistic analysis. In this work, we consider tree alignments up to equivalence. Our first result is a precise asymptotic enumeration of tree alignments, obtained from a context-free grammar by means of basic analytic combinatorics. Our second result focuses on alignments between two given ordered trees. By refining our grammar to align specific trees, we obtain a decomposition scheme for the space of alignments, and use it to design an efficient dynamic programming algorithm for sampling alignments under the Gibbs-Boltzmann probability distribution. This generalizes existing tree alignment algorithms, and opens the door to a probabilistic analysis of the space of suboptimal RNA secondary structure alignments.
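The two ideas in the summary (counting alignments up to equivalence, then sampling them under a Gibbs-Boltzmann distribution) can be illustrated on a much simpler case. The sketch below is not the authors' tree grammar or algorithm: it swaps in plain sequences for ordered trees, and every name, cost, and parameter in it (`make_weights`, `partition_tables`, `sample_matches`, `mismatch`, `gap`, `temp`) is a hypothetical choice made for illustration. Ambiguity is removed by a canonical-order rule, assumed here, that forbids a deletion directly after an insertion, so each set of matches corresponds to exactly one alignment. An alignment with total cost E is then drawn with probability proportional to exp(-E/temp) by stochastic backtracking through the partition-function tables.

```python
import math
import random

def make_weights(mismatch=1.0, gap=1.0, temp=1.0):
    """Boltzmann weights exp(-cost/temp) for a toy cost model (assumed)."""
    w_gap = math.exp(-gap / temp)
    def w_pair(x, y):
        return math.exp(-(0.0 if x == y else mismatch) / temp)
    return w_pair, w_gap

def partition_tables(a, b, w_pair, w_gap):
    """F[i][j]: total weight of canonical alignments of a[:i] and b[:j].
    G[i][j]: same, restricted to alignments not ending in an insertion.
    A deletion may only extend a G-alignment, which makes the scheme unambiguous."""
    n, m = len(a), len(b)
    F = [[0.0] * (m + 1) for _ in range(n + 1)]
    G = [[0.0] * (m + 1) for _ in range(n + 1)]
    for i in range(n + 1):
        for j in range(m + 1):
            if i == 0 and j == 0:
                G[i][j] = 1.0  # the empty alignment
            else:
                g = 0.0
                if i > 0 and j > 0:  # last column matches a[i-1] with b[j-1]
                    g += F[i - 1][j - 1] * w_pair(a[i - 1], b[j - 1])
                if i > 0:            # last column deletes a[i-1]
                    g += G[i - 1][j] * w_gap
                G[i][j] = g
            # last column inserts b[j-1], or the alignment is counted in G
            F[i][j] = G[i][j] + (F[i][j - 1] * w_gap if j > 0 else 0.0)
    return F, G

def sample_matches(a, b, w_pair, w_gap, rng=random):
    """Draw one set of matches with probability weight / F[len(a)][len(b)]."""
    F, G = partition_tables(a, b, w_pair, w_gap)
    i, j = len(a), len(b)
    in_G = False  # True once the current suffix may not end in an insertion
    matches = []
    while (i, j) != (0, 0):
        if not in_G:
            if rng.random() * F[i][j] < G[i][j]:
                in_G = True
            else:
                j -= 1  # b[j-1] is inserted (unmatched)
        else:
            w_match = 0.0
            if i > 0 and j > 0:
                w_match = F[i - 1][j - 1] * w_pair(a[i - 1], b[j - 1])
            if rng.random() * G[i][j] < w_match:
                matches.append((i - 1, j - 1))
                i, j = i - 1, j - 1
                in_G = False
            else:
                i -= 1  # a[i-1] is deleted (unmatched)
    matches.reverse()
    return matches

if __name__ == "__main__":
    w_pair, w_gap = make_weights(mismatch=1.0, gap=1.0, temp=1.0)
    print(sample_matches("GCAUCG", "GCUACG", w_pair, w_gap, random.Random(0)))
```

With all costs set to zero, every weight equals 1 and `F[n][m]` counts the distinct match sets between sequences of lengths n and m, which is the binomial coefficient C(n+m, n). Lowering `temp` concentrates the samples on low-cost alignments, while raising it approaches the uniform distribution over match sets.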
MSC: 92D20 Protein sequences, DNA sequences; 05C90 Applications of graph theory

RNAforester; CONTRAlign
