×

zbMATH — the first resource for mathematics

Algebraic dynamic programming on trees. (English) Zbl 07052073
Summary: Where string grammars describe how to generate and parse strings, tree grammars describe how to generate and parse trees. We show how to extend generalized algebraic dynamic programming to tree grammars. The resulting dynamic programming algorithms are efficient and provide the complete feature set available to string grammars, including automatic generation of outside parsers and algebra products for efficient backtracking. The complete parsing infrastructure is available as an embedded domain-specific language in Haskell. In addition to the formal framework, we provide implementations for both tree alignment and tree editing. Both algorithms are in active use in, among others, the area of bioinformatics, where optimization problems on trees are of considerable practical importance. This framework and the accompanying algorithms provide a beneficial starting point for developing complex grammars with tree- and forest-based inputs.
MSC:
68 Computer science
90 Operations research, mathematical programming
PDF BibTeX XML Cite
Full Text: DOI
References:
[1] Bellman, R.; On the theory of dynamic programming; Proc. Natl. Acad. Sci. USA: 1952; Volume 38 ,716-719. · Zbl 0047.13802
[2] ; Time Warps, String Edits, and Macromolecules: The Theory and Practice of Sequence Comparison: Reading, MA, USA 1983; .
[3] Hofacker, I.L.; Fontana, W.; Stadler, P.F.; Bonhoeffer, L.S.; Tacker, M.; Schuster, P.; Fast folding and comparison of RNA secondary structures; Monatshefte für Chemie/Chem. Mon.: 1994; Volume 125 ,167-188.
[4] Lorenz, R.; Bernhart, S.H.; Höner zu Siederdissen, C.; Tafer, H.; Flamm, C.; Stadler, P.F.; Hofacker, I.L.; ViennaRNA Package 2.0; Algorithms Mol. Biol.: 2011; .
[5] Durbin, R.; Eddy, S.R.; Krogh, A.; Mitchison, G.; ; Biological Sequence Analysis: Probabilistic Models of Proteins and Nucleic Acids: Cambridge, UK 1998; . · Zbl 0929.92010
[6] Felsenstein, J.; Evolutionary trees from DNA sequences: A maximum likelihood approach; J. Mol. Evol.: 1981; Volume 17 ,368-376.
[7] Chauve, C.; Courtiel, J.; Ponty, Y.; Counting, generating and sampling tree alignments; Proceedings of the International Conference on Algorithms for Computational Biology: ; ,53-64. · Zbl 1346.92048
[8] Zhang, K.; Shasha, D.; Simple fast algorithms for the editing distance between trees and related problems; SIAM J Comput.: 1989; Volume 18 ,1245-1262. · Zbl 0692.68047
[9] Jacox, E.; Chauve, C.; Szöllősi, G.J.; Ponty, Y.; Scornavacca, C.; ecceTERA: Comprehensive gene tree-species tree reconciliation using parsimony; Bioinformatics: 2016; Volume 32 ,2056-2058.
[10] Schirmer, S.; Ponty, Y.; Giegerich, R.; Introduction to RNA secondary structure comparison; RNA Sequence, Structure, and Function: Computational and Bioinformatic Methods: Totowa, NJ, USA 2014; ,247-273.
[11] Rinaudo, P.; Ponty, Y.; Barth, D.; Denise, A.; Tree decomposition and parameterized algorithms for RNA structure-sequence alignment including tertiary interactions and pseudoknots; Proceedings of the 12th Workshop on Algorithms in Bioinformatics (WABI 2012): ; ,149-164.
[12] Giegerich, R.; Touzet, H.; Modeling Dynamic Programming Problems over Sequences and Trees with Inverse Coupled Rewrite Systems; Algorithms: 2014; Volume 7 ,62-144. · Zbl 07042196
[13] Cantor, D.G.; On the ambiguity problem of Backus systems; J. ACM: 1962; Volume 9 ,477-479. · Zbl 0114.33003
[14] Floyd, R.W.; On ambiguity in phrase structure languages; Commun. ACM: 1962; Volume 5 ,526-534. · Zbl 0227.68037
[15] Brabrand, C.; Giegerich, R.; Møller, A.; Analyzing ambiguity of context-free grammars; Sci. Comput. Program.: 2010; Volume 75 ,176-191. · Zbl 1189.68068
[16] Giegerich, R.; A systematic approach to dynamic programming in bioinformatics; Bioinformatics: 2000; Volume 16 ,665-677.
[17] Höner zu Siederdissen, C.; Hofacker, I.L.; Stadler, P.F.; Product Grammars for Alignment and Folding; IEEE/ACM Trans. Comput. Biol. Bioinform.: 2015; Volume 12 ,507-519.
[18] Höner zu Siederdissen, C.; Sneaking around concatMap: Efficient combinators for dynamic programming; Proceedings of the 17th ACM SIGPLAN International Conference on Functional Programming, ICFP ’12: New York, NY, USA 2012; Volume Volume 47 ,215-226. · Zbl 1291.68122
[19] Sauthoff, G.; Janssen, S.; Giegerich, R.; Bellman’s GAP—A Declarative Language for Dynamic Programming; Proceedings of the 13th International ACM SIGPLAN Symposium on Principles and Practices of Declarative Programming, PPDP ’11: ; ,29-40.
[20] Sauthoff, G.; Möhl, M.; Janssen, S.; Giegerich, R.; Bellman’s GAP—A Language and Compiler for Dynamic Programming in Sequence Analysis; Bioinformatics: 2013; Volume 29 ,551-560.
[21] Brainerd, W.S.; Tree generating regular systems; Inf. Control: 1969; Volume 14 ,217-231. · Zbl 0169.31601
[22] Giegerich, R.; Code selection by inversion of order-sorted derivors; Theor. Comput. Sci.: 1990; Volume 73 ,177-211. · Zbl 0701.68018
[23] Giegerich, R.; Meyer, C.; Steffen, P.; A Discipline of Dynamic Programming over Sequence Data; Sci. Comput. Program.: 2004; Volume 51 ,215-263. · Zbl 1067.90157
[24] Höner zu Siederdissen, C.; Prohaska, S.J.; Stadler, P.F.; Algebraic Dynamic Programming over General Data Structures; BMC Bioinform.: 2015; Volume 16 .
[25] Giegerich, R.; Meyer, C.; Algebraic Dynamic Programming; Algebraic Methodology and Software Technology: Berlin/Heidelberg, Germany 2002; Volume Volume 2422 ,243-257.
[26] Needleman, S.B.; Wunsch, C.D.; A general method applicable to the search for similarities in the amino acid sequence of two proteins; J. Mol. Biol.: 1970; Volume 48 ,443-453.
[27] Höner zu Siederdissen, C.; Hofacker, I.L.; Stadler, P.F.; How to Multiply Dynamic Programming Algorithms; Brazilian Symposium on Bioinformatics (BSB 2013): Heidelberg, Germany 2013; Volume Volume 8213 ,82-93.
[28] Höner zu Siederdissen, C.; Prohaska, S.J.; Stadler, P.F.; Dynamic Programming for Set Data Types; Brazilian Sympositum on Bioinformatics (BSB 2014): Heidelberg, Germany 2014; Volume Volume 8826 ,57-64.
[29] Riechert, M.; Höner zu Siederdissen, C.; Stadler, P.F.; Algebraic Dynamic Programming for Multiple Context-Free Languages; Theor. Comput. Sci.: 2016; Volume 639 ,91-109. · Zbl 1344.68112
[30] Chen, W.; New algorithm for ordered tree-to-tree correction problem; J. Algorithms: 2001; Volume 40 ,135-158. · Zbl 0980.68142
[31] Schwarz, S.; Pawlik, M.; Augsten, N.; A New Perspective on the Tree Edit Distance; Proceedings of the International Conference on Similarity Search and Applications: ; ,156-170.
[32] Tai, K.C.; The tree-to-tree correction problem; J. ACM (JACM): 1979; Volume 26 ,422-433. · Zbl 0409.68040
[33] Nussinov, R.; Jacobson, A.B.; Fast algorithm for predicting the secondary structure of single-stranded RNA; Proc. Natl. Acad. Sci. USA: 1980; Volume 77 ,6309-6313.
[34] Sankoff, D.; Minimal mutation trees of sequences; SIAM J. Appl. Math.: 1975; Volume 28 ,35-42. · Zbl 0315.05101
[35] Fitch, W.M.; Towards defining the course of evolution: Minimum change for a specific tree topology; Syst. Biol.: 1971; Volume 20 ,406-416.
[36] Hartigan, J.A.; Minimum mutation fits to a given tree; Biometrics: 1973; Volume 29 ,53-65.
[37] Maddison, W.P.; Testing Character Correlation using Pairwise Comparisons on a Phylogeny; J. Theor. Biol.: 2000; Volume 202 ,195-204.
[38] Arnold, C.; Nunn, C.L.; Phylogenetic Targeting of Research Effort in Evolutionary Biology; Am. Nat.: 2010; Volume 176 ,601-612.
[39] Arnold, C.; Stadler, P.F.; Polynomial algorithms for the Maximal Pairing Problem: Efficient phylogenetic targeting on arbitrary trees; Algorithms Mol. Biol.: 2010; Volume 5 ,25.
[40] Selkow, S.M.; The tree-to-tree editing problem; Inf. Process. Lett.: 1977; Volume 6 ,184-186. · Zbl 0391.68022
[41] Jiang, T.; Wang, L.; Zhang, K.; Alignment of trees - an alternative to tree edit; Theor. Comput. Sci.: 1995; Volume 143 ,137-148. · Zbl 0873.68150
[42] Schirmer, S.; Comparing Forests; Ph.D. Thesis: Bielefeld, Germany 2011; .
[43] Schirmer, S.; Giegerich, R.; Forest alignment with affine gaps and anchors; Combinatorial Pattern Matching: Berlin/Heidelberg, Germany 2011; ,104-117. · Zbl 1339.68212
[44] Höchsmann, M.; The Tree Alignment Model: Algorithms, Implementations and Applications for the Analysis of RNA Secondary Structures; Ph.D. Thesis: Bielefeld, Germany 2005; .
[45] Gotoh, O.; An improved algorithm for matching biological sequences; J. Mol. Biol.: 1982; Volume 162 ,705-708.
[46] Reese, J.T.; Pearson, W.R.; Empirical determination of effective gap penalties for sequence comparison; Bioinformatics: 2002; Volume 18 ,1500-1507.
[47] Pawlik, M.; Augsten, N.; Tree edit distance: Robust and memory-efficient; Inf. Syst.: 2016; Volume 56 ,157-173.
[48] Bringmann, K.; Gawrychowski, P.; Mozes, S.; Weimann, O.; Tree Edit Distance Cannot be Computed in Strongly Subcubic Time (unless APSP can); Proceedings of the SODA 2018: ; . · Zbl 1403.68368
[49] Bille, P.; A survey on tree edit distance and related problems; Theor. Comput. Sci.: 2005; Volume 337 ,217-239. · Zbl 1078.68152
[50] Dulucq, S.; Tichit, L.; RNA secondary structure comparison: Exact analysis of the Zhang-Shasha tree edit algorithm; Theor. Comput. Sci.: 2003; Volume 306 ,471-484. · Zbl 1060.68027
[51] Kan, T.; Higuchi, S.; Hirata, K.; Segmental mapping and distance for rooted labeled ordered trees; Fundam. Inform.: 2014; Volume 132 ,461-483. · Zbl 1302.05126
[52] Kuboyama, T.; Matching and Learning in Trees; Ph.D. Thesis: Tokyo, Japan 2007; .
[53] Keller, G.; Chakravarty, M.M.; Leshchinskiy, R.; Peyton Jones, S.; Lippmeier, B.; Regular, Shape-polymorphic, Parallel Arrays in Haskell; Proceedings of the 15th ACM SIGPLAN International Conference on Functional Programming, ICFP ’10: ; ,261-272. · Zbl 1323.68127
[54] Coutts, D.; Leshchinskiy, R.; Stewart, D.; Stream Fusion: From Lists to Streams to Nothing at All; Proceedings of the 12th ACM SIGPLAN International Conference on Functional Programming, ICFP ’07: ; ,315-326.
[55] Peyton Jones, S.; Call-pattern Specialisation for Haskell Programs; Proceedings of the 12th ACM SIGPLAN International Conference on Functional Programming, ICFP ’07: ; ,327-337. · Zbl 1291.68130
[56] Mainland, G.; Why It’s Nice to be Quoted: Quasiquoting for Haskell; Proceedings of the ACM SIGPLAN Workshop on Haskell Workshop: ; ,73-82.
This reference list is based on information provided by the publisher or from digital mathematics libraries. Its items are heuristically matched to zbMATH identifiers and may contain data conversion errors. It attempts to reflect the references listed in the original paper as accurately as possible without claiming the completeness or perfect precision of the matching.