zbMATH — the first resource for mathematics

Analyzing ambiguity of context-free grammars. (English) Zbl 1189.68068
Summary: It has been known since 1962 that the ambiguity problem for context-free grammars is undecidable. Ambiguity in context-free grammars is a recurring problem in language design and parser generation, as well as in applications where grammars are used as models of real-world physical structures.
We observe that there is a simple linguistic characterization of the grammar ambiguity problem, and we show how to exploit this by presenting an ambiguity analysis framework based on conservative language approximations. As a concrete example, we propose a technique based on local regular approximations and grammar unfoldings. We evaluate the analysis using grammars that occur in RNA analysis in bioinformatics, and we demonstrate that it is sufficiently precise and efficient to be practically useful.

68Q42 Grammars and rewriting systems
Full Text: DOI
[1] Aho, A. V.; Ullman, J. D.: The theory of parsing, translation and compiling, vol. 1, (1972) · Zbl 0264.68032
[2] Basten, H.: The usability of ambiguity detection methods for context-free grammars, Electronic notes in theoretical computer science 238, No. 5, 35-46 (2009)
[3] Berstel, J.; Boasson, L.: Balanced grammars and their languages, Formal and natural computing, 3-25 (2002) · Zbl 1060.68051 · link.springer.de
[4] Brabrand, C.; Møller, A.; Schwartzbach, M. I.: Dual syntax for XML languages, Information systems 33, No. 4 (2008) · Zbl 1159.68407
[5] Brabrand, C.; Schwartzbach, M. I.: The metafront system: safe and extensible parsing and transformation, Science of computer programming 68, No. 1, 2-20 (2007) · Zbl 1188.68164 · doi:10.1016/j.scico.2005.06.007
[6] Cantor, D. G.: On the ambiguity problem of backus systems, Journal of the ACM 9, No. 4, 477-479 (1962) · Zbl 0114.33003 · doi:10.1145/321138.321145
[7] B.S.N. Cheung, R.C. Uzgalis, Ambiguity in context-free grammars, in: Proc. ACM Symposium on Applied Computing, SAC ’95, 1995
[8] Chomsky, N.; Schützenberger, M. -P.: The algebraic theory of context-free languages, , 118-161 (1963) · Zbl 0148.00804
[9] Christensen, A. S.; Møller, A.; Schwartzbach, M. I.: Precise analysis of string expressions, Lncs 2694, 1-18 (2003) · Zbl 1067.68541 · link.springer.de
[10] Ii, K. Čulik; Cohen, R. S.: LR-regular grammars - an extension of \(LR(k)\) grammars, Journal of computer and system sciences 7, No. 1, 66-96 (1973) · Zbl 0253.68014 · doi:10.1016/S0022-0000(73)80050-9
[11] Dowell, R. D.; Eddy, S. R.: Evaluation of several lightweight stochastic context-free grammars for RNA secondary structure prediction, BMC bioinformatics 5, No. 71 (2004)
[12] Durbin, R.; Eddy, S. R.; Krogh, A.; Mitchison, G.: Biological sequence analysis, (1998) · Zbl 0929.92010
[13] Earley, J.: An efficient context-free parsing algorithm, Communications of the ACM 13, No. 2, 94-102 (1970) · Zbl 0185.43401 · doi:10.1145/362007.362035
[14] Floyd, R. W.: On ambiguity in phrase structure languages, Communications of the ACM 5, No. 10, 526 (1962) · Zbl 0227.68037
[15] Gardner, P. P.; Daub, J.; Tate, J. G.; Nawrocki, E. P.; Kolbe, D. L.; Lindgreen, S.; Wilkinson, A. C.; Finn, R. D.; Griffiths-Jones, S.; Eddy, S. R.; Bateman, A.: Rfam: updates to the RNA families database, Nucleic acids res 37, 136-140 (2009)
[16] Giegerich, R.: Explaining and controlling ambiguity in dynamic programming, Lncs 1848, 46-59 (2000) · Zbl 0964.90050
[17] R. Giegerich, C. Höner zu Siederdissen, Semantics and ambiguity of stochastic RNA family models, 2009 (submitted manuscript)
[18] Giegerich, R.; Meyer, C.; Steffen, P.: A discipline of dynamic programming over sequence data, Science of computer programming 51, No. 3, 215-263 (2004) · Zbl 1067.90157 · doi:10.1016/j.scico.2003.12.005
[19] Gorn, S.: Detection of generative ambiguities in context-free mechanical languages, Journal of the ACM 10, No. 2, 196-208 (1963) · Zbl 0149.12504 · doi:10.1145/321160.321168
[20] Hopcroft, J. E.; Ullman, J. D.: Introduction to automata theory, languages and computation, (1979) · Zbl 0426.68001
[21] Knuth, D. E.: On the translation of languages from left to right, Information and control 8, 607-639 (1965) · Zbl 0231.68027
[22] Knuth, D. E.: A characterization of parenthesis languages, Information and control 11, 269-289 (1967) · Zbl 0196.01703 · doi:10.1016/S0019-9958(67)90564-5
[23] Kuich, W.: Systems of pushdown acceptors and context-free grammars, Elektronische informationsverarbeitung und kybernetik 6, No. 2, 95-114 (1970) · Zbl 0193.32701
[24] S. McPeak, G.C. Necula, 2004. Elkhound: A fast, practical GLR parser generator, in: Proc. 13th International Conference on Compiler Construction, CC ’04 · Zbl 1125.68354
[25] Mohri, M.; Nederhof, M. -J.: Regular approximation of context-free grammars through transformation, Robustness in language and speech technology (2001) · Zbl 0989.68124
[26] A. Møller, dk.brics.automaton–finite-state automata and regular expressions for Java, 2008. http://www.brics.dk/automaton/
[27] Nawrocki, E. P.; Kolbe, D. L.; Eddy, S. R.: Infernal 1.0: inference of RNA alignments, Bioinformatics 25, No. 10, 1335-1337 (2009)
[28] Nederhof, M. -J.: Practical experiments with regular approximation of context-free languages, Computational linguistics 26, No. 1, 17-44 (2000)
[29] Reeder, J.; Reeder, J.; Giegerich, R.: Locomotif: from graphical motif description to RNA motif search, Bioinformatics 23, No. 13, 392-400 (2007)
[30] Reeder, J.; Steffen, P.; Giegerich, R.: Effective ambiguity checking in biosequence analysis, BMC bioinformatics 6, No. 153 (2005)
[31] S. Schmitz, Conservative ambiguity detection in context-free grammars. In: Proc. 34th International Colloquium on Automata, Languages and Programming, ICALP ’07, 2007 · Zbl 1171.68518
[32] Schmitz, S.: An experimental ambiguity detection tool, Science of computer programming (2009) · Zbl 1187.68282
[33] E. Scott, A. Johnstone, S.S. Hussein, Tomita style generalised parsers, Tech. Rep. CSD-TR-00-A, Royal Holloway, University of London, 2000
[34] M. van den Brand, J. Scheerder, J.J. Vinju, E. Visser, Disambiguation filters for scannerless generalized LR parsers, in: Proc. 11th International Conference on Compiler Construction, CC ’02, 2002 · Zbl 1051.68874
[35] E. Visser, Syntax definition for language prototyping, Ph.D. thesis, University of Amsterdam, 1997 · Zbl 0900.68290
[36] Voss, B.; Giegerich, R.; Rehmsmeier, M.: Complete probabilistic analysis of RNA shapes, BMC biology 4, No. 5 (2006)
This reference list is based on information provided by the publisher or from digital mathematics libraries. Its items are heuristically matched to zbMATH identifiers and may contain data conversion errors. It attempts to reflect the references listed in the original paper as accurately as possible without claiming the completeness or perfect precision of the matching.