×

Learning local transductions is hard. (English) Zbl 1067.68151

Summary: Local deterministic string-to-string transductions arise in natural language processing tasks such as letter-to-sound translation or pronunciation modeling. This class of transductions is a simple generalization of morphisms of free monoids; learning local transductions is essentially the same as inference of certain monoid morphisms. However, learning even a highly restricted class of morphisms, the so-called fine morphisms, leads to intractable problems: deciding whether a hypothesized fine morphism is consistent with observations is an NP-complete problem; and maximizing classification accuracy of the even smaller class of alphabetic substitution morphisms is APX-hard. These theoretical results provide some justification for using the kinds of heuristics that are commonly used for this learning task.

MSC:

68T50 Natural language processing
68T05 Learning and adaptive systems in artificial intelligence
68Q17 Computational difficulty of problems (lower bounds, completeness, difficulty of approximation, etc.)
PDFBibTeX XMLCite
Full Text: DOI

References:

[1] Aho, A.V., Hopcroft, J.E., and Ullman, J.D., 1983, Data Structures and Algorithms, Addison-Wesley Series in Computer Science and Information Processing, Reading, MA: Addison-Wesley. · Zbl 0487.68005
[2] Angluin, D., 1982, ”Inference of reversible languages,” Journal of the ACM 29(3), 741–765. · Zbl 0485.68066 · doi:10.1145/322326.322334
[3] Ausiello, G.,Crescenzi, P., Gambosi, G., Kann, V., Marchetti-Spaccamela, A. and Protasi, M., 1999, Complexity and Approximation: Combinatorial Optimization Problems and Their Approximability Properties, Berlin, Germany: Springer. · Zbl 0937.68002
[4] Bakiri, G. and Dietterich, T.G., 2001, ”Constructing high-accuracy letter-to-phoneme rules with machine learning,” pp. 27–44 in Data-Driven Techniques in Speech Synthesis, R.I. Damper, ed., No. 9in Telecommunications Technology and Applications, Boston, MA: Kluwer.
[5] Damper, R.I., Marchand, Y., Adamson, M.J., and Gustafson, K., 1999, ”Evaluating the pronunciation component of text-to-speech systems for English: A performance comparison of different approaches,” Computer Speech and Language 13(2), 155–176. · doi:10.1006/csla.1998.0117
[6] Eilenberg, S., 1974, Automata, Languages, and Machines, Vol. A., New York, NY: Academic Press. · Zbl 0317.94045
[7] Fisher, W.M., 1999, ”A statistical text-to-phone function using Ngrams and rules,” pp. 649–652 in International Conference on Acoustics, Speech, and Signal Processing, Phoenix, AZ.
[8] García, P. and Vidal, E., 1990, ”Inference of k-testable languages in the strict sense and application to syntactic pattern recognition,” IEEE Transactions on Pattern Analysis and Machine Intelligence 12(9), 920–925. · Zbl 05112292 · doi:10.1109/34.57687
[9] Garey, M.R. and Johnson, D.S., 1979, Computers and Intractability: A Guide to the Theory of NP-Completeness, San Francisco, CA: W.H. Freeman. · Zbl 0411.68039
[10] Gildea, D. and Jurafsky, D., 1996, ”Learning bias and phonological-rule induction,” Computational Linguistics 22(4), 497–530.
[11] Gold, E.M., 1967, ”Language identification in the limit,” Information and Control 10(5), 447–474. · Zbl 0259.68032 · doi:10.1016/S0019-9958(67)91165-5
[12] Hyafil, L. and Rivest, R.L., 1976, ”Constructing optimal binary decision trees is NP-complete,” Information Processing Letters 5(1), 15–17. · Zbl 0333.68029 · doi:10.1016/0020-0190(76)90095-8
[13] International Phonetic Association: 1999, Handbook of the International Phonetic Association: A Guide to the Use of the International Phonetic Alphabet, Cambridge, U.K.: Cambridge University Press.
[14] Jansche, M., 2001, ”Re-engineering letter-to-sound rules,” pp. 111–117 in Proceedings of the Second Meeting of the North American Chapter of the Association for Computational Linguistics. Pittsburgh, PA.
[15] Jansche, M., 2003, ”Inference of string mappings for language technology,” Ph.D. Thesis, The Ohio State University, Columbus.
[16] Kearns, M.J. and Vazirani, U.V., 1994, An Introduction to Computational Learning Theory. Cambridge, MA: MIT Press, Second printing, 1997.
[17] Kearns, M.J., Schapire, R.E., and Sellie, L.M., 1992, ”Toward efficient agnostic learning,” pp. 341–352 in Proceedings of the 5th Annual Workshop on Computational Learning Theory, Philadelphia. · Zbl 0938.68797
[18] Kruskal, J.B., 1983, ”An overview of sequence comparison,” pp. 1–44 in Time Warps, String Edits, and Macromolecules: The Theory and Practice of Sequence Comparison, D. Sankoff and J. Kruskal, eds., Reading, MA: Addison-Wesley Reissued by CSLI Publications, Stanford, CA,1999. · Zbl 0512.68048
[19] Lucassen, J.M. and Mercer, R.L., 1984, ”An information theoretic approach to the automatic determination of phonemic baseforms,” pp. 42.5.1–42.5.4 in International Conference on Acoustics, Speech, and Signal Processing.
[20] McNaughton, R. and Papert, S., 1972, Counter-Free Automata, Cambridge, MA: MIT Press.
[21] Minka, T.P., 2000, ”Empirical risk minimization is an incomplete inductive principle,” http://www.stat.cmu.edu/\(\sim\)minka/papers/erm.html
[22] Mohri, M., 1997, ”Finite-state transducers in language and speech processing,” Computational Linguistics 23(2), 269–311.
[23] Oncina, J., Garcia, P. and Vidal, E., 1993, ”Learning subsequential transducers for pattern recognition interpretation tasks,” IEEE Transactions on Pattern Analysis and Machine Intelligence 15(5), 448–458. · Zbl 05111414 · doi:10.1109/34.211465
[24] Papadimitriou, C.H., 1994, Computational Complexity, Reading, MA: Addison-Wesley. · Zbl 0833.68049
[25] Papadimitriou, C.H. and Steiglitz, K., 1998, Combinatorial Optimization: Algorithms and Complexity, Mineola, NY: Dover Publications. Originally published by Prentice Hall, Englewood Cliffs, NJ, 1982. · Zbl 0503.90060
[26] Pitt, L., 1989, ”Inductive inference, DFAs, and computational complexity,” pp. 18–44 in Analogical and Inductive Inference, International Workshop AII’ 89, Reinhardsbrunn Castle, GDR, October 1–6, 1989, Proceedings, Vol. 397 of Lecture Notes in Computer Science, K.P. Jantke, ed., Berlin, Germany: Springer.
[27] Pitt, L. and Warmuth, M.K., 1993, ”The minimum consistent DFA problem cannot be approximated within any polynomial,” Journal of the ACM 40(1), 95–142. · Zbl 0774.68084 · doi:10.1145/138027.138042
[28] Roche, E. and Schabes, Y., eds., 1997, Finite-State Language Processing, Language, Speech and Communication, Cambridge, MA: MIT Press.
[29] Sejnowski, T.J. and Rosenberg, C.R., 1987, ”Parallel networks that learn to pronounce English text,” Complex Systems 1(1), 145–168. · Zbl 0655.68107
[30] Sproat, R., Möbius, B., Maeda, K. and Tzoukermann, E., 1998, ”Multilingual text analysis,” Chapt. 3, pp. 31–87 in Multilingual Text-to-Speech Synthesis: The Bell Labs Approach, R. Sproat, ed., Dordrecht, The Netherlands: Kluwer Academic Publishers.
[31] Valiant, L.G., 1984, ”A theory of the learnable,” Communications of the ACM 27(11), 1134–1142. · Zbl 0587.68077 · doi:10.1145/1968.1972
[32] van den Bosch, A.P.J., 1997, ”Learning to pronounce written words: A study in inductive language learning,” Ph.D. Thesis, Universiteit Maastricht, Maastricht, The Netherlands.
[33] Wagner, R.A. and Fischer, M.J., 1974, ”The string-to-string correction problem,” Journal of the ACM 21(1), 168–173. · Zbl 0278.68032 · doi:10.1145/321796.321811
[34] Weide, R.L., 1998, ”The Carnegie Mellon pronouncing dictionary version 0.6,” {electronic document}, School of Computer Science, Carnegie Mellon University, Pittsburgh, PA. ftp://ftp.cs.cmu.edu/project/fgdata/dict/.
This reference list is based on information provided by the publisher or from digital mathematics libraries. Its items are heuristically matched to zbMATH identifiers and may contain data conversion errors. In some cases that data have been complemented/enhanced by data from zbMATH Open. This attempts to reflect the references listed in the original paper as accurately as possible without claiming completeness or a perfect matching.