
Symbolic state transducers and recurrent neural preference machines for text mining. (English) Zbl 1026.68106

Summary: This paper focuses on symbolic transducers and recurrent neural preference machines to support the task of mining and classifying textual information. The symbolic transducers, which encode knowledge, and the neural preference machines, which learn it, can be seen as independent agents, each tackling the same task in a different manner. Systems combining such machines can potentially be more robust, since the strengths and weaknesses of the different approaches yield complementary knowledge: each machine models the same information content via a different paradigm. An experimental analysis of the performance of these symbolic transducers and neural preference machines is presented. It is demonstrated that each approach can be used successfully for information mining and news classification on the Reuters news corpus. Symbolic transducer machines can be used to encode relevant knowledge manually and quickly in a data-driven approach with no training, while trained neural preference machines can give better performance given additional training.

MSC:

68T05 Learning and adaptive systems in artificial intelligence
68T50 Natural language processing
