
Stack-like and queue-like dynamics in recurrent neural networks. (English) Zbl 1084.68097
Summary: What dynamics do Simple Recurrent Networks (SRNs) develop to represent stack-like and queue-like memories? SRNs have been widely used as models in cognitive science. However, they are also interesting in their own right as non-symbolic computing devices, from the viewpoints of analogue computing and dynamical systems theory. In this paper, SRNs are trained on two prototypical formal languages with recursive structure, one requiring a stack-like memory for processing and the other a queue-like memory. The evolved dynamics are analysed and interpreted in terms of simple dynamical systems, and the differing ease with which SRNs acquire them is related to the properties of these simple dynamical systems. Within the dynamical systems framework, it is concluded that the stack-like language is simpler than the queue-like language, without appealing to arguments from symbolic computation theory.
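
The summary leaves the concrete training languages and network details implicit. As an illustration only (not the paper's code), the Python/NumPy sketch below contrasts a stack-like mirror language (strings of the form w w^R, whose nested dependencies a stack can match) with a queue-like copy language (strings of the form w w, whose dependencies must be recalled first-in-first-out), and steps an untrained Elman-style SRN over one such string. All names, sizes, and initialisations here are assumptions made for the example.

    import numpy as np

    def mirror_string(w):
        # Stack-like structure: the second half reverses the first (w w^R),
        # so symbol dependencies nest and a stack suffices.
        return w + w[::-1]

    def copy_string(w):
        # Queue-like structure: the second half repeats the first in order
        # (w w), so symbols must be recalled first-in-first-out.
        return w + w

    class ElmanSRN:
        # Minimal Elman network: the hidden state is fed back as context
        # input on the next time step.
        def __init__(self, n_in, n_hidden, n_out, seed=0):
            rng = np.random.default_rng(seed)
            self.W_in = rng.normal(0.0, 0.5, (n_hidden, n_in))
            self.W_rec = rng.normal(0.0, 0.5, (n_hidden, n_hidden))
            self.W_out = rng.normal(0.0, 0.5, (n_out, n_hidden))
            self.h = np.zeros(n_hidden)

        def step(self, x):
            # h_t = sigmoid(W_in x_t + W_rec h_{t-1}); the output is a
            # softmax distribution over the predicted next symbol.
            self.h = 1.0 / (1.0 + np.exp(-(self.W_in @ x + self.W_rec @ self.h)))
            z = self.W_out @ self.h
            e = np.exp(z - z.max())
            return e / e.sum()

    def one_hot(c, alphabet="ab"):
        v = np.zeros(len(alphabet))
        v[alphabet.index(c)] = 1.0
        return v

    net = ElmanSRN(n_in=2, n_hidden=4, n_out=2)
    for c in mirror_string("aab"):      # yields "aabbaa"
        p = net.step(one_hot(c))        # next-symbol prediction at each step
    print(p)

Training such a network on next-symbol prediction (for instance with backpropagation through time) is what would shape the hidden-state dynamics that the paper analyses; the untrained forward pass above only fixes the architecture and input encoding.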

MSC:
68T05 Learning and adaptive systems in artificial intelligence
68Q45 Formal languages and automata
Software:
LSTM