A review of recurrent neural networks: LSTM cells and network architectures. (English) Zbl 07164882
Summary: Recurrent neural networks (RNNs) have been widely adopted in research areas concerned with sequential data, such as text, audio, and video. However, RNNs built from simple sigmoid (sigma) or tanh cells are unable to learn the relevant information in the input when the gap between relevant inputs is large. By introducing gate functions into the cell structure, the long short-term memory (LSTM) handles the problem of long-term dependencies well. Since its introduction, almost all of the exciting results based on RNNs have been achieved by the LSTM, and it has become the focus of deep learning. We review the LSTM cell and its variants to explore the learning capacity of the LSTM cell, and we divide LSTM networks into two broad categories: LSTM-dominated networks and integrated LSTM networks. Their various applications are discussed, and future research directions for LSTM networks are presented.
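
To make the gating mechanism referred to above concrete, the following is a minimal NumPy sketch of the standard LSTM cell with input, forget, and output gates, following the formulation of Hochreiter and Schmidhuber (1997) with the forget gate of Gers, Schmidhuber, and Cummins (2000). The function and variable names are illustrative, not taken from the paper under review.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def lstm_step(x_t, h_prev, c_prev, W, U, b):
    """One step of a standard LSTM cell (illustrative sketch).

    W, U, b stack the parameters for the input gate (i), forget gate (f),
    output gate (o), and candidate cell update (g), in that order.
    """
    n = h_prev.shape[0]
    z = W @ x_t + U @ h_prev + b       # pre-activations for all four blocks
    i = sigmoid(z[0 * n:1 * n])        # input gate: admit new information
    f = sigmoid(z[1 * n:2 * n])        # forget gate: decay old cell state
    o = sigmoid(z[2 * n:3 * n])        # output gate: expose cell state
    g = np.tanh(z[3 * n:4 * n])        # candidate cell update
    c_t = f * c_prev + i * g           # additive cell-state recurrence
    h_t = o * np.tanh(c_t)             # hidden state (cell output)
    return h_t, c_t

# Toy usage with assumed dimensions: 3 inputs, 4 hidden units.
rng = np.random.default_rng(0)
n_in, n_hid = 3, 4
W = 0.1 * rng.standard_normal((4 * n_hid, n_in))
U = 0.1 * rng.standard_normal((4 * n_hid, n_hid))
b = np.zeros(4 * n_hid)
h, c = np.zeros(n_hid), np.zeros(n_hid)
for x_t in rng.standard_normal((5, n_in)):  # unroll a length-5 sequence
    h, c = lstm_step(x_t, h, c, W, U, b)
print(h)
```

The additive update c_t = f * c_prev + i * g is what lets error signals flow across long input gaps: unlike a plain sigma- or tanh-cell RNN, the cell state is not squashed through a nonlinearity at every step, which is the long-term-dependency property the summary describes.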

MSC:
68-XX Computer science
References:
[1] Achanta, R., Shaji, A., Smith, K., Lucchi, A., Fua, P., & Süsstrunk, S. (2010). SLIC superpixels. https://www.researchgate.net/publication/44234783_SLIC_superpixels
[2] Altché, F., & de La Fortelle, A. (2017). An LSTM network for highway trajectory prediction. In Proceedings of the IEEE 20th International Conference on Intelligent Transportation Systems. Piscataway, NJ: IEEE.
[3] Bengio, Y. (2009). Learning deep architectures for AI. Foundations and Trends in Machine Learning, 2(1), 1-127. · Zbl 1192.68503
[4] Bengio, Y., Simard, P., & Frasconi, P. (1994). Learning long-term dependencies with gradient descent is difficult. IEEE Transactions on Neural Networks, 5(2), 157-166.
[5] Brahma, S. (2018). Suffix bidirectional long short-term memory. http://www.researchgate.net/publication/325262895_Suffix_Bidirectional_Long_Short-Term_Memory
[6] Britz, D., Goldie, A., Luong, M. T., & Le, Q. (2017). Massive exploration of neural machine translation architectures. arXiv:1703.03906.
[7] Brown, B., Yu, X., & Garverick, S. (2004). Mixed-mode analog VLSI continuous-time recurrent neural network. In Circuits, Signals, and Systems: IASTED International Conference Proceedings.
[8] Carrio, A., Sampedro, C., Rodriguez-Ramos, A., & Campoy, P. (2017). A review of deep learning methods and applications for unmanned aerial vehicles. Journal of Sensors, 2, 1-13.
[9] Chen, T. B., & Soo, V. W. (1996). A comparative study of recurrent neural network architectures on learning temporal sequences. In Proceedings of the IEEE International Conference on Neural Networks. Piscataway, NJ: IEEE.
[10] Chen, X., Fang, H., Lin, T. Y., Vedantam, R., Gupta, S., Dollar, P., & Zitnick, C. L. (2015). Microsoft COCO captions: Data collection and evaluation server. arXiv:1504.00325v2.
[11] Chen, X., Mottaghi, R., Liu, X., Fidler, S., Urtasun, R., & Yuille, A. (2014). Detect what you can: Detecting and representing objects using holistic models and body parts. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. Piscataway, NJ: IEEE.
[12] Cheng, M., Xu, Q., Lv, J., Liu, W., Li, Q., & Wang, J. (2016). MS-LSTM: A multi-scale LSTM model for BGP anomaly detection. In Proceedings of the IEEE 24th International Conference on Network Protocols. Piscataway, NJ: IEEE.
[13] Cho, K., Van Merrienboer, B., Gulcehre, C., Bahdanau, D., Bougares, F., Schwenk, H., & Bengio, Y. (2014). Learning phrase representations using RNN encoder-decoder for statistical machine translation. arXiv:1406.1078v3.
[14] Chung, J., Gulcehre, C., Cho, K. H., & Bengio, Y. (2014). Empirical evaluation of gated recurrent neural networks on sequence modeling. arXiv:1412.3555v1.
[15] Chung, J., Gulcehre, C., Cho, K., & Bengio, Y. (2015). Gated feedback recurrent neural networks. arXiv:1502.02367v1.
[16] Deng, L. (2013). Three classes of deep learning architectures and their applications: A tutorial survey. APSIPA Transactions on Signal and Information Processing. Cambridge: Cambridge University Press.
[17] Dey, R., & Salem, F. M. (2017). Gate-variants of gated recurrent unit (GRU) neural networks. In Proceedings of the IEEE International Midwest Symposium on Circuits and Systems. Piscataway, NJ: IEEE.
[18] Du, X., Zhang, H., Nguyen, H. V., & Han, Z. (2017). Stacked LSTM deep learning model for traffic prediction in vehicle-to-vehicle communication. In Proceedings of the IEEE Vehicular Technology Conference. Piscataway, NJ: IEEE.
[19] Elman, J. L. (1990). Finding structure in time. Cognitive Science, 14(2), 179-211.
[20] Fernández, S., Graves, A., & Schmidhuber, J. (2007a). Sequence labelling in structured domains with hierarchical recurrent neural networks. In Proceedings of the 20th International Joint Conference on Artificial Intelligence. San Mateo, CA: Morgan Kaufmann.
[21] Fernández, S., Graves, A., & Schmidhuber, J. (2007b). An application of recurrent neural networks to discriminative keyword spotting. In Proceedings of the International Conference on Artificial Neural Networks (pp. 220-229). Berlin: Springer.
[22] Francesconi, E., Frasconi, P., Gori, M., Marinai, S., Sheng, J. Q., Soda, G., & Sperduti, A. (1997). Logo recognition by recursive neural networks. In Proceedings of the International Workshop on Graphics Recognition (pp. 104-117). Berlin: Springer.
[23] Frasconi, P., Gori, M., & Sperduti, A. (1998). A general framework for adaptive processing of data structures. IEEE Transactions on Neural Networks, 9(5), 768-786.
[24] Fukushima, K. (1980). Neocognitron: A self-organizing neural network model for a mechanism of pattern recognition unaffected by shift in position. Biological Cybernetics, 36(4), 193-202. · Zbl 0419.92009
[25] Gallagher, J. C., Boddhu, S. K., & Vigraham, S. (2005). A reconfigurable continuous time recurrent neural network for evolvable hardware applications. In Proceedings of the 2005 IEEE Congress on Evolutionary Computation. Piscataway, NJ: IEEE.
[26] Gers, F. (2001). Long short-term memory in recurrent neural networks. PhD diss., École Polytechnique Fédérale de Lausanne.
[27] Gers, F. A., & Schmidhuber, J. (2000). Recurrent nets that time and count. In Proceedings of the IEEE-INNS-ENNS International Joint Conference on Neural Networks. Piscataway, NJ: IEEE.
[28] Gers, F. A., & Schmidhuber, J. (2001). LSTM recurrent networks learn simple context-free and context-sensitive languages. IEEE Transactions on Neural Networks, 12(6), 1333-1340.
[29] Gers, F. A., Schmidhuber, J., & Cummins, F. (2000). Learning to forget: Continual prediction with LSTM. Neural Computation, 12(10), 2451-2471.
[30] Gers, F. A., Schraudolph, N. N., & Schmidhuber, J. (2002). Learning precise timing with LSTM recurrent networks. Journal of Machine Learning Research, 3, 115-143. · Zbl 1088.68717
[31] Goel, K., Vohra, R., & Sahoo, J. K. (2014). Polyphonic music generation by modeling temporal dependencies using a RNN-DBN. In Proceedings of the International Conference on Artificial Neural Networks. Berlin: Springer.
[32] Goller, C., & Kuchler, A. (1996). Learning task-dependent distributed representations by backpropagation through structure. In Proceedings of the IEEE International Conference on Neural Networks (Vol. 1, pp. 347-352). Piscataway, NJ: IEEE.
[33] Graves, A. (2012). Supervised sequence labelling with recurrent neural networks. Berlin: Springer. · Zbl 1235.68014
[34] Graves, A. (2014). Generating sequences with recurrent neural networks. arXiv:1308.0850v5.
[35] Graves, A., Fernández, S., & Schmidhuber, J. (2007). Multi-dimensional recurrent neural networks. In Proceedings of the International Conference on Artificial Neural Networks. Berlin: Springer.
[36] Graves, A., Wayne, G., & Danihelka, I. (2014). Neural Turing machines. arXiv:1410.5401v2.
[37] Graves, A., Wayne, G., Reynolds, M., Harley, T., Danihelka, I., Grabska-Barwińska, A., & Agapiou, J. (2016). Hybrid computing using a neural network with dynamic external memory. Nature, 538(7626), 471-476.
[38] Graves, A., & Schmidhuber, J. (2005). Framewise phoneme classification with bidirectional LSTM and other neural network architectures. Neural Networks, 18(5), 602-610.
[39] Greff, K., Srivastava, R. K., Koutník, J., Steunebrink, B. R., & Schmidhuber, J. (2016). LSTM: A search space odyssey. IEEE Transactions on Neural Networks and Learning Systems, 28(10), 2222-2232.
[40] Guo, Y., Liu, Y., Georgiou, T., & Lew, M. S. (2017). A review of semantic segmentation using deep neural networks. International Journal of Multimedia Information Retrieval, 2, 1-7.
[41] Han, X., Wu, Z., Jiang, Y. G., & Davis, L. S. (2017). Learning fashion compatibility with bidirectional LSTMs. In Proceedings of the 2017 ACM on Multimedia Conference. New York: ACM.
[42] He, T., & Droppo, J. (2016). Exploiting LSTM structure in deep neural networks for speech recognition. In Proceedings of the IEEE International Conference on Acoustics, Speech and Signal Processing (pp. 5445-5449). Piscataway, NJ: IEEE.
[43] Heck, J. C., & Salem, F. M. (2017). Simplified minimal gated unit variations for recurrent neural networks. In Proceedings of the IEEE International Midwest Symposium on Circuits and Systems (pp. 1593-1596). Piscataway, NJ: IEEE.
[44] Hochreiter, S. (1991). Untersuchungen zu dynamischen neuronalen Netzen. Diploma thesis, Technische Universität München.
[45] Hochreiter, S., & Schmidhuber, J. (1997). Long short-term memory. Neural Computation, 9(8), 1735-1780.
[46] Hsu, W. N., Zhang, Y., Lee, A., & Glass, J. (2016). Exploiting depth and highway connections in convolutional recurrent deep neural networks for speech recognition. In Proceedings of the INTERSPEECH (pp. 395-399). Red Hook, NY: Curran.
[47] Irie, K., Tüske, Z., Alkhouli, T., Schlüter, R., & Ney, H. (2016). LSTM, GRU, highway and a bit of attention: An empirical overview for language modeling in speech recognition. In Proceedings of the INTERSPEECH (pp. 3519-3523). Red Hook, NY: Curran.
[48] Ivakhnenko, A. G. (1971). Polynomial theory of complex systems. IEEE Transactions on Systems, Man and Cybernetics, SMC-1(4), 364-378.
[49] Ivakhnenko, A. G., & Lapa, V. G. (1965). Cybernetic predicting devices. Sacramento, CA: CCM Information Corporation.
[50] Jing, L., Gulcehre, C., Peurifoy, J., Shen, Y., Tegmark, M., Soljačić, M., & Bengio, Y. (2017). Gated orthogonal recurrent units: On learning to forget. arXiv:1706.02761.
[51] Jordan, M. (1986). Attractor dynamics and parallelism in a connectionist sequential machine. In Proceedings of the Annual Conference of the Cognitive Science Society (pp. 531-546). Piscataway, NJ: IEEE.
[52] Jozefowicz, R., Zaremba, W., & Sutskever, I. (2015). An empirical exploration of recurrent network architectures. In Proceedings of the International Conference on Machine Learning (pp. 2342-2350). New York: ACM.
[53] Kalchbrenner, N., Danihelka, I., & Graves, A. (2015). Grid long short-term memory. arXiv:1507.01526.
[54] Karpathy, A., Johnson, J., & Li, F. F. (2015). Visualizing and understanding recurrent networks. arXiv:1506.02078.
[55] Khan, S., & Yairi, T. (2018). A review on the application of deep learning in system health management. Mechanical Systems and Signal Processing, 107, 241-265.
[56] Kim, J., El-Khamy, M., & Lee, J. (2017). Residual LSTM: Design of a deep recurrent architecture for distant speech recognition. arXiv:1701.03360.
[57] Koutnik, J., Greff, K., Gomez, F., & Schmidhuber, J. (2014). A clockwork RNN. arXiv:1402.3511v1.
[58] Krause, B., Lu, L., Murray, I., & Renals, S. (2016). Multiplicative LSTM for sequence modelling. arXiv:1609.07959.
[59] Krumm, J., & Horvitz, E. (2015). Eyewitness: Identifying local events via space-time signals in Twitter feeds. In Proceedings of the 23rd SIGSPATIAL International Conference on Advances in Geographic Information Systems. New York: ACM.
[60] LeCun, Y., Boser, B., Denker, J. S., Henderson, D., Howard, R. E., Hubbard, W., & Jackel, L. D. (1989). Backpropagation applied to handwritten zip code recognition. Neural Computation, 1(4), 541-551.
[61] LeCun, Y., Bottou, L., Bengio, Y., & Haffner, P. (1998). Gradient-based learning applied to document recognition. Proceedings of the IEEE, 86(11), 2278-2324.
[62] Li, B., & Sainath, T. N. (2017). Reducing the computational complexity of two-dimensional LSTMs. In Proceedings of the INTERSPEECH (pp. 964-968). Red Hook, NY: Curran.
[63] Li, J., Luong, M. T., Jurafsky, D., & Hovy, E. (2015). When are tree structures necessary for deep learning of representations? arXiv:1503.00185v5.
[64] Li, J., Mohamed, A., Zweig, G., & Gong, Y. (2016a). Exploring multidimensional LSTMs for large vocabulary ASR. In Proceedings of the IEEE International Conference on Acoustics, Speech and Signal Processing (pp. 4940-4944). Piscataway, NJ: IEEE.
[65] Li, J., Mohamed, A., Zweig, G., & Gong, Y. (2016b). LSTM time and frequency recurrence for automatic speech recognition. In Proceedings of the Conference on Automatic Speech Recognition and Understanding (pp. 187-191). Piscataway, NJ: IEEE.
[66] Li, S., Li, W., Cook, C., Zhu, C., & Gao, Y. (2018). Independently recurrent neural network: Building a longer and deeper RNN. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (pp. 5457-5466). Piscataway, NJ: IEEE.
[67] Liang, X., Lin, L., Shen, X., Feng, J., Yan, S., & Xing, E. P. (2017). Interpretable structure-evolving LSTM. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (pp. 2171-2184). Piscataway, NJ: IEEE.
[68] Liang, X., Liu, S., Shen, X., Yang, J., Liu, L., Dong, J., & Yan, S. (2015). Deep human parsing with active template regression. IEEE Transactions on Pattern Analysis and Machine Intelligence, 37(12), 2402.
[69] Liang, X., Shen, X., Feng, J., Lin, L., & Yan, S. (2016). Semantic object parsing with graph LSTM. In Proceedings of the European Conference on Computer Vision (pp. 125-143). Berlin: Springer.
[70] Liang, X., Shen, X., Xiang, D., Feng, J., Lin, L., & Yan, S. (2016). Semantic object parsing with local-global long short-term memory. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (pp. 3185-3193). Piscataway, NJ: IEEE.
[71] Lin, T. Y., Maire, M., Belongie, S., Hays, J., Perona, P., Ramanan, D., & Zitnick, C. L. (2014). Microsoft COCO: Common objects in context. In Proceedings of the European Conference on Computer Vision (pp. 740-755). Berlin: Springer.
[72] Lipton, Z. C., Berkowitz, J., & Elkan, C. (2015). A critical review of recurrent neural networks for sequence learning. arXiv:1506.00019.
[73] Liu, P., Qiu, X., Chen, J., & Huang, X. (2016). Deep fusion LSTMs for text semantic matching. In Proceedings of the 54th Annual Meeting of the Association for Computational Linguistics (pp. 1034-1043). Stroudsburg, PA: ACL.
[74] Liu, P., Qiu, X., & Huang, X. (2016). Modelling interaction of sentence pair with coupled-LSTMs. arXiv:1605.05573.
[75] Liu, Q., Zhou, F., Hang, R., & Yuan, X. (2017). Bidirectional-convolutional LSTM based spectral-spatial feature learning for hyperspectral image classification. Remote Sensing, 9(12), 1330-1339.
[76] Mallinar, N., & Rosset, C. (2018). Deep canonically correlated LSTMs. arXiv:1801.05407.
[77] McCarter, G., & Storkey, A. (2007). Air freight image segmentation database.
[78] Miwa, M., & Bansal, M. (2016). End-to-end relation extraction using LSTMs on sequences and tree structures. arXiv:1601.00770.
[79] Moniz, J. R. A., & Krueger, D. (2018). Nested LSTMs. arXiv:1801.10308.
[80] Mozer, M. C., & Das, S. (1993). A connectionist symbol manipulator that discovers the structure of context-free languages. In S. J. Hanson, J. D. Cowan, & C. L. Giles (Eds.), Advances in neural information processing systems, 5 (pp. 863-870). Cambridge, MA: MIT Press.
[81] Nair, V., & Hinton, G. E. (2010). Rectified linear units improve restricted Boltzmann machines. In Proceedings of the International Conference on Machine Learning (pp. 807-814). Madison, WI: Omnipress.
[82] Neil, D., Pfeiffer, M., & Liu, S. C. (2016). Phased LSTM: Accelerating recurrent network training for long or event-based sequences. In D. D. Lee, U. von Luxburg, R. Garnett, M. Sugiyama, & I. Guyon (Eds.), Advances in neural information processing systems, 29 (pp. 3882-3890). Red Hook, NY: Curran.
[83] Nie, Y., An, C., Huang, J., Yan, Z., & Han, Y. (2016). A bidirectional LSTM model for question title and body analysis in question answering. In IEEE First International Conference on Data Science in Cyberspace (pp. 307-311). Piscataway, NJ: IEEE.
[84] Nina, O., & Rodriguez, A. (2016). Simplified LSTM unit and search space probability exploration for image description. In Proceedings of the International Conference on Information, Communications and Signal Processing (pp. 1-5). Piscataway, NJ: IEEE.
[85] Niu, Z., Zhou, M., Wang, L., Gao, X., & Hua, G. (2017). Hierarchical multimodal LSTM for dense visual-semantic embedding. In Proceedings of the IEEE International Conference on Computer Vision (pp. 1899-1907). Piscataway, NJ: IEEE.
[86] Oord, A. V. D., Kalchbrenner, N., & Kavukcuoglu, K. (2016). Pixel recurrent neural networks. arXiv:1601.06759.
[87] Palangi, H., Deng, L., Shen, Y., Gao, J., He, X., Chen, J., & Ward, R. (2015). Deep sentence embedding using the long short-term memory network: Analysis and application to information retrieval. IEEE/ACM Transactions on Audio, Speech, and Language Processing, 24(4), 694-707.
[88] Pearlmutter, B. A. (1989). Learning state space trajectories in recurrent neural networks. Neural Computation, 1(2), 263-269.
[89] Peng, Z., Zhang, R., Liang, X., Liu, X., & Lin, L. (2016). Geometric scene parsing with hierarchical LSTM. In Proceedings of the International Joint Conference on Artificial Intelligence (pp. 3439-3445). Palo Alto, CA: AAAI Press.
[90] Plummer, B. A., Wang, L., Cervantes, C. M., Caicedo, J. C., Hockenmaier, J., & Lazebnik, S. (2017). Flickr30k entities: Collecting region-to-phrase correspondences for richer image-to-sentence models. International Journal of Computer Vision, 123(1), 74-93.
[91] Pulver, A., & Lyu, S. (2017). LSTM with working memory. In Proceedings of the International Joint Conference on Neural Networks (pp. 845-851). Piscataway, NJ: IEEE.
[92] Qu, Z., Haghani, P., Weinstein, E., & Moreno, P. (2017). Syllable-based acoustic modeling with CTC-SMBR-LSTM. In Proceedings of the IEEE Automatic Speech Recognition and Understanding Workshop (pp. 173-177). Piscataway, NJ: IEEE.
[93] Rahman, L., Mohammed, N., & Azad, A. K. A. (2017). A new LSTM model by introducing biological cell state. In Proceedings of the International Conference on Electrical Engineering and Information Communication Technology (pp. 1-6). Piscataway, NJ: IEEE.
[94] Ranzato, M. A., Szlam, A., Bruna, J., Mathieu, M., Collobert, R., & Chopra, S. (2014). Video (language) modeling: A baseline for generative models of natural videos. arXiv:1412.6604.
[95] Rawat, W., & Wang, Z. (2017). Deep convolutional neural networks for image classification: A comprehensive review. Neural Computation, 29(9), 1-10. · Zbl 07066740
[96] Robinson, A. J., & Fallside, F. (1987). The utility driven dynamic error propagation network. Cambridge: University of Cambridge Department of Engineering.
[97] Sainath, T. N., Vinyals, O., Senior, A., & Sak, H. (2015). Convolutional, long short-term memory, fully connected deep neural networks. In IEEE International Conference on Acoustics, Speech and Signal Processing (pp. 4580-4584). Piscataway, NJ: IEEE.
[98] Sak, H., Senior, A., & Beaufays, F. (2014). Long short-term memory based recurrent neural network architectures for large vocabulary speech recognition. arXiv:1402.1128v1.
[99] Saleh, K., Hossny, M., & Nahavandi, S. (2018). Intent prediction of vulnerable road users from motion trajectories using stacked LSTM network. In Proceedings of the IEEE International Conference on Intelligent Transportation Systems (pp. 327-332). Piscataway, NJ: IEEE.
[100] Schlag, I., & Schmidhuber, J. (2017). Gated fast weights for on-the-fly neural program generation. In NIPS Workshop on Meta-Learning.
[101] Schmidhuber, J. (1993). Reducing the ratio between learning complexity and number of time varying variables in fully recurrent nets. In Proceedings of the International Conference on Neural Networks (pp. 460-463). London: Springer.
[102] Schmidhuber, J. (2012). Self-delimiting neural networks. arXiv:1210.0118.
[103] Schmidhuber, J., Wierstra, D., Gagliolo, M., & Gomez, F. (2007). Training recurrent networks by evolino. Neural Computation, 19(3), 757-779. · Zbl 1127.68085
[104] Schneider, N., & Gavrila, D. M. (2013). Pedestrian path prediction with recursive Bayesian filters: A comparative study. In Proceedings of the German Conference on Pattern Recognition (pp. 174-183). Berlin: Springer.
[105] Schuster, M., & Paliwal, K. K. (1997). Bidirectional recurrent neural networks. IEEE Transactions on Signal Processing, 45(11), 2673-2681.
[106] Shabanian, S., Arpit, D., Trischler, A., & Bengio, Y. (2017). Variational Bi-LSTMs. arXiv:1711.05717.
[107] Sharma, P., & Singh, A. (2017). Era of deep neural networks: A review. In Proceedings of the 8th International Conference on Computing, Communication and Networking Technologies (pp. 1-5). Piscataway, NJ: IEEE.
[108] Shi, X., Chen, Z., Wang, H., Yeung, D. Y., Wong, W. K., & Woo, W. C. (2015). Convolutional LSTM network: A machine learning approach for precipitation nowcasting. In C. Cortes, D. D. Lee, M. Sugiyama, & R. Garnett (Eds.), Advances in neural information processing systems, 28 (pp. 802-810). Red Hook, NY: Curran.
[109] Song, J., Tang, S., Xiao, J., Wu, F., & Zhang, Z. (2016). LSTM-in-LSTM for generating long descriptions of images. Computational Visual Media, 2(4), 1-10.
[110] Sperduti, A., & Starita, A. (1997). Supervised neural networks for the classification of structures. IEEE Transactions on Neural Networks, 8(3), 714-735.
[111] Srivastava, R. K., Greff, K., & Schmidhuber, J. (2015). Highway networks. arXiv:1505.00387.
[112] Šter, B. (2013). Selective recurrent neural network. Neural Processing Letters, 38(1), 1-15.
[113] Sun, G. (1990). Connectionist pushdown automata that learn context-free grammars. In Proceedings of the International Joint Conference on Neural Networks (pp. 577-580). Piscataway, NJ: IEEE.
[114] Sutskever, I., Vinyals, O., & Le, Q. V. (2014). Sequence to sequence learning with neural networks. In Z. Ghahramani, M. Welling, C. Cortes, N. D. Lawrence, & K. Q. Weinberger (Eds.), Advances in neural information processing systems (pp. 3104-3112). Red Hook, NY: Curran.
[115] Tai, K. S., Socher, R., & Manning, C. D. (2015). Improved semantic representations from tree-structured long short-term memory networks. arXiv:1503.00075.
[116] Teng, Z., & Zhang, Y. (2016). Bidirectional tree-structured LSTM with head lexicalization. arXiv:1611.06788.
[117] Thireou, T., & Reczko, M. (2007). Bidirectional long short-term memory networks for predicting the subcellular localization of eukaryotic proteins. IEEE/ACM Transactions on Computational Biology and Bioinformatics, 4(3), 441-446.
[118] Veeriah, V., Zhuang, N., & Qi, G. J. (2015). Differential recurrent neural networks for action recognition. In Proceedings of the IEEE International Conference on Computer Vision (pp. 4041-4049). Piscataway, NJ: IEEE.
[119] Vohra, R., Goel, K., & Sahoo, J. K. (2015). Modeling temporal dependencies in data using a DBN-LSTM. In Proceedings of the IEEE International Conference on Data Science and Advanced Analytics (pp. 1-4). Piscataway, NJ: IEEE.
[120] Wang, J., & Yuille, A. (2015). Semantic part segmentation using compositional model combining shape and appearance. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (pp. 1788-1797). Piscataway, NJ: IEEE.
[121] Wei, H., Zhou, H., Sankaranarayanan, J., Sengupta, S., & Samet, H. (2018). Residual convolutional LSTM for tweet count prediction. In Companion of the Web Conference 2018 (pp. 1309-1316). Geneva: International World Wide Web Conferences Steering Committee.
[122] Weiss, G., Goldberg, Y., & Yahav, E. (2018). On the practical computational power of finite precision RNNs for language recognition. arXiv:1805.04908.
[123] Weng, J. J., Ahuja, N., & Huang, T. S. (1993). Learning recognition and segmentation of 3D objects from 2D images. In Proceedings of the Fourth International Conference on Computer Vision (pp. 121-128). Piscataway, NJ: IEEE.
[124] Werbos, P. J. (1988). Generalization of backpropagation with application to a recurrent gas market model. Neural Networks, 1(4), 339-356.
[125] Williams, R. J. (1989). Complexity of exact gradient computation algorithms for recurrent neural networks (Technical Report NU-CCS-89-27). Boston: Northeastern University, College of Computer Science.
[126] Wu, H., Zhang, J., & Zong, C. (2016). An empirical exploration of skip connections for sequential tagging. arXiv:1610.03167.
[127] Xie, X., & Shi, Y. (2018). Long-term memory neural Turing machines. Computer Science and Application, 8(1), 49-58.
[128] Yamaguchi, K. (2012). Parsing clothing in fashion photographs. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (pp. 3570-3577). Piscataway, NJ: IEEE.
[129] Yang, Y., Dong, J., Sun, X., Lima, E., Mu, Q., & Wang, X. (2017). A CFCC-LSTM model for sea surface temperature prediction. IEEE Geoscience and Remote Sensing Letters, 99, 1-5.
[130] Yao, K., Cohn, T., Vylomova, K., Duh, K., & Dyer, C. (2015). Depth-gated LSTM. arXiv:1508.03790v4.
[131] Yu, B., Xu, Q., & Zhang, P. (2018). Question classification based on MAC-LSTM. In Proceedings of the IEEE Third International Conference on Data Science in Cyberspace. Piscataway, NJ: IEEE.
[132] Zaremba, W., & Sutskever, I. (2014). Learning to execute. arXiv:1410.4615.
[133] Zhang, J., Zheng, Y., & Qi, D. (2017). Deep spatio-temporal residual networks for citywide crowd flows prediction. In Proceedings of the Thirty-First AAAI Conference on Artificial Intelligence (pp. 1655-1661). Palo Alto, CA: AAAI Press.
[134] Zhang, X., Lu, L., & Lapata, M. (2015). Tree recurrent neural networks with application to language modeling. arXiv:1511.00060v1.
[135] Zhang, Y., Chen, G., Yu, D., Yao, K., Khudanpur, S., & Glass, J. (2015). Highway long short-term memory RNNs for distant speech recognition. In Proceedings of the IEEE International Conference on Acoustics, Speech and Signal Processing (pp. 5755-5759). Piscataway, NJ: IEEE.
[136] Zhao, R., Wang, J., Yan, R., & Mao, K. (2016). Machine health monitoring with LSTM networks. In Proceedings of the 10th International Conference on Sensing Technology (pp. 1-6). Piscataway, NJ: IEEE.
[137] Zhou, C., Sun, C., Liu, Z., & Lau, F. C. M. (2015). A C-LSTM neural network for text classification. arXiv:1511.08630.
[138] Zhou, G., Wu, J., Zhang, C., & Zhou, Z. (2016). Minimal gated unit for recurrent neural networks. International Journal of Automation and Computing, 13(3), 226-234.
[139] Zhu, G., Zhang, L., Shen, P., & Song, J. (2017). Multimodal gesture recognition using 3D convolution and convolutional LSTM. IEEE Access, 5, 4517-4524.
[140] Zhu, X., Sobhani, P., & Guo, H. (2015). Long short-term memory over tree structures. arXiv:1503.04881.
This reference list is based on information provided by the publisher or from digital mathematics libraries. Its items are heuristically matched to zbMATH identifiers and may contain data conversion errors. It attempts to reflect the references listed in the original paper as accurately as possible without claiming the completeness or perfect precision of the matching.