Deep learning with long short-term memory networks for financial market predictions.

*(English)* Zbl 1403.91387

Summary: Long short-term memory (LSTM) networks are a state-of-the-art technique for sequence learning. They are less commonly applied to financial time series predictions, yet inherently suitable for this domain. We deploy LSTM networks for predicting out-of-sample directional movements for the constituent stocks of the S&P 500 from 1992 until 2015. With daily returns of 0.46 percent and a Sharpe ratio of 5.8 prior to transaction costs, we find LSTM networks to outperform memory-free classification methods, i.e., a random forest (RAF), a deep neural net (DNN), and a logistic regression classifier (LOG). The outperformance relative to the general market is very clear from 1992 to 2009, but as of 2010, excess returns seem to have been arbitraged away with LSTM profitability fluctuating around zero after transaction costs. We further unveil sources of profitability, thereby shedding light into the black box of artificial neural networks. Specifically, we find one common pattern among the stocks selected for trading – they exhibit high volatility and a short-term reversal return profile. Leveraging these findings, we are able to formalize a rules-based short-term reversal strategy that yields 0.23 percent prior to transaction costs. Further regression analysis unveils low exposure of the LSTM returns to common sources of systematic risk – also compared to the three benchmark models.
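The summary mentions a rules-based short-term reversal strategy and reports a Sharpe ratio, but it does not spell out either rule. The following is a minimal sketch of what such a strategy could look like, not the authors' formalization: the function names `reversal_returns` and `annualized_sharpe`, the parameters `lookback` and `k`, the equal weighting, the one-day holding period, the zero risk-free rate, and the 252-day annualization are all assumptions made for illustration.

```python
import numpy as np


def reversal_returns(returns, lookback=5, k=2):
    """Daily long-short returns of a simple short-term reversal rule.

    returns: array of shape (T, N) holding simple daily returns.
    Hypothetical rule (an assumption, not taken from the paper): each day,
    buy the k stocks with the lowest cumulative return over the previous
    `lookback` days, short the k stocks with the highest, hold for one
    day, and weight all positions equally.
    """
    returns = np.asarray(returns, dtype=float)
    n_days, n_stocks = returns.shape
    if 2 * k > n_stocks:
        raise ValueError("need at least 2 * k stocks")
    out = []
    for t in range(lookback, n_days):
        # Cumulative return over days t-lookback .. t-1, so no look-ahead.
        past = np.prod(1.0 + returns[t - lookback:t], axis=0) - 1.0
        order = np.argsort(past, kind="stable")
        losers, winners = order[:k], order[-k:]
        # Long the recent losers, short the recent winners.
        out.append(returns[t, losers].mean() - returns[t, winners].mean())
    return np.array(out)


def annualized_sharpe(daily_returns, periods_per_year=252):
    """Annualized Sharpe ratio, assuming a zero risk-free rate."""
    daily_returns = np.asarray(daily_returns, dtype=float)
    scale = np.sqrt(periods_per_year)
    return scale * daily_returns.mean() / daily_returns.std(ddof=1)
```

Both functions work on returns before transaction costs, which matches how the summary quotes its headline figures. A cost-adjusted variant would subtract an assumed per-trade cost from each daily long-short return before computing the ratio.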

##### MSC:

91G80 | Financial applications of other theories

62M20 | Inference from stochastic processes and prediction

68T05 | Learning and adaptive systems in artificial intelligence


*T. Fischer* and *C. Krauss*, Eur. J. Oper. Res. 270, No. 2, 654–669 (2018; Zbl 1403.91387)
