Deep neural networks algorithms for stochastic control problems on finite horizon: numerical applications. (English) Zbl 1496.93112

Summary: This paper presents several numerical applications of the deep learning-based algorithms for discrete-time stochastic control problems over a finite time horizon that were introduced in [the authors, SIAM J. Numer. Anal. 59, No. 1, 525–557 (2021; Zbl 1466.65007)]. Numerical and comparative tests using TensorFlow illustrate the performance of our different algorithms, namely control learning by performance iteration (algorithms NNcontPI and ClassifPI) and control learning by hybrid iteration (algorithms Hybrid-Now and Hybrid-LaterQ), on the 100-dimensional nonlinear PDE examples from [W. E et al., Commun. Math. Stat. 5, No. 4, 349–380 (2017; Zbl 1382.65016)] and on quadratic backward stochastic differential equations as in [J.-F. Chassagneux and A. Richou, Ann. Appl. Probab. 26, No. 1, 262–304 (2016; Zbl 1334.60129)]. We also performed tests on low-dimensional control problems such as an option hedging problem in finance, as well as energy storage problems arising in the valuation of gas storage and in microgrid management. Numerical results and comparisons with the quantization-type algorithm Qknn, an efficient algorithm for numerically solving low-dimensional control problems, are also provided.
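
To make the idea of control learning by performance iteration concrete, here is a minimal sketch on a toy problem. It is not the authors' NNcontPI implementation. As labeled assumptions, the neural network policy is replaced by a linear feedback a_n = -k_n x, stochastic gradient descent by a grid search over k_n, and the test problem is a hypothetical one-dimensional linear-quadratic example with dynamics X_{n+1} = X_n + a_n + eps_{n+1}, running cost a^2 and terminal cost x^2. The function name `learn_gains` and all parameter values are illustrative only.

```python
import random

def learn_gains(n_steps=3, n_paths=2000, noise_std=0.5, seed=0):
    """Backward performance iteration with linear feedback a_n = -k_n * x.

    Toy stand-in for a neural-network policy: one scalar gain per time step,
    chosen by grid search on a Monte Carlo estimate of the cost-to-go.
    """
    rng = random.Random(seed)
    grid = [i / 50 for i in range(51)]  # candidate gains in [0, 1]
    gains = [0.0] * n_steps
    for n in reversed(range(n_steps)):
        # Training states at time n (assumed standard normal) and the
        # noises driving steps n..N-1, shared by all candidate gains.
        starts = [rng.gauss(0.0, 1.0) for _ in range(n_paths)]
        noises = [[rng.gauss(0.0, noise_std) for _ in range(n_steps - n)]
                  for _ in range(n_paths)]

        def mean_cost(k):
            total = 0.0
            for x, eps in zip(starts, noises):
                cost = 0.0
                for m in range(n, n_steps):
                    # Candidate gain at time n, already-learned gains afterwards.
                    gain = k if m == n else gains[m]
                    a = -gain * x
                    cost += a * a           # running cost f(x, a) = a^2
                    x = x + a + eps[m - n]  # dynamics X_{m+1} = X_m + a_m + eps
                total += cost + x * x       # terminal cost g(x) = x^2
            return total / n_paths

        gains[n] = min(grid, key=mean_cost)
    return gains
```

For this toy problem the exact optimal gains follow from the Riccati recursion P_N = 1, P_n = P_{n+1} / (1 + P_{n+1}), k_n = P_{n+1} / (1 + P_{n+1}), giving 1/4, 1/3 and 1/2 for three time steps. `learn_gains()` should land near these values, up to Monte Carlo error and the grid resolution.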


93E03 Stochastic systems in control theory (general)
93C55 Discrete-time control/observation systems
68T07 Artificial neural networks and deep learning


[1] Auer P, Cesa-Bianchi N, Fischer P (2002) Finite-time analysis of the multiarmed bandit problem. In: Machine Learning 47.2. ISSN: 1573-0565. doi:10.1023/A:1013689704352, pp 235-256 · Zbl 1012.68093
[2] Alasseur C, Balata A, Ben Aziza S, Maheshwari A, Tankov P, Warin X (2019) Regression Monte Carlo for microgrid management. In: ESAIM proceedings and surveys, CEMRACS 2017, pp 46-67 · Zbl 1416.93097
[3] Balata A, Huré C, Laurière M, Pham H, Pimentel I (2019) A class of finite-dimensional numerically solvable McKean-Vlasov control problems. In: ESAIM proceedings and surveys, CEMRACS 2017, vol 19, pp 114-144 · Zbl 1417.93333
[4] Bertsimas D, Kogan L, Lo AW (2001) Hedging derivative securities and incomplete markets: an ε-arbitrage approach. In: Operations research 49.3, pp 372-397 · Zbl 1163.91381
[5] Carmona R, Ludkovski M (2010) Valuation of energy storage: an optimal switching approach. In: Quantitative finance 10.4, pp 359-374 · Zbl 1203.91286
[6] Chassagneux J-F, Richou A (2016) Numerical simulation of quadratic BSDEs. In: The annals of applied probability 26.1, pp 262-304 · Zbl 1334.60129
[7] E W, Han J, Jentzen A (2017) Deep learning-based numerical methods for high-dimensional parabolic partial differential equations and backward stochastic differential equations. In: Communications in mathematics and statistics 5.4, pp 349-380 · Zbl 1382.65016
[8] Goodfellow I, Bengio Y, Courville A (2016) Deep learning. MIT Press · Zbl 1373.68009
[9] Heymann B, Bonnans JF, Martinon P, Silva FJ, Lanas F, Jiménez-Estévez G (2018) Continuous optimal control approaches to microgrid energy management. In: Energy Systems 9.1, pp 59-77
[10] Henry-Labordere P (2017) Deep primal-dual algorithm for BSDEs: Applications of machine learning to CVA and IM. In: SSRN:3071506
[11] Huré C, Pham H, Bachouch A, Langrené N (2018) Deep neural networks algorithms for stochastic control problems on finite horizon, part I: convergence analysis. In: arXiv:1812.04300 · Zbl 1466.65007
[12] Jiang DR, Powell WB (2015) An approximate dynamic programming algorithm for monotone value functions. In: Operations research 63.6, pp 1489-1511 · Zbl 1334.90194
[13] Kou S, Peng X, Xu X (2018) A general Monte Carlo algorithm with monotonicity for stochastic control problems. 2018 IMS Annual meeting on probability and statistics
[14] Ludkovski M, Maheshwari A (2019) Simulation methods for stochastic storage problems: a statistical learning perspective. In: Energy systems. ISSN: 1868-3975. doi:10.1007/s12667-018-0318-4
[15] Pagès G, Pham H, Printems J (2004) Optimal quantization methods and applications to numerical problems in finance. In: Handbook of computational and numerical methods in finance, pp 253-297 · Zbl 1138.91467
[16] Richou A (2010) Etude théorique et numérique des équations différentielles stochastiques rétrogrades. PhD thesis. Université de Rennes 1
[17] Richou A (2011) Numerical simulation of BSDEs with drivers of quadratic growth. In: The annals of applied probability 21.5, pp 1933-1964 · Zbl 1274.60221
[18] Sutton RS, Barto AG (1998) Reinforcement learning. The MIT Press
[19] Chan-Wai-Nam Q, Mikael J, Warin X (2019) Machine learning for semi linear PDEs. In: Journal of scientific computing 79.3, pp 1667-1712 · Zbl 1433.68332
[20] Yong J, Zhou X (1999) Stochastic controls: Hamiltonian systems and HJB equations. Springer · Zbl 0943.93002
This reference list is based on information provided by the publisher or from digital mathematics libraries. Its items are heuristically matched to zbMATH identifiers and may contain data conversion errors. It attempts to reflect the references listed in the original paper as accurately as possible without claiming the completeness or perfect precision of the matching.