den Hengst, Floris; François-Lavet, Vincent; Hoogendoorn, Mark; van Harmelen, Frank Planning for potential: efficient safe reinforcement learning. (English) Zbl 07570157 Mach. Learn. 111, No. 6, 2255-2274 (2022). MSC: 68T05
De Moor, Bram J.; Gijsbrechts, Joren; Boute, Robert N. Reward shaping to improve the performance of deep reinforcement learning in perishable inventory management. (English) Zbl 1506.90010 Eur. J. Oper. Res. 301, No. 2, 535-545 (2022). MSC: 90B05 68T07