zbMATH — the first resource for mathematics

Robust optimizers for nonlinear programming in approximate dynamic programming. (English) Zbl 1190.90221
Filipe, Joaquim (ed.) et al., Informatics in control, automation and robotics. Selected papers from the international conference on informatics in control, automation and robotics (ICINCO 2007), Angers, France, May 9–12, 2007. Berlin: Springer (ISBN 978-3-540-85639-9/pbk). Lecture Notes in Electrical Engineering 24, 95–106 (2009).
Summary: Many stochastic dynamic programming tasks in continuous action spaces are tackled through discretization. We avoid discretization here; approximate dynamic programming (ADP) then involves (i) many learning tasks, performed here by support vector machines, for Bellman-function regression; (ii) many nonlinear optimization tasks for action selection, for which we compare many algorithms. We include discretizations of the domain as well as other nonlinear programming tools in our experiments, so that we also compare optimization approaches and discretization methods. We conclude that robustness is strongly required in the nonlinear optimizations in ADP. Experimental results show that (i) discretization is sometimes inefficient, but some specific discretization is very efficient for “bang-bang” problems; (ii) simple evolutionary tools outperform quasi-random in a stable manner; (iii) gradient-based techniques are much less stable; (iv) for most high-dimensional “less unsmooth” problems, covariance-matrix-adaptation is ranked first.
For the entire collection see [Zbl 1149.93006].

90C30 Nonlinear programming
90C39 Dynamic programming