zbMATH — the first resource for mathematics

A back propagation through time-like min-max optimal control algorithm for nonlinear systems. (English) Zbl 1272.49077
Summary: This paper presents a conjugate gradient-based algorithm for feedback min-max optimal control of nonlinear systems. The algorithm has a backward-in-time recurrent structure similar to that of the Back Propagation Through Time (BPTT) algorithm. The control law is given as the output of a one-layer neural network (NN). The main contribution of the paper is the integration of BPTT techniques, conjugate gradient methods, the Adams method for solving ODEs, and automatic differentiation into an effective, numerically robust algorithm for solving min-max optimal control problems. The proposed algorithm is evaluated on a robotic system with two degrees of freedom (DOFs).
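The backward-in-time gradient recursion mentioned in the summary can be illustrated with a minimal sketch. Everything concrete here is an assumption made for illustration and is not taken from the paper: the scalar plant x' = -x^3 + u + d, the linear feedback laws u = -k*x and d = w*x (standing in for the one-layer NN), the explicit Euler discretization (in place of the Adams method), the hand-written derivatives (in place of automatic differentiation), and plain gradient descent-ascent (in place of conjugate gradient updates). All function and parameter names are hypothetical.

```python
# Illustrative sketch only: scalar plant, explicit Euler steps, linear feedback gains.
# None of these modeling choices come from the paper.

def rollout(k, w, x0=1.0, dt=0.01, steps=200):
    """Simulate x' = -x**3 + u + d forward in time with u = -k*x and d = w*x."""
    xs = [x0]
    for _ in range(steps):
        x = xs[-1]
        xs.append(x + dt * (-x**3 - k * x + w * x))
    return xs


def cost(k, w, gamma2=4.0, x0=1.0, dt=0.01, steps=200):
    """Discretized min-max cost J = sum of dt * (x^2 + u^2 - gamma2 * d^2)."""
    xs = rollout(k, w, x0, dt, steps)
    return sum(dt * x * x * (1.0 + k * k - gamma2 * w * w) for x in xs[:-1])


def bptt_gradient(k, w, gamma2=4.0, x0=1.0, dt=0.01, steps=200):
    """Return (dJ/dk, dJ/dw) from one forward pass and one backward pass."""
    xs = rollout(k, w, x0, dt, steps)
    lam = 0.0  # adjoint dJ/dx at the final time (no terminal cost assumed)
    gk = gw = 0.0
    for i in reversed(range(steps)):
        x = xs[i]
        # Direct cost term plus the effect on the next state; lam is dJ/dx_{i+1} here.
        gk += dt * 2.0 * k * x * x - lam * dt * x
        gw += -dt * 2.0 * gamma2 * w * x * x + lam * dt * x
        # Propagate the adjoint one step backward in time.
        dstep_dx = 1.0 + dt * (-3.0 * x * x - k + w)
        lam = 2.0 * dt * x * (1.0 + k * k - gamma2 * w * w) + lam * dstep_dx
    return gk, gw


def descent_ascent(k=0.5, w=0.2, rate=0.1, iterations=50):
    """Minimize over the control gain k while maximizing over the disturbance gain w."""
    for _ in range(iterations):
        gk, gw = bptt_gradient(k, w)
        k -= rate * gk  # controller descends on J
        w += rate * gw  # disturbance ascends on J
    return k, w


if __name__ == "__main__":
    print("gradient at (k, w) = (0.5, 0.2):", bptt_gradient(0.5, 0.2))
    print("gains after descent-ascent:", descent_ascent())
```

The backward loop visits the stored trajectory in reverse order, which is the structural feature shared with BPTT. A quick sanity check is to compare `bptt_gradient` against central finite differences of `cost`.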

49N35 Optimal feedback synthesis
49M30 Other numerical methods in calculus of variations (MSC2010)
93C10 Nonlinear systems in control theory
93C85 Automated systems (robots, etc.) in control theory
[1] Kreim, Minimizing the maximum heating of a re-entering space shuttle: an optimal control problem with multiple control constraints, Optimal Control Applications and Methods 17 (1) pp 45– (1996) · Zbl 0863.49028 · doi:10.1002/(SICI)1099-1514(199601/03)17:1<45::AID-OCA564>3.0.CO;2-X
[2] Chen, A minimax tracking design for wheeled vehicles with trailer based on adaptive fuzzy elimination scheme, IEEE Transactions on Control Systems Technology 8 (3) pp 418– (2000) · doi:10.1109/87.845873
[3] Qingguo, Planning for dynamic multiagent planar manipulation with uncertainty: a game theoretic approach, IEEE Transactions on Systems, Man and Cybernetics, Part A: Systems and Humans 33 (5) pp 620– (2003) · doi:10.1109/TSMCA.2003.817392
[4] Gruber, Control of a pilot plant using QP based min-max predictive control, Control Engineering Practice 17 (11) pp 1358– (2009) · doi:10.1016/j.conengprac.2009.06.011
[5] Isaacs, Differential Games. A Mathematical Theory with Application to Warfare and Pursuit, Control and Optimization (1965)
[6] Basar, H∞-Optimal Control and Related Minimax Design Problems, 2. ed. (1995) · doi:10.1007/978-0-8176-4757-5
[7] Abu-Khalaf, Policy iterations on the Hamilton-Jacobi-Isaacs equation for H∞ state feedback control with input saturation, IEEE Transactions on Automatic Control 51 (12) pp 1989– (2006) · Zbl 1366.93147 · doi:10.1109/TAC.2006.884959
[8] Abu-Khalaf, Neurodynamic programming and zero-sum games for constrained control systems, IEEE Transactions on Neural Networks 19 (7) pp 1243– (2008) · doi:10.1109/TNN.2008.2000204
[9] Werbos, Backpropagation through time: what it does and how to do it, Proceedings of the IEEE 78 (10) pp 1550– (1990) · doi:10.1109/5.58337
[10] Kasać, Deur, Novaković, Kolmanovsky, A conjugate gradient-based BPTT-like optimal control algorithm, IEEE International Conference on Control pp 861–866 (2009)
[11] Kasać, A conjugate gradient-based BPTT-like optimal control algorithm with vehicle dynamics control application, IEEE Transactions on Control Systems Technology 19 (6) pp 1587– (2011) · doi:10.1109/TCST.2010.2084088
[12] Plumer, Optimal control of terminal processes using neural networks, IEEE Transactions on Neural Networks 7 (2) pp 408– (1996) · doi:10.1109/72.485676
[13] Hairer, Solving Ordinary Differential Equations I-Nonstiff Problems, 2. ed. (2008)
[14] Griewank, Evaluating Derivatives: Principles and Techniques of Algorithmic Differentiation, 2. ed. (2008) · Zbl 1159.65026
[15] Campbell, Utilization of automatic differentiation in control algorithms, IEEE Transactions on Automatic Control 39 (5) pp 1047– (1994) · Zbl 0814.93048
[16] Griese, Evaluating gradients in optimal control: continuous adjoints versus automatic differentiation, Journal of Optimization Theory and Applications 122 (1) pp 63– (2004) · Zbl 1130.49308 · doi:10.1023/B:JOTA.0000041731.71309.f1
[17] Walther, Automatic differentiation of explicit Runge-Kutta methods for optimal control, Computational Optimization and Applications 36 (1) pp 83– (2007) · Zbl 1278.49037 · doi:10.1007/s10589-006-0397-3
[18] Kasać, Optimal feedback control of nonlinear systems with control vector constraints, Strojarstvo 43 (4-6) pp 133– (2001)
[19] Tsiotras, An L2 disturbance attenuation solution to the nonlinear benchmark problem, International Journal of Robust and Nonlinear Control 8 (4-5) pp 311– (1998) · Zbl 0908.93031 · doi:10.1002/(SICI)1099-1239(19980415/30)8:4/5<311::AID-RNC357>3.0.CO;2-F
[20] Fan, The Continuous Maximum Principle (1966)
[21] Tollenaere, SuperSAB: fast adaptive backpropagation with good scaling properties, Neural Networks 3 (5) pp 561– (1990) · doi:10.1016/0893-6080(90)90006-7
[22] Nocedal, Numerical Optimization (2006)
[23] Snyman, Practical Mathematical Optimization (2005) · Zbl 1104.90003
[24] Dai, A nonlinear conjugate gradient method with a strong global convergence property, SIAM Journal on Optimization 10 (1) pp 177– (1999) · Zbl 0957.65061 · doi:10.1137/S1052623497318992
[25] Polak, Note sur la convergence de méthodes de directions conjuguées, Revue française d’informatique et de recherche opérationnelle, série rouge 3 (1) pp 35– (1969)
[26] Hestenes, Methods of conjugate gradients for solving linear systems, Journal of Research of the National Bureau of Standards 49 (6) pp 409– (1952) · Zbl 0048.09901
[27] Forth, An efficient overloaded implementation of forward mode automatic differentiation in MATLAB, ACM Transactions on Mathematical Software 32 (2) pp 195– (2006) · Zbl 1365.65053 · doi:10.1145/1141885.1141888
[28] Shampine, Using AD to solve BVPs in MATLAB, ACM Transactions on Mathematical Software 31 (1) pp 79– (2005) · Zbl 1073.65065 · doi:10.1145/1055531.1055535
[29] Kelly, Control of Robot Manipulators in Joint Space (2005)
[30] Zavala-Río, A natural saturating extension of the PD-with-desired-gravity-compensation control law for robot manipulators with bounded inputs, IEEE Transactions on Robotics 23 (2) pp 386– (2007) · doi:10.1109/TRO.2007.892224