Nearly optimal control laws for nonlinear systems with saturating actuators using a neural network HJB approach. (English) Zbl 1087.49022

Summary: The Hamilton-Jacobi-Bellman (HJB) equation for constrained control is formulated using a suitable nonquadratic functional. It is shown that the constrained optimal control law has the largest region of asymptotic stability (RAS). The value function of this HJB equation is obtained by successively solving a sequence of Lyapunov equations (LE), each of which yields the cost function of the current control law. A neural network approximates the cost function associated with each LE by the method of least squares over a well-defined region of attraction of an initial stabilizing controller. As the order of the neural network increases, the least-squares solution converges uniformly to the exact solution of the inherently nonlinear HJB equation associated with the saturating control inputs. The result is a nearly optimal constrained state-feedback controller that is tuned a priori off-line.
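
The iteration described in the summary can be illustrated with a minimal sketch. Everything concrete in it is an assumption made for illustration and is not taken from the paper: the scalar plant x' = a*x + b*u, the state cost x^2, the bound |u| < lam, the nonquadratic penalty W(u) = 2*r*lam * int_0^u atanh(v/lam) dv (a common choice for saturating inputs), and a three-term even polynomial basis standing in for the neural network. Each pass solves one Lyapunov equation by least squares on a mesh and then updates the control law.

```python
import numpy as np

def nonquadratic_penalty(u, lam, r):
    """Closed form of W(u) = 2*r*lam * integral_0^u atanh(v/lam) dv (assumed penalty)."""
    s = u / lam
    return 2.0 * r * lam * u * np.arctanh(s) + r * lam**2 * np.log1p(-s**2)

def successive_approximation(a=-1.0, b=1.0, lam=0.5, r=1.0, iters=8):
    """Least-squares successive approximation for a hypothetical scalar plant.

    V(x) is approximated by w . [x^2, x^4, x^6]; the polynomial basis is a
    stand-in for the neural network mentioned in the summary.
    """
    x = np.linspace(-1.0, 1.0, 201)              # mesh on the training region
    grad_sigma = np.stack([2 * x, 4 * x**3, 6 * x**5], axis=1)  # d(basis)/dx
    u = np.zeros_like(x)    # u0 = 0 is stabilizing because a < 0 (assumption)
    history = []
    for _ in range(iters):
        # Lyapunov equation for the current law: V'(x)*(a*x + b*u) + x^2 + W(u) = 0
        xdot = a * x + b * u
        A = grad_sigma * xdot[:, None]
        rhs = -(x**2 + nonquadratic_penalty(u, lam, r))
        w, *_ = np.linalg.lstsq(A, rhs, rcond=None)
        history.append(w)
        # Control update: minimizing the Hamiltonian in u gives a tanh-saturated law
        dV = grad_sigma @ w
        u = -lam * np.tanh(b * dV / (2.0 * lam * r))
    return x, u, history

x, u, history = successive_approximation()
print("first weights:", history[0])
print("last weights: ", history[-1])
print("max |u| on mesh:", np.abs(u).max())
```

With u0 = 0 the first Lyapunov equation reduces to -x*V'(x) + x^2 = 0, so the first weight vector is (0.5, 0, 0) up to rounding, which gives a quick sanity check. The tanh in the update keeps every later control law strictly inside the assumed bound.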


49L20 Dynamic programming in optimal control and differential games
93C10 Nonlinear systems in control theory
92B20 Neural networks for/in biological studies, artificial life and related topics

