zbMATH — the first resource for mathematics

Stochastic optimal control of unknown linear networked control system in the presence of random delays and packet losses. (English) Zbl 1244.93177
Summary: The stochastic optimal control of a linear networked control system (NCS) with uncertain system dynamics is derived in the presence of network imperfections such as random delays and packet losses. The proposed method uses an adaptive estimator (AE) and ideas from Q-learning to solve the infinite-horizon optimal regulation problem for an unknown NCS with time-varying system matrices. Next, a stochastic suboptimal control scheme, also based on the AE and Q-learning and derived using the certainty equivalence property, is introduced for the regulation of unknown linear time-invariant NCS. Update laws for tuning the unknown AE parameters online to obtain the Q-function are derived. Lyapunov theory is used to show that all signals are asymptotically stable (AS) and that the estimated control signals converge to the optimal or suboptimal control inputs. Simulation results are included to show the effectiveness of the proposed schemes. The result is an optimal control scheme that operates forward in time for unknown linear systems, in contrast with standard Riccati-equation-based schemes, which operate backward in time.
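The contrast drawn in the summary between a Riccati-based design and a Q-function-based design can be illustrated with a minimal sketch. It is not the paper's method: it assumes a deterministic, time-invariant scalar plant x_{k+1} = a*x_k + b*u_k with stage cost q*x^2 + r*u^2, no delays or packet losses, and no adaptive estimator. The function names `riccati_gain` and `q_iteration_gain`, the black-box `step` callable, and the three probe points are all hypothetical choices made for illustration. The first function needs a and b explicitly; the second only observes sampled transitions.

```python
def riccati_gain(a, b, q, r, iters=500):
    """Model-based baseline: iterate the scalar Riccati recursion (needs a, b)."""
    p = q
    for _ in range(iters):
        p = q + a * a * p - (a * b * p) ** 2 / (r + b * b * p)
    return a * b * p / (r + b * b * p)  # feedback gain k, with u = -k * x


def q_iteration_gain(step, q, r, iters=500):
    """Q-function value iteration using only sampled transitions x' = step(x, u).

    The Q-function is assumed quadratic:
        Q(x, u) = h_xx * x^2 + 2 * h_xu * x * u + h_uu * u^2
    """
    h_xx, h_xu, h_uu = q, 0.0, r
    # Three probe points identify the three coefficients exactly
    # (illustrative choice; valid here because the plant is noise-free).
    probes = [(1.0, 0.0), (0.0, 1.0), (1.0, 1.0)]
    for _ in range(iters):
        # Minimizing the current Q over u gives the value function p * x^2.
        p = h_xx - h_xu ** 2 / h_uu
        targets = []
        for x, u in probes:
            x_next = step(x, u)  # observed transition; a and b are never read
            targets.append(q * x * x + r * u * u + p * x_next ** 2)
        y_x, y_u, y_xu = targets
        h_xx, h_uu = y_x, y_u
        h_xu = (y_xu - y_x - y_u) / 2.0
    return h_xu / h_uu  # greedy gain k, with u = -k * x


if __name__ == "__main__":
    a, b, q, r = 1.2, 1.0, 1.0, 1.0  # open-loop unstable example plant
    plant = lambda x, u: a * x + b * u
    print(riccati_gain(a, b, q, r), q_iteration_gain(plant, q, r))
```

Under these assumptions both routines converge to the same feedback gain. The stochastic, time-varying setting of the paper would require estimating the Q-function parameters from noisy data, which this sketch does not attempt.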
MSC:
93E20 Optimal stochastic control (systems)
93C05 Linear control systems
93E10 Estimation and detection in stochastic control
93D20 Asymptotic stability of control systems