## Stochastic optimal control of unknown linear networked control system in the presence of random delays and packet losses
*(English)*
Zbl 1244.93177

Summary: The stochastic optimal control of a linear Networked Control System (NCS) with uncertain system dynamics is derived in the presence of network imperfections such as random delays and packet losses. The proposed method uses an Adaptive Estimator (AE) and ideas from Q-learning to solve the infinite-horizon optimal regulation problem for an unknown NCS with time-varying system matrices. Next, a stochastic suboptimal control scheme, also based on the AE and Q-learning and derived using the certainty equivalence property, is introduced for the regulation of unknown linear time-invariant NCS. Update laws are derived for tuning the unknown AE parameters online to obtain the Q-function. Lyapunov theory is used to show that all signals are Asymptotically Stable (AS) and that the estimated control signals converge to the optimal or suboptimal control inputs. Simulation results illustrate the effectiveness of the proposed schemes. The result is an optimal control scheme for unknown linear systems that operates forward in time, in contrast with standard Riccati equation-based schemes, which operate backward in time.
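
As a rough illustration of how a control gain can be read off a Q-function, the following minimal sketch treats a scalar, delay-free, deterministic system x_{k+1} = a x_k + b u_k with stage cost q x^2 + r u^2. This is not the paper's scheme: the plant parameters `a` and `b` are assumed known here, whereas the paper estimates the Q-function online with an adaptive estimator for an unknown NCS with delays and packet losses. All names and numeric values (`a`, `b`, `q`, `r`, `q_kernel`, `q_value_iteration`) are hypothetical choices made for this example.

```python
def q_kernel(a, b, q, r, p):
    """Kernel of Q(x, u) = h_xx*x^2 + 2*h_xu*x*u + h_uu*u^2,
    given the value function V(x) = p*x^2 of the next state."""
    h_xx = q + a * a * p
    h_xu = a * b * p
    h_uu = r + b * b * p
    return h_xx, h_xu, h_uu


def q_value_iteration(a, b, q, r, iters=200):
    """Iterate on the Q-function kernel (model assumed known for illustration)."""
    p = 0.0  # initial value-function kernel
    for _ in range(iters):
        h_xx, h_xu, h_uu = q_kernel(a, b, q, r, p)
        # Minimizing Q over u gives u = -(h_xu / h_uu) * x and the new V(x) = p*x^2.
        p = h_xx - h_xu * h_xu / h_uu
    h_xx, h_xu, h_uu = q_kernel(a, b, q, r, p)
    gain = h_xu / h_uu  # feedback law u = -gain * x
    return p, gain


# Hypothetical open-loop unstable plant (|a| > 1) with unit weights.
a, b, q, r = 1.2, 1.0, 1.0, 1.0
p, gain = q_value_iteration(a, b, q, r)

# Residual of the scalar algebraic Riccati equation at the computed p.
riccati_residual = q + a * a * p - (a * b * p) ** 2 / (r + b * b * p) - p
closed_loop = a - b * gain  # closed-loop pole under u = -gain * x

print(f"p = {p:.6f}, gain = {gain:.6f}, closed-loop pole = {closed_loop:.6f}")
```

The point of the sketch is only that the gain depends on the Q-function kernel entries `h_xu` and `h_uu`, so a scheme that estimates those entries from measured data does not need `a` and `b` explicitly.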

### MSC:

| Code | Description |
|---|---|
| 93E20 | Optimal stochastic control |
| 93C05 | Linear systems in control theory |
| 93E10 | Estimation and detection in stochastic control theory |
| 93D20 | Asymptotic stability in control theory |

### Software:

Approxrl
