
Asymptotic normality of discrete-time Markov control processes. (English) Zbl 1200.93138

Summary: We study the asymptotic normality of discrete-time Markov control processes in Borel spaces, with possibly unbounded cost. Under suitable hypotheses, we show that the cost sequence is asymptotically normal. As a special case, we obtain a central limit theorem for (noncontrolled) Markov chains.

MSC:

93E20 Optimal stochastic control
90C40 Markov and semi-Markov decision processes
60F05 Central limit and other weak theorems
60J05 Discrete-time Markov processes on general state spaces

References:

[1] Gordienko, E. and Hernández-Lerma, O. (1995). Average cost Markov control processes with weighted norms: existence of canonical policies. Appl. Math. 23, 199–218. · Zbl 0829.93067
[2] Gordienko, E. and Hernández-Lerma, O. (1995). Average cost Markov control processes with weighted norms: value iteration. Appl. Math. 23, 219–237. · Zbl 0829.93068
[3] Hernández-Lerma, O. and Lasserre, J. B. (1996). Discrete-Time Markov Control Processes. Springer, New York. · Zbl 0853.93106
[4] Hernández-Lerma, O. and Lasserre, J. B. (1999). Further Topics on Discrete-time Markov Control Processes. Springer, New York. · Zbl 0928.93002
[5] Hernández-Lerma, O. and Lasserre, J. B. (2001). Zero-sum stochastic games in Borel spaces: average payoff criteria. SIAM J. Control Optimization 39, 1520–1539. · Zbl 1140.91319 · doi:10.1137/S0363012999361962
[6] Hernández-Lerma, O., Vega-Amaya, O. and Carrasco, G. (1999). Sample-path optimality and variance-minimization of average cost Markov control processes. SIAM J. Control Optimization 38, 79–93. · Zbl 0951.93074 · doi:10.1137/S0363012998340673
[7] Hilgert, N. and Hernández-Lerma, O. (2003). Bias optimality versus strong 0-discount optimality in Markov control processes with unbounded costs. Acta Appl. Math. 77, 215–235. · Zbl 1049.93089 · doi:10.1023/A:1024996308133
[8] Jarner, S. F. and Roberts, G. O. (2002). Polynomial convergence rates of Markov chains. Ann. Appl. Prob. 12, 224–247. · Zbl 1012.60062 · doi:10.1214/aoap/1015961162
[9] Lánská, V. (1986). A note on estimation in controlled diffusion processes. Kybernetika 22, 133–141. · Zbl 0604.93054
[10] Luque-Vásquez, F. and Hernández-Lerma, O. (1999). Semi-Markov control models with average costs. Appl. Math. 26, 315–331. · Zbl 1050.90566
[11] Mandl, P. (1971). On the variance in controlled Markov chains. Kybernetika 7, 1–12. · Zbl 0215.25902
[12] Mandl, P. (1973). A connection between controlled Markov chains and martingales. Kybernetika 9, 237–241. · Zbl 0265.60060
[13] Mandl, P. (1974). Estimation and control in Markov chains. Adv. Appl. Prob. 6, 40–60. · Zbl 0281.60070 · doi:10.2307/1426206
[14] Mandl, P. (1974). On the asymptotic normality of the reward in a controlled Markov chain. In Progress in Statistics (European Meeting of Statisticians, Budapest, 1972), Vol. II, Colloquia Mathematica Societatis János Bolyai Vol. 9, North-Holland, Amsterdam, pp. 499–505. · Zbl 0355.90072
[15] Mandl, P. and Lau, M. (1991). Two extensions of asymptotic methods in controlled Markov chains. Ann. Operat. Res. 28, 67–79. · Zbl 0754.60081 · doi:10.1007/BF02055575
[16] Mendoza-Pérez, A. (2008). Asymptotic normality of average cost Markov control processes. Morfismos 12, 33–52.
[17] Mendoza-Pérez, A. (2008). Pathwise average reward Markov control processes. Doctoral Thesis, CINVESTAV-IPN. Available at http://www.math.cinvestav.mx/ohernand_students.
[18] Mendoza-Pérez, A. F. and Hernández-Lerma, O. (2010). Markov control processes with pathwise constraints. Math. Meth. Operat. Res. 71, 477–502. · Zbl 1196.93101 · doi:10.1007/s00186-010-0311-8
[19] Meyn, S. P. and Tweedie, R. L. (1993). Markov Chains and Stochastic Stability. Springer, London. · Zbl 0925.60001
[20] Prieto-Rumeau, T. and Hernández-Lerma, O. (2009). Variance minimization and the overtaking optimality approach to continuous-time controlled Markov chains. Math. Meth. Operat. Res. 70, 527–540. · Zbl 1177.93101 · doi:10.1007/s00186-008-0276-z
[21] Puterman, M. L. (1994). Markov Decision Processes: Discrete Stochastic Dynamic Programming. John Wiley, New York. · Zbl 0829.90134
[22] Vega-Amaya, O. (1998). Markov control processes in Borel spaces: undiscounted criteria. Doctoral Thesis, UAM-Iztapalapa (in Spanish). · Zbl 0906.93062
[23] Zhu, Q. X. and Guo, X. P. (2007). Markov decision processes with variance minimization: a new condition and approach. Stoch. Anal. Appl. 25, 577–592. · Zbl 1152.90646 · doi:10.1080/07362990701282807