
zbMATH — the first resource for mathematics

Average optimality for continuous-time Markov decision processes under weak continuity conditions. (English) Zbl 1307.90196
Summary: This paper considers average optimality for a continuous-time Markov decision process with Borel state and action spaces and an arbitrarily unbounded nonnegative cost rate. The existence of a deterministic stationary optimal policy is proved under conditions that allow for the following: the controlled process can be explosive, the transition rates are weakly continuous, and the multifunction defining the admissible action spaces can be neither compact-valued nor upper semicontinuous.

MSC:
90C40 Markov and semi-Markov decision processes
60J25 Continuous-time Markov processes on general state spaces
Full Text: Euclid
References:
[1] Berberian, S. K. (1999). Fundamentals of Real Analysis . Springer, New York. · Zbl 0914.26001
[2] Bertsekas, D. P. and Shreve, S. E. (1978). Stochastic Optimal Control . Academic Press, New York. · Zbl 0471.93002
[3] Cavazos-Cadena, R. (1991). A counterexample on the optimality equation in Markov decision chains with the average cost criterion. Syst. Control Lett. 16 , 387-392. · Zbl 0738.90082
[4] Cavazos-Cadena, R. and Salem-Silva, F. (2010). The discounted method and equivalence of average criteria for risk-sensitive Markov decision processes on Borel spaces. Appl. Math. Optimization 61 , 167-190. · Zbl 1196.60127
[5] Costa, O. L. V. and Dufour, F. (2012). Average control of Markov decision processes with Feller transition probabilities and general action spaces. J. Math. Anal. Appl. 396 , 58-69. · Zbl 1275.90123
[6] Feinberg, E. A. (2012). Reduction of discounted continuous-time MDPs with unbounded jump and reward rates to discrete-time total-reward MDPs. In Optimization, Control, and Applications of Stochastic Systems , Birkhäuser, New York, pp. 77-97. · Zbl 1374.90402
[7] Feinberg, E. A. and Lewis, M. E. (2007). Optimality inequalities for average cost Markov decision processes and the stochastic cash balance problem. Math. Operat. Res. 32 , 769-783. · Zbl 1341.90142
[8] Feinberg, E. A., Kasyanov, P. O. and Zadoianchuk, N. V. (2012). Average cost Markov decision processes with weakly continuous transition probabilities. Math. Operat. Res. 37 , 591-607. · Zbl 1297.90173
[9] Feinberg, E. A., Kasyanov, P. O. and Zadoianchuk, N. V. (2013). Berge’s theorem for noncompact image sets. J. Math. Anal. Appl. 397 , 255-259. · Zbl 1252.49022
[10] Feinberg, E. A., Kasyanov, P. O. and Zadoianchuk, N. V. (2013). Fatou’s lemma for weakly converging probabilities. Preprint, Department of Applied Mathematics and Statistics, State University of New York at Stony Brook. Available at http://arxiv.org/abs/1206.4073v2. · Zbl 1328.60009
[11] Feinberg, E. A., Mandava, M. and Shiryaev, A. N. (2014). On solutions of Kolmogorov’s equations for nonhomogeneous jump Markov processes. J. Math. Anal. Appl. 411 , 261-270. · Zbl 1328.60192
[12] Gīhman, Ī. Ī. and Skorohod, A. V. (1975). The Theory of Stochastic Processes. II . Springer, New York.
[13] Guo, X. (2007). Continuous-time Markov decision processes with discounted rewards: the case of Polish spaces. Math. Operat. Res. 32 , 73-87. · Zbl 1278.90426
[14] Guo, X. and Hernández-Lerma, O. (2003). Drift and monotonicity conditions for continuous-time controlled Markov chains with an average criterion. IEEE Trans. Automatic Control 48 , 236-245. · Zbl 1364.90346
[15] Guo, X. and Hernández-Lerma, O. (2009). Continuous-Time Markov Decision Processes: Theory and Applications . Springer, Berlin. · Zbl 1209.90002
[16] Guo, X. and Liu, K. (2001). A note on optimality conditions for continuous-time Markov decision processes with average cost criterion. IEEE Trans. Automatic Control 46 , 1984-1989. · Zbl 1017.90120
[17] Guo, X. and Rieder, U. (2006). Average optimality for continuous-time Markov decision processes in Polish spaces. Ann. Appl. Prob. 16 , 730-756. · Zbl 1160.90010
[18] Guo, X. and Ye, L. (2010). New discount and average optimality conditions for continuous-time Markov decision processes. Adv. Appl. Prob. 42 , 953-985. · Zbl 1225.90152
[19] Guo, X. and Zhang, Y. (2013). Generalized discounted continuous-time Markov decision processes. Preprint. Available at http://arxiv.org/abs/1304.3314.
[20] Guo, X., Hernández-Lerma, O. and Prieto-Rumeau, T. (2006). A survey of recent results on continuous-time Markov decision processes. Top 14 , 177-261. · Zbl 1278.90427
[21] Guo, X., Huang, Y. and Song, X. (2012). Linear programming and constrained average optimality for general continuous-time Markov decision processes in history-dependent policies. SIAM J. Control Optimization 50 , 23-47. · Zbl 1250.90108
[22] Hernández-Lerma, O. and Lasserre, J. B. (1996). Discrete-Time Markov Control Processes . Springer, New York. · Zbl 0928.93002
[23] Hernández-Lerma, O. and Lasserre, J. B. (2000). Fatou’s lemma and Lebesgue’s convergence theorem for measures. J. Appl. Math. Stoch. Anal. 13 , 137-146. · Zbl 0961.28002
[24] Jaśkiewicz, A. (2009). Zero-sum ergodic semi-Markov games with weakly continuous transition probabilities. J. Optimization Theory Appl. 141 , 321-347. · Zbl 1169.91012
[25] Jaśkiewicz, A. and Nowak, A. S. (2006). On the optimality equation for average cost Markov control processes with Feller transition probabilities. J. Math. Anal. Appl. 316 , 495-509. · Zbl 1148.90015
[26] Jaśkiewicz, A. and Nowak, A. S. (2006). Optimality in Feller semi-Markov control processes. Operat. Res. Lett. 34 , 713-718. · Zbl 1112.90091
[27] Kitaev, M. Yu. and Rykov, V. V. (1995). Controlled Queueing Systems . CRC, Boca Raton, FL. · Zbl 0876.60077
[28] Kitayev, M. Yu. (1986). Semi-Markov and jump Markov controlled models: average cost criterion. Theory Prob. Appl. 30 , 272-288. · Zbl 0586.90093
[29] Kuznetsov, S. E. (1981). Any Markov process in a Borel space has a transition function. Theory Prob. Appl. 25 , 384-388. · Zbl 0456.60077
[30] Piunovskiy, A. and Zhang, Y. (2011). Discounted continuous-time Markov decision processes with unbounded rates: the convex analytic approach. SIAM J. Control Optimization 49 , 2032-2061. · Zbl 1242.90283
[31] Piunovskiy, A. and Zhang, Y. (2012). The transformation method for continuous-time Markov decision processes. J. Optimization Theory Appl. 154 , 691-712. · Zbl 1256.90048
[32] Prieto-Rumeau, T. and Hernández-Lerma, O. (2012). Selected Topics on Continuous-Time Controlled Markov Chains and Markov Games . Imperial College Press, London. · Zbl 1269.60004
[33] Puterman, M. L. (1994). Markov Decision Processes: Discrete Stochastic Dynamic Programming. John Wiley, New York. · Zbl 0829.90134
[34] Zhu, Q. (2008). Average optimality for continuous-time Markov decision processes with a policy iteration approach. J. Math. Anal. Appl. 339 , 691-704. · Zbl 1156.90023