Hernández-Lerma, Onésimo; Muñoz de Özak, Myriam
Discrete-time Markov control processes with discounted unbounded costs: optimality criteria. (English) Zbl 0771.93054
Kybernetika 28, No. 3, 191-212 (1992).

Summary: We consider discrete-time Markov control processes with Borel state and control spaces, unbounded costs per stage, and not necessarily compact control constraint sets. The basic control problem we are concerned with is to minimize the infinite-horizon expected total discounted cost. Under easily verifiable assumptions, we provide characterizations of the optimal cost function and of optimal policies, including all previously known optimality criteria, such as Bellman’s principle of optimality and the martingale and discrepancy-function criteria. The convergence of value iteration, policy iteration and other approximation procedures is also discussed, together with criteria for asymptotic optimality.

Cited in 11 Documents

MSC:
93C55 Discrete-time control/observation systems
60J99 Markov processes
93E03 Stochastic systems in control theory (general)

Keywords: discrete-time Markov control processes; Borel state; optimal cost function; Bellman’s principle of optimality

Full Text: EuDML

References:
[1] A. Bensoussan: Stochastic control in discrete time and applications to the theory of production. Math. Programm. Study 18 (1982), 43-60.
[2] D. P. Bertsekas: Dynamic Programming: Deterministic and Stochastic Models. Prentice-Hall, Englewood Cliffs, N.J. 1987. · Zbl 0649.93001
[3] D. P. Bertsekas, S. E. Shreve: Stochastic Optimal Control: The Discrete Time Case. Academic Press, New York 1978. · Zbl 0471.93002
[4] R. N. Bhattacharya, M. Majumdar: Controlled semi-Markov models - the discounted case. J. Statist. Plann. Inference 21 (1989), 365-381. · Zbl 0673.93089 · doi:10.1016/0378-3758(89)90053-0
[5] D. Blackwell: Discounted dynamic programming. Ann. Math. Statist. 36 (1965), 226-235. · Zbl 0133.42805 · doi:10.1214/aoms/1177700285
[6] R. S. Bucy: Stability and positive supermartingales. J. Diff. Eq. 1 (1965), 151-155. · Zbl 0203.17505 · doi:10.1016/0022-0396(65)90016-1
[7] R. Cavazos-Cadena: Finite-state approximations for denumerable state discounted Markov decision processes. Appl. Math. Optim. 11 (1986), 1-26. · Zbl 0606.90132 · doi:10.1007/BF01442225
[8] M. H. A. Davis: Martingale methods in stochastic control. Lecture Notes in Control and Inform. Sci. 16 (1979), 85-117. · Zbl 0409.93052
[9] E. B. Dynkin, A. A. Yushkevich: Controlled Markov Processes. Springer-Verlag, New York 1979.
[10] O. Hernández-Lerma: Lyapunov criteria for stability of differential equations with Markov parameters. Bol. Soc. Mat. Mexicana 24 (1979), 27-48. · Zbl 0486.60051
[11] O. Hernández-Lerma: Adaptive Markov Control Processes. Springer-Verlag, New York 1989. · Zbl 0698.90053
[12] O. Hernández-Lerma, R. Cavazos-Cadena: Density estimation and adaptive control of Markov processes: average and discounted criteria. Acta Appl. Math. 20 (1990), 285-307. · Zbl 0717.93066 · doi:10.1007/BF00049572
[13] O. Hernández-Lerma, J. B. Lasserre: Average cost optimal policies for Markov control processes with Borel state space and unbounded costs. Syst. Control Lett. 15 (1990), 349-356. · Zbl 0723.93080 · doi:10.1016/0167-6911(90)90108-7
[14] O. Hernández-Lerma, J. B. Lasserre: Value iteration and rolling plans for Markov control processes with unbounded rewards. J. Math. Anal. Appl. · Zbl 0781.90093 · doi:10.1006/jmaa.1993.1242
[15] O. Hernández-Lerma, J. B. Lasserre: Error bounds for rolling horizon policies in discrete-time Markov control processes. IEEE Trans. Automat. Control 35 (1990), 1118-1124. · Zbl 0724.93087 · doi:10.1109/9.58554
[16] O. Hernández-Lerma, R. Montes de Oca, R. Cavazos-Cadena: Recurrence conditions for Markov decision processes with Borel state space: a survey. Ann. Oper. Res. 28 (1991), 29-46. · Zbl 0717.90087 · doi:10.1007/BF02055573
[17] O. Hernández-Lerma, W. Runggaldier: Monotone approximations for convex stochastic control problems (submitted for publication). · Zbl 0832.93062
[18] K. Hinderer: Foundations of Non-Stationary Dynamic Programming with Discrete Time Parameter. Springer-Verlag, Berlin - Heidelberg - New York 1970. · Zbl 0202.18401
[19] A. Hordijk, H. C. Tijms: A counterexample in discounted dynamic programming. J. Math. Anal. Appl. 39 (1972), 455-457. · Zbl 0238.49017 · doi:10.1016/0022-247X(72)90216-8
[20] H. J. Kushner: Optimal discounted stochastic control for diffusion processes. SIAM J. Control 5 (1967), 520-531. · Zbl 0178.20003 · doi:10.1137/0305032
[21] S. A. Lippman: On the set of optimal policies in discrete dynamic programming. J. Math. Anal. Appl. 24 (1968), 2, 440-445. · Zbl 0194.20602 · doi:10.1016/0022-247X(68)90042-5
[22] S. A. Lippman: On dynamic programming with unbounded rewards. Manag. Sci. 21 (1975), 1225-1233. · Zbl 0309.90017 · doi:10.1287/mnsc.21.11.1225
[23] P. Mandl: On the variance in controlled Markov chains. Kybernetika 7 (1971), 1, 1-12. · Zbl 0215.25902
[24] P. Mandl: A connection between controlled Markov chains and martingales. Kybernetika 9 (1973), 4, 237-241. · Zbl 0265.60060
[25] S. P. Meyn: Ergodic theorems for discrete time stochastic systems using a stochastic Lyapunov function. SIAM J. Control Optim. 27 (1989), 1409-1439. · Zbl 0681.60067 · doi:10.1137/0327073
[26] U. Rieder: On optimal policies and martingales in dynamic programming. J. Appl. Probab. 13 (1976), 507-518. · Zbl 0353.90091 · doi:10.2307/3212470
[27] U. Rieder: Measurable selection theorems for optimization problems. Manuscripta Math. 24 (1978), 115-131. · Zbl 0385.28005 · doi:10.1007/BF01168566
[28] M. Schäl: Estimation and control in discounted stochastic dynamic programming. Stochastics 20 (1987), 51-71. · Zbl 0621.90092 · doi:10.1080/17442508708833435
[29] J. Wessels: Markov programming by successive approximations with respect to weighted supremum norms. J. Math. Anal. Appl. 58 (1977), 326-335. · Zbl 0354.90087 · doi:10.1016/0022-247X(77)90210-4
[30] W. Whitt: Approximations of dynamic programs. I. Math. Oper. Res. 4 (1979), 179-185. · Zbl 0408.90082 · doi:10.1287/moor.4.2.179
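The value iteration whose convergence the summary mentions can be sketched concretely. The paper's setting is Borel state/control spaces with unbounded costs; the minimal sketch below instead uses a made-up finite MDP (3 states, 2 actions, invented costs and transition probabilities) purely to illustrate iterating the Bellman operator T V(x) = min_a [c(x,a) + alpha * sum_y P(y|x,a) V(y)] to its fixed point, the optimal discounted cost.

```python
import numpy as np

# Hypothetical finite discounted MDP, for illustration only:
# 3 states, 2 actions, discount factor alpha < 1.
alpha = 0.9
cost = np.array([[1.0, 2.0],          # cost[x, a] = one-stage cost
                 [0.5, 1.5],
                 [2.0, 0.1]])
P = np.array([                        # P[a, x, y] = P(y | x, a)
    [[0.7, 0.2, 0.1],
     [0.1, 0.8, 0.1],
     [0.3, 0.3, 0.4]],
    [[0.2, 0.6, 0.2],
     [0.4, 0.4, 0.2],
     [0.1, 0.1, 0.8]],
])

def value_iteration(tol=1e-10, max_iter=10_000):
    """Iterate V <- T V until the sup-norm change falls below tol."""
    V = np.zeros(cost.shape[0])
    for _ in range(max_iter):
        # Q[x, a] = c(x, a) + alpha * sum_y P(y | x, a) V(y)
        Q = cost + alpha * np.einsum('axy,y->xa', P, V)
        V_new = Q.min(axis=1)         # Bellman operator: minimize over actions
        if np.max(np.abs(V_new - V)) < tol:
            V = V_new
            break
        V = V_new
    policy = Q.argmin(axis=1)         # greedy (stationary) policy w.r.t. V
    return V, policy

V_star, pi_star = value_iteration()
print("optimal cost V*:", V_star)
print("optimal stationary policy:", pi_star)
```

Since alpha < 1 and costs are bounded here, T is an alpha-contraction in the sup norm, so the iterates converge geometrically to the unique fixed point V*; the unbounded-cost Borel case treated in the paper replaces this with weighted-norm arguments.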