Average optimality in dynamic programming on Borel spaces – unbounded costs and controls. (English) Zbl 0771.90098

The author considers a discrete-time Markovian decision process with Borel state space, Borel action space, and unbounded rewards. He proves the existence of an average-optimal policy within a setting allowing the sets of admissible actions to be noncompact.

MSC:

90C39 Dynamic programming
