zbMATH — the first resource for mathematics

Optimality equations and inequalities in a class of risk-sensitive average cost Markov decision chains. (English) Zbl 1189.93144
Summary: This note concerns controlled Markov chains on a denumerable state space. The performance of a control policy is measured by the risk-sensitive average criterion, and it is assumed that (a) the simultaneous Doeblin condition holds, and (b) the system is communicating under the action of each stationary policy. If the cost function is bounded below, it is established that the optimal average cost is characterized by an optimality inequality, and it is shown that, even for bounded costs, such an inequality may be strict at every state. Also, for a nonnegative cost function with compact support, the existence and uniqueness of bounded solutions of the optimality equation are proved, and an example is provided to show that such a conclusion generally fails when the cost is negative at some state.
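For orientation, the objects named in the summary can be written in the form commonly used in the risk-sensitive literature. This is only a sketch: the symbols below (risk-sensitivity coefficient \lambda > 0, cost function C, transition law p_{xy}(a), admissible action sets A(x), constant g, and function h) are assumed notation and are not taken from this record.

```latex
% Risk-sensitive average cost of policy \pi from initial state x (assumed notation)
J(x,\pi) = \limsup_{n\to\infty} \frac{1}{\lambda n}
  \log E_x^{\pi}\!\left[ \exp\!\left( \lambda \sum_{t=0}^{n-1} C(X_t, A_t) \right) \right]

% Optimality equation for a pair (g, h)
e^{\lambda (g + h(x))} = \inf_{a \in A(x)}
  \left[ e^{\lambda C(x,a)} \sum_{y} p_{xy}(a)\, e^{\lambda h(y)} \right]

% Optimality inequality: the same relation with "=" weakened to "\geq"
e^{\lambda (g + h(x))} \geq \inf_{a \in A(x)}
  \left[ e^{\lambda C(x,a)} \sum_{y} p_{xy}(a)\, e^{\lambda h(y)} \right]
```

In this notation, the summary's statement that the inequality "may be strict at every state" means that the last relation can hold with ">" for every x.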

93E20 Optimal stochastic control
60J05 Discrete-time Markov processes on general state spaces
93C55 Discrete-time control/observation systems
49K45 Optimality conditions for problems involving randomness
[1] Arapostathis A, Borkar VS, Fernández-Gaucherand E, Ghosh MK, Marcus SI (1993) Discrete-time controlled Markov processes with average cost criteria: a survey. SIAM J Control Optim 31: 282–334 · Zbl 0770.93064 · doi:10.1137/0331018
[2] Borkar VS, Meyn SP (2002) Risk-sensitive optimal control for Markov decision process with monotone cost. Math Oper Res 27: 192–209 · Zbl 1082.90577 · doi:10.1287/moor.
[3] Cavazos-Cadena R (1988) Necessary and sufficient conditions for a bounded solution to the optimality equation in average reward Markov decision chains. Syst Control Lett 10: 71–78 · Zbl 0645.90099 · doi:10.1016/0167-6911(88)90043-6
[4] Cavazos-Cadena R, Fernández-Gaucherand E (1999) Controlled Markov chains with risk-sensitive criteria: average cost, optimality equations and optimal solutions. Math Methods Oper Res 43: 121–139 · Zbl 0953.93077
[5] Cavazos-Cadena R, Fernández-Gaucherand E (2002) Risk-sensitive control in communicating average Markov decision chains. In: Dror M, L’Ecuyer P, Szidarovsky F (eds) Modelling uncertainty: an examination of stochastic theory, methods and applications. Kluwer, Boston, pp 525–544
[6] Cavazos-Cadena R, Hernández-Hernández D (2004) A characterization of exponential functionals in finite Markov chains. Math Methods Oper Res 60: 399–414 · Zbl 1072.60053 · doi:10.1007/s001860400373
[7] Di Masi GB, Stettner L (2000) Infinite horizon risk sensitive control of discrete time Markov processes with small risk. Syst Control Lett 40: 305–321 · Zbl 0977.93083 · doi:10.1016/S0167-6911(99)00118-8
[8] Di Masi GB, Stettner L (2007) Infinite horizon risk sensitive control of discrete time Markov processes under minorization property. SIAM J Control Optim 46: 231–252 · Zbl 1141.93067 · doi:10.1137/040618631
[9] Fleming WH, McEneaney WM (1995) Risk-sensitive control on an infinite horizon. SIAM J Control Optim 33: 1881–1915 · Zbl 0949.93079 · doi:10.1137/S0363012993258720
[10] Hernández-Hernández D, Marcus SI (1996) Risk-sensitive control of Markov processes in countable state space. Syst Control Lett 29: 147–155 · Zbl 0866.93101 · doi:10.1016/S0167-6911(96)00051-5
[11] Hernández-Hernández D, Marcus SI (1999) Existence of risk-sensitive optimal stationary policies for controlled Markov processes. Appl Math Optim 40: 273–285 · Zbl 0937.90115 · doi:10.1007/s002459900126
[12] Hernández-Lerma O (1988) Adaptive Markov control processes. Springer, New York · Zbl 0646.90090
[13] Howard RA, Matheson JE (1972) Risk-sensitive Markov decision processes. Manage Sci 18: 356–369 · Zbl 0238.90007 · doi:10.1287/mnsc.18.7.356
[14] Jacobson DH (1973) Optimal stochastic linear systems with exponential performance criteria and their relation to stochastic differential games. IEEE Trans Automat Contr 18: 124–131 · Zbl 0274.93067 · doi:10.1109/TAC.1973.1100265
[15] Jaquette SC (1973) Markov decision processes with a new optimality criterion: discrete time. Ann Stat 1: 496–505 · Zbl 0259.90054 · doi:10.1214/aos/1176342415
[16] Jaquette SC (1976) A utility criterion for Markov decision processes. Manage Sci 23: 43–49 · Zbl 0337.90053 · doi:10.1287/mnsc.23.1.43
[17] Jaśkiewicz A (2007) Average optimality for risk sensitive control with general state space. Ann Appl Probab 17: 654–675 · Zbl 1128.93056 · doi:10.1214/105051606000000790
[18] Loève M (1980) Probability theory I. Springer, New York · Zbl 0447.62003
[19] Puterman ML (1994) Markov decision processes. Wiley, New York · Zbl 0829.90134
[20] Seneta E (1980) Nonnegative matrices. Springer, New York · Zbl 0537.15012
[21] Sennott L (1986) A new condition for the existence of optimum stationary policies in average cost Markov decision processes. Oper Res Lett 5: 17–23 · Zbl 0593.90083 · doi:10.1016/0167-6377(86)90095-7
[22] Sennott L (1995) Another set of conditions for average optimality in Markov control processes. Syst Control Lett 24: 147–151 · Zbl 0877.93135 · doi:10.1016/0167-6911(93)E0158-D
[23] Thomas LC (1980) Connectedness conditions for denumerable state Markov decision processes. In: Hartley R, Thomas LC, White DJ (eds) Recent advances in Markov decision processes. Academic Press, New York
This reference list is based on information provided by the publisher or from digital mathematics libraries. Its items are heuristically matched to zbMATH identifiers and may contain data conversion errors. It attempts to reflect the references listed in the original paper as accurately as possible without claiming the completeness or perfect precision of the matching.