zbMATH — the first resource for mathematics

Average optimality for risk-sensitive control with general state space. (English) Zbl 1128.93056
A discrete-time Markov control process on a general state space is considered. The aim of the paper is to establish the optimality inequality for risk-sensitive dynamic programming and to derive an optimal stationary policy. A similar result was obtained by Hernández-Hernández and Marcus under the assumption that there exists a stationary policy inducing a finite average cost that is equal to some constant in each state. Here, instead of this assumption, the author assumes that a certain family of functions is bounded, which makes the process reach “good states” sufficiently fast.
For related papers see: [D. Hernández-Hernández and S. I. Marcus, Appl. Math. Optim. 40, 273–285 (1999; Zbl 0937.90115)].
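For orientation, the objects named in the review can be written in the form commonly used in the risk-sensitive literature. This is a sketch under assumed notation, not the paper's own statement: the symbols \(\gamma > 0\) (risk factor), \(c\) (one-stage cost), \(q\) (transition kernel), \(A(x)\) (admissible actions), \(\lambda\) (a constant) and \(h\) (a function on the state space \(X\)) are introduced here for illustration only. The risk-sensitive average cost of a policy \(\pi\) started at \(x\) is typically defined as
\[
J(x,\pi) \;=\; \limsup_{n\to\infty} \frac{1}{n\gamma}\,\log E_x^{\pi}\!\left[\exp\!\Big(\gamma \sum_{k=0}^{n-1} c(x_k,a_k)\Big)\right],
\]
and an optimality inequality of the kind referred to above usually reads
\[
\lambda + h(x) \;\ge\; \min_{a\in A(x)} \left[\, c(x,a) + \frac{1}{\gamma}\,\log \int_X e^{\gamma h(y)}\, q(dy\mid x,a) \right], \qquad x\in X.
\]
Under suitable conditions, a stationary policy selecting a minimizer on the right-hand side in each state then has risk-sensitive average cost at most \(\lambda\).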

MSC:
93E20 Optimal stochastic control
60J05 Discrete-time Markov processes on general state spaces
91A15 Stochastic games, stochastic differential games
References:
[1] Balaji, S. and Meyn, S. P. (2000). Multiplicative ergodicity and large deviations for an irreducible Markov chain. Stochastic Process. Appl. 90 123–144. · Zbl 1046.60065 · doi:10.1016/S0304-4149(00)00032-6
[2] Berge, C. (1963). Topological Spaces. MacMillan, New York. · Zbl 0114.38602
[3] Bielecki, T., Hernández-Hernández, D. and Pliska, S. (1999). Risk-sensitive control of finite state Markov chains in discrete time, with applications to portfolio management. Math. Methods Oper. Res. 50 167–188. · Zbl 0959.91029 · doi:10.1007/s001860050094
[4] Bielecki, T. and Pliska, S. (1999). Risk-sensitive dynamic asset management. Appl. Math. Optim. 39 337–360. · Zbl 0984.91047 · doi:10.1007/s002459900110
[5] Borkar, V. S. and Meyn, S. P. (2002). Risk-sensitive optimal control for Markov decision processes with monotone cost. Math. Oper. Res. 27 192–209. · Zbl 1082.90577 · doi:10.1287/moor.27.1.192.334
[6] Brown, L. D. and Purves, R. (1973). Measurable selections of extrema. Ann. Statist. 1 902–912. · Zbl 0265.28003 · doi:10.1214/aos/1176342510
[7] Cavazos-Cadena, R. (1991). A counterexample on the optimality equation in Markov decision chains with the average cost criterion. Systems Control Lett. 16 387–392. · Zbl 0738.90082 · doi:10.1016/0167-6911(91)90060-R
[8] Cavazos-Cadena, R. and Fernández-Gaucherand, E. (1999). Controlled Markov chains with risk-sensitive criteria: Average cost, optimal equations and optimal solutions. Math. Methods Oper. Res. 49 299–324. · Zbl 0953.93077
[9] Dai Pra, P., Meneghini, L. and Runggaldier, W. J. (1996). Some connections between stochastic control and dynamic games. Math. Control Signals Systems 9 303–326. · Zbl 0874.93096 · doi:10.1007/BF01211853
[10] Di Masi, G. B. and Stettner, Ł. (2000). Risk-sensitive control of discrete-time Markov processes with infinite horizon. SIAM J. Control Optim. 38 61–78. · Zbl 0946.93043 · doi:10.1137/S0363012997320614
[11] Di Masi, G. B. and Stettner, Ł. (2000). Infinite horizon risk sensitive control of discrete time Markov processes with small risk. Systems Control Lett. 40 15–20. · Zbl 0977.93083 · doi:10.1016/S0167-6911(99)00118-8
[12] Dupuis, P. and Ellis, R. S. (1997). A Weak Convergence Approach to the Theory of Large Deviations. Wiley, New York. · Zbl 0904.60001
[13] Filar, J. and Vrieze, K. (1997). Competitive Markov Decision Processes. Springer, New York. · Zbl 0934.91002
[14] Fleming, W. H. and Hernández-Hernández, D. (1997). Risk-sensitive control of finite state machines on an infinite horizon. SIAM J. Control Optim. 35 1790–1810. · Zbl 0891.93085 · doi:10.1137/S0363012995291622
[15] Hernández-Hernández, D. and Marcus, S. I. (1996). Risk sensitive control of Markov processes in countable state space. Systems Control Lett. 29 147–155. [Corrigendum (1998) Systems Control Lett. 34 105–106.] · Zbl 0866.93101 · doi:10.1016/S0167-6911(96)00051-5
[16] Hernández-Hernández, D. and Marcus, S. I. (1999). Existence of risk-sensitive optimal stationary policies for controlled Markov processes. Appl. Math. Optim. 40 273–285. · Zbl 0937.90115 · doi:10.1007/s002459900126
[17] Hernández-Lerma, O. and Lasserre, J. B. (1993). Discrete-Time Markov Control Process: Basic Optimality Criteria. Springer, New York. · Zbl 0781.90093
[18] Howard, R. A. and Matheson, J. E. (1972). Risk-sensitive Markov decision processes. Management Sci. 18 356–369. · Zbl 0238.90007 · doi:10.1287/mnsc.18.7.356
[19] Jacobson, D. H. (1973). Optimal stochastic linear systems with exponential performance criteria and their relation to deterministic differential games. IEEE Trans. Automat. Control 18 124–131. · Zbl 0274.93067 · doi:10.1109/TAC.1973.1100265
[20] Jaśkiewicz, A. (2006). A note on risk-sensitive control of invariant models. Technical Report, Wrocław University of Technology. · Zbl 1120.49020
[21] Jaśkiewicz, A. and Nowak, A. S. (2006). On the optimality equation for average cost Markov control processes with Feller transition probabilities. J. Math. Anal. Appl. 316 495–509. · Zbl 1148.90015 · doi:10.1016/j.jmaa.2005.04.065
[22] Jaśkiewicz, A. and Nowak, A. S. (2006). Zero-sum ergodic stochastic games with Feller transition probabilities. SIAM J. Control Optim. 45 773–789. · Zbl 1140.91027 · doi:10.1137/S0363012904443257
[23] Klein, E. and Thompson, A. C. (1984). Theory of Correspondences. Wiley, New York. · Zbl 0556.28012
[24] Neveu, J. (1965). Mathematical Foundations of the Calculus of Probability. Holden-Day, San Francisco, CA. · Zbl 0137.11301
[25] Royden, H. L. (1968). Real Analysis. MacMillan, New York. · Zbl 0121.05501
[26] Schäl, M. (1975). Conditions for optimality in dynamic programming and for the limit \(n\)-stage optimal policies to be optimal. Z. Wahrsch. Verw. Gebiete 32 179–196. · Zbl 0316.90080 · doi:10.1007/BF00532612
[27] Schäl, M. (1993). Average optimality in dynamic programming with general state space. Math. Oper. Res. 18 163–172. · Zbl 0777.90079 · doi:10.1287/moor.18.1.163
[28] Sennott, L. I. (1999). Stochastic Dynamic Programming and the Control of Queueing Systems. Wiley, New York. · Zbl 0997.93503
[29] Serfozo, R. (1982). Convergence of Lebesgue integrals with varying measures. Sankhyā Ser. A 44 380–402. · Zbl 0568.28005
[30] Stettner, Ł. (1999). Risk sensitive portfolio optimization. Math. Methods Oper. Res. 50 463–474. · Zbl 0949.93077 · doi:10.1007/s001860050081
[31] Whittle, P. (1990). Risk-Sensitive Optimal Control. Wiley, Chichester. · Zbl 0718.93068
This reference list is based on information provided by the publisher or from digital mathematics libraries. Its items are heuristically matched to zbMATH identifiers and may contain data conversion errors. It attempts to reflect the references listed in the original paper as accurately as possible without claiming the completeness or perfect precision of the matching.