Exit time risk-sensitive control for systems of cooperative agents. (English) Zbl 1422.93185

Summary: We study a sequence of many-agent exit time stochastic control problems, parameterized by the number of agents, with risk-sensitive cost structure. We identify a fully characterizing assumption under which each such control problem corresponds to a risk-neutral stochastic control problem with additive cost and, in turn, to a risk-neutral stochastic control problem on the simplex that retains only the distribution of the agents' states and discards all further information about the state of each individual agent. Under some additional assumptions, we also prove that the sequence of value functions of these stochastic control problems converges to the value function of a deterministic control problem, which can be used to design nearly optimal controls for the original problem when the number of agents is sufficiently large.

MSC:

93E20 Optimal stochastic control
91B06 Decision theory
