zbMATH — the first resource for mathematics

Solution to the risk-sensitive average cost optimality equation in a class of Markov decision processes with finite state space. (English) Zbl 1023.90076
Summary: This work concerns discrete-time Markov decision processes with finite state space and bounded costs per stage. The decision maker ranks random costs via the expectation of the utility function associated with a constant risk-sensitivity coefficient, and the performance of a control policy is measured by the corresponding (long-run) risk-sensitive average cost criterion. The main structural restriction on the system is the following communication assumption: for every pair of states \(x\) and \(y\), there exists a policy \(\pi\), possibly depending on \(x\) and \(y\), such that when the system evolves under \(\pi\) starting at \(x\), the probability of reaching \(y\) is positive. Within this framework, the paper establishes the existence of solutions to the optimality equation whenever the constant risk-sensitivity coefficient does not exceed a certain positive value.
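
The communication assumption above can be checked mechanically on a finite model. The following is a minimal sketch, not taken from the paper, under two assumptions: every action is available in every state, and the transition law is stored as a hypothetical nested list `P[a][x][y]`. It relies on the observation that some policy reaches \(y\) from \(x\) with positive probability exactly when \(y\) can be reached from \(x\) in the directed graph that has an edge \(x \to z\) whenever at least one action moves \(x\) to \(z\) with positive probability. The names `is_communicating` and `P_example` are illustrative only.

```python
from collections import deque


def is_communicating(P):
    """Check the communication assumption for a finite MDP.

    P[a][x][y] is the probability of moving from state x to state y
    under action a (hypothetical layout; all actions are assumed to be
    available in every state).
    """
    n = len(P[0])
    # Edge x -> y exists if at least one action gives it positive probability.
    succ = [[y for y in range(n) if any(Pa[x][y] > 0 for Pa in P)]
            for x in range(n)]
    # Reversed graph: pred[y] lists the states with an edge into y.
    pred = [[x for x in range(n) if y in succ[x]] for y in range(n)]

    def reaches_all(adj):
        # Breadth-first search from state 0 over the adjacency lists.
        seen = {0}
        queue = deque([0])
        while queue:
            x = queue.popleft()
            for y in adj[x]:
                if y not in seen:
                    seen.add(y)
                    queue.append(y)
        return len(seen) == n

    # The graph is strongly connected iff state 0 reaches every state
    # and every state reaches state 0.
    return reaches_all(succ) and reaches_all(pred)


# Illustrative three-state model with two actions (made-up numbers).
P_example = [
    [[0, 1, 0], [0, 1, 0], [0, 0, 1]],  # action 0: 0->1, 1->1, 2->2
    [[1, 0, 0], [0, 0, 1], [1, 0, 0]],  # action 1: 0->0, 1->2, 2->0
]
```

In this made-up example `is_communicating(P_example)` holds because the cycle 0 -> 1 -> 2 -> 0 can be assembled by mixing the two actions, whereas `is_communicating([P_example[0]])` fails because state 1 is absorbing under action 0 alone. This illustrates why the policy \(\pi\) in the assumption is allowed to depend on \(x\) and \(y\).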

90C40 Markov and semi-Markov decision processes
93E20 Optimal stochastic control
60J05 Discrete-time Markov processes on general state spaces