Markov decision processes with time-varying discount factors and random horizon. (English) Zbl 1413.93115

Summary: This paper studies Markov decision processes in which the optimal control problem is to minimize the expected total discounted cost under a non-constant discount factor. The discount factor is time-varying and may depend on the state and the action. In addition, the horizon of the optimization problem is a discrete random variable, that is, a random horizon is assumed. Under general conditions on the Markov control model, the dynamic programming approach yields an optimality equation in both cases: finite support and infinite support of the random horizon. The results are illustrated by two examples, one of them related to optimal replacement.
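
As an illustration of the finite-support case, the following is a minimal sketch of backward induction for a finite model. It is not taken from the paper. It assumes finitely many states and actions, a horizon tau that is independent of the controlled process, and a discount factor that depends only on the state and the action. Under these assumptions the recursion weights the continuation value by P(tau >= t+1) / P(tau >= t). All names (`solve_random_horizon_mdp`, `cost`, `discount`, `trans`, `horizon_pmf`) are hypothetical.

```python
def solve_random_horizon_mdp(cost, discount, trans, horizon_pmf):
    """Backward induction for a finite MDP with a random horizon (sketch).

    cost[x][a]      one-stage cost of action a in state x
    discount[x][a]  state-action dependent discount factor
    trans[x][a][y]  probability of moving from x to y under a
    horizon_pmf[t]  P(tau = t) for t = 0..N (finite support, assumed
                    independent of the state-action process)
    Returns (value, policy): value[x] is the optimal expected total
    discounted cost from state x at time 0, policy[t][x] the minimizing
    action at stage t.
    """
    n_states = len(cost)
    last_stage = len(horizon_pmf) - 1
    # survival[t] = P(tau >= t); survival[last_stage + 1] = 0
    survival = [sum(horizon_pmf[t:]) for t in range(last_stage + 2)]

    value = [0.0] * n_states  # terminal condition: nothing is paid after N
    policy = []
    for t in range(last_stage, -1, -1):
        # conditional probability that the process continues past stage t
        cont = survival[t + 1] / survival[t] if survival[t] > 0 else 0.0
        new_value, decision = [], []
        for x in range(n_states):
            q = [
                cost[x][a]
                + discount[x][a] * cont
                * sum(p * v for p, v in zip(trans[x][a], value))
                for a in range(len(cost[x]))
            ]
            best = min(range(len(q)), key=q.__getitem__)
            new_value.append(q[best])
            decision.append(best)
        value = new_value
        policy.insert(0, decision)
    return value, policy


# One state, one action, unit cost, discount 0.5, horizon fixed at tau = 2:
# the total cost is 1 + 0.5 + 0.25.
value, policy = solve_random_horizon_mdp(
    cost=[[1.0]], discount=[[0.5]], trans=[[[1.0]]], horizon_pmf=[0.0, 0.0, 1.0]
)
```

A discount factor that also varies with time could be handled in the same sketch by indexing `discount` with the stage `t`. The infinite-support case would need a different treatment, for example truncation or a fixed-point argument.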


93E20 Optimal stochastic control
90C40 Markov and semi-Markov decision processes
90C39 Dynamic programming