Markov decision processes with exponentially representable discounting. (English) Zbl 1154.90610

Summary: We generalize the geometric discount of finite discounted-cost Markov decision processes to “exponentially representable” discount functions, prove the existence of optimal policies that are stationary from some time $$N$$ onward, and provide an algorithm for their computation. Outside this class, optimal “$$N$$-stationary” policies do not exist in general.
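To make the role of exponential representability concrete, here is a minimal sketch. It assumes a definition the summary does not spell out: that such a discount function is a weighted sum of geometric ones, $$\gamma(t) = \sum_k w_k \beta_k^t$$. Under that assumption, the cost of a fixed stationary policy is the same weighted sum of ordinary geometrically discounted costs. The two-state chain, the costs, the weights, and the helper name `policy_cost` are all hypothetical, and this is not the algorithm of the paper.

```python
def policy_cost(P, c, discount, horizon):
    """Truncated cost sum_{t < horizon} discount(t) * E[c(x_t)], per start state.

    P is the transition matrix induced by a fixed stationary policy,
    c the per-state cost, discount any function of the time index t.
    """
    n = len(c)
    # dist[i][j] = probability of being in state j at time t, starting from i
    dist = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    total = [0.0] * n
    for t in range(horizon):
        w = discount(t)
        for i in range(n):
            total[i] += w * sum(dist[i][j] * c[j] for j in range(n))
        # advance the state distribution by one step
        dist = [[sum(dist[i][k] * P[k][j] for k in range(n)) for j in range(n)]
                for i in range(n)]
    return total


# Hypothetical two-state chain and per-state costs (illustration only).
P = [[0.9, 0.1], [0.5, 0.5]]
c = [1.0, 3.0]
HORIZON = 500

# Assumed form of the discount: gamma(t) = 0.7 * 0.9**t + 0.3 * 0.5**t
terms = [(0.7, 0.9), (0.3, 0.5)]
mixed = policy_cost(P, c, lambda t: sum(w * b ** t for w, b in terms), HORIZON)

# The same cost assembled from ordinary geometric discounts, term by term
parts = [policy_cost(P, c, lambda t, b=b: b ** t, HORIZON) for _, b in terms]
combined = [sum(w * part[i] for (w, _), part in zip(terms, parts))
            for i in range(len(c))]
```

Here `mixed` and `combined` agree up to floating-point rounding, because the cost is linear in the discount function. This only evaluates a given policy; finding an optimal one is harder, since the policy that is best for one geometric term need not be best for another.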

MSC:

90C40 Markov and semi-Markov decision processes
60K20 Applications of Markov renewal processes (reliability, queueing networks, etc.)
