Markov decision processes with variance minimization: A new condition and approach. (English) Zbl 1152.90646

Summary: This article deals with the limiting average variance criterion for discrete-time Markov decision processes in Borel spaces. The costs may be unbounded from both above and below. We propose a new set of conditions under which we prove the existence of a variance minimal policy in the class of average expected cost optimal stationary policies. Our conditions are weaker than those in the previous literature. Moreover, some sufficient conditions for the existence of a variance minimal policy are imposed on the primitive data of the model. In particular, the stochastic monotonicity condition in this paper is used for the first time to study the limiting average variance criterion. Also, the optimality inequality approach provided here differs from the "optimality equation approach" widely used in the previous literature. Finally, we use a controlled queueing system to illustrate our results.

MSC:

90C40 Markov and semi-Markov decision processes
93E20 Optimal stochastic control
