zbMATH — the first resource for mathematics

Estimates of stability of Markov control processes with unbounded costs. (English) Zbl 1249.93176
Summary: For a discrete-time Markov control process with transition probability \(p\), we compare the total discounted costs \(V_{\beta}(\pi_{\beta})\) and \(V_{\beta}(\tilde{\pi}_{\beta})\) obtained by applying the optimal control policy \(\pi_{\beta}\) and its approximation \(\tilde{\pi}_{\beta}\), respectively. The policy \(\tilde{\pi}_{\beta}\) is optimal for an approximating process with transition probability \(\tilde{p}\). The cost per stage of the processes considered may be unbounded. Under certain ergodicity assumptions we establish an upper bound for the relative stability index \([V_{\beta}(\tilde{\pi}_{\beta})-V_{\beta}(\pi_{\beta})]/V_{\beta}(\pi_{\beta})\). This bound does not depend on the discount factor \(\beta \in (0,1)\) and is given in terms of the total variation distance between \(p\) and \(\tilde{p}\).
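
The quantities named in the summary can be illustrated on a toy model. The following is a minimal sketch, not the paper's method: it assumes a finite state and action space with bounded positive costs (the paper treats unbounded costs), and all names and numbers (`P`, `P_tilde`, `c`, `optimal_policy`, and so on) are hypothetical. The total variation distance is taken here as the largest row-wise L1 difference between \(p\) and \(\tilde{p}\), without a factor 1/2; that normalization is also an assumption.

```python
def q_values(P, c, beta, V):
    """One-step lookahead costs Q[x][a] = c(x,a) + beta * sum_y p(y|x,a) V(y)."""
    n, m = len(c), len(c[0])
    return [[c[x][a] + beta * sum(P[a][x][y] * V[y] for y in range(n))
             for a in range(m)] for x in range(n)]


def optimal_policy(P, c, beta, tol=1e-10, max_iter=100_000):
    """Value iteration; returns a stationary policy that is greedy for the limit."""
    V = [0.0] * len(c)
    for _ in range(max_iter):
        V_new = [min(row) for row in q_values(P, c, beta, V)]
        gap = max(abs(u - v) for u, v in zip(V_new, V))
        V = V_new
        if gap < tol:
            break
    Q = q_values(P, c, beta, V)
    return [min(range(len(row)), key=row.__getitem__) for row in Q]


def evaluate_policy(P, c, beta, policy, tol=1e-12, max_iter=100_000):
    """Total discounted cost of a fixed stationary policy under kernel P."""
    n = len(c)
    V = [0.0] * n
    for _ in range(max_iter):
        V_new = [c[x][policy[x]]
                 + beta * sum(P[policy[x]][x][y] * V[y] for y in range(n))
                 for x in range(n)]
        gap = max(abs(u - v) for u, v in zip(V_new, V))
        V = V_new
        if gap < tol:
            break
    return V


def total_variation(P, P_tilde):
    """Largest L1 distance between p(.|x,a) and p~(.|x,a) over all (x, a)."""
    return max(sum(abs(p - q) for p, q in zip(row, row_t))
               for Pa, Pa_t in zip(P, P_tilde)
               for row, row_t in zip(Pa, Pa_t))


def relative_stability_index(P, P_tilde, c, beta):
    """[V(pi~) - V(pi)] / V(pi) per initial state, both costs evaluated under P."""
    pi = optimal_policy(P, c, beta)              # optimal for the true process
    pi_tilde = optimal_policy(P_tilde, c, beta)  # optimal for the approximation
    V_opt = evaluate_policy(P, c, beta, pi)
    V_apx = evaluate_policy(P, c, beta, pi_tilde)
    return [(va - vo) / vo for va, vo in zip(V_apx, V_opt)]


# Hypothetical data: P[a][x][y] is the probability of moving x -> y under action a.
P = [
    [[0.7, 0.2, 0.1], [0.3, 0.5, 0.2], [0.1, 0.3, 0.6]],
    [[0.9, 0.1, 0.0], [0.6, 0.3, 0.1], [0.4, 0.4, 0.2]],
]
# Approximating kernel: action 0 unchanged, action 1 perturbed.
P_tilde = [
    [[0.7, 0.2, 0.1], [0.3, 0.5, 0.2], [0.1, 0.3, 0.6]],
    [[0.5, 0.3, 0.2], [0.2, 0.4, 0.4], [0.1, 0.3, 0.6]],
]
# Strictly positive one-stage costs c[x][a], so the index is well defined.
c = [[1.0, 2.0], [2.0, 1.5], [4.0, 3.0]]

for beta in (0.5, 0.9, 0.99):
    print(beta, total_variation(P, P_tilde),
          relative_stability_index(P, P_tilde, c, beta))
```

Running the loop for several discount factors shows how the index behaves as \(\beta\) varies while the distance between the kernels stays fixed. It is only a numerical illustration of the definitions and does not reproduce the bound established in the paper.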

93E20 Optimal stochastic control
93C55 Discrete-time control/observation systems