Discounted cost optimality problem: Stability with respect to weak metrics. (English) Zbl 1166.60041
In this paper, inequalities that estimate the stability (robustness) of a discounted cost optimization problem for discrete-time Markov control processes on a Borel state space are derived. The one-stage cost is allowed to be unbounded. Unlike known results in this area, we consider perturbations of the transition probabilities measured by the Kantorovich metric, which is closely related to weak convergence. The results make it possible to estimate the rate at which the stability index vanishes when the approximation is made through empirical measures.
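
As a rough illustration of the last point, the sketch below computes the Kantorovich (Wasserstein-1) distance between an empirical measure and the law it is sampled from, in the simplest possible setting: i.i.d. samples from Uniform(0, 1) on the real line. It is not taken from the paper. The function names `kantorovich_to_uniform` and `mean_distance`, the uniform law, and the one-dimensional setting are all assumptions made for the example; the paper works on a general Borel state space, where the decay rate may differ.

```python
import random


def kantorovich_to_uniform(samples):
    """Exact Kantorovich (Wasserstein-1) distance between the empirical
    measure of `samples` and Uniform(0, 1).

    Uses the one-dimensional quantile formula
        W1 = integral over (0, 1) of |F_n^{-1}(u) - u| du,
    where F_n^{-1} is the empirical quantile function.
    (Hypothetical helper, written for this illustration only.)
    """
    xs = sorted(samples)
    n = len(xs)
    total = 0.0
    for i, c in enumerate(xs):
        # The empirical quantile function equals c on the interval (a, b].
        a, b = i / n, (i + 1) / n
        if c <= a:
            total += (b - a) * ((a + b) / 2 - c)
        elif c >= b:
            total += (b - a) * (c - (a + b) / 2)
        else:
            # c lies inside (a, b): integrate |c - u| piecewise.
            total += ((c - a) ** 2 + (b - c) ** 2) / 2
    return total


def mean_distance(n, trials=50, seed=0):
    """Monte Carlo average of the distance over `trials` samples of size n."""
    rng = random.Random(seed)
    acc = 0.0
    for _ in range(trials):
        acc += kantorovich_to_uniform([rng.random() for _ in range(n)])
    return acc / trials


if __name__ == "__main__":
    for n in (10, 100, 1000):
        print(n, mean_distance(n))
```

Running the script prints the average distance for growing sample sizes. In this one-dimensional setting the values should shrink as n grows, which is the kind of vanishing rate the stability estimates are meant to exploit.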

60J05 Discrete-time Markov processes on general state spaces
60B05 Probability measures on topological spaces
93E15 Stochastic stability in control theory
93E20 Optimal stochastic control