zbMATH — the first resource for mathematics

Average cost Markov control processes: Stability with respect to the Kantorovich metric. (English) Zbl 1176.60062
Perturbations of a discrete-time Markov control process on a general state space are studied. The size of the perturbation is measured by the Kantorovich distance. It is assumed that an average-cost optimal control policy (average per unit of time over the infinite horizon) can be found for the perturbed (supposedly known) process, and that this policy is used to control the original (unperturbed) process. The one-stage cost is not assumed to be bounded. Under Lyapunov-like conditions, the authors find upper bounds for the excess of the average cost incurred when such an approximation is used in place of the optimal (unknown) control policy. As an application of the inequalities found, approximation by the relevant empirical distributions is considered. The results are illustrated by estimating the stability of a simple autoregressive control process. Some examples of unstable processes are also provided.
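
As a rough illustration of the distance used to measure the perturbation: on the real line, the Kantorovich distance between two empirical distributions with the same number of equally weighted atoms reduces to the mean absolute difference of the sorted samples. The sketch below is not taken from the paper; the function name `kantorovich_1d` and the sample values are hypothetical and chosen only for illustration.

```python
def kantorovich_1d(xs, ys):
    """Kantorovich (Wasserstein-1) distance between two empirical
    distributions on the real line.

    Assumption of this sketch: both samples have the same size and
    every atom carries weight 1/n. In that case the optimal coupling
    pairs the i-th smallest point of xs with the i-th smallest of ys.
    """
    if len(xs) != len(ys):
        raise ValueError("this sketch assumes equal sample sizes")
    if not xs:
        raise ValueError("samples must be non-empty")
    pairs = zip(sorted(xs), sorted(ys))
    return sum(abs(a - b) for a, b in pairs) / len(xs)


# Hypothetical samples: the second is the first shifted by 1,
# so the distance equals the size of the shift.
original = [0.0, 1.0]
perturbed = [1.0, 2.0]
print(kantorovich_1d(original, perturbed))  # prints 1.0
```

In the setting of the review, one of the two samples would play the role of the empirical distribution that approximates the unknown one; the bounds obtained by the authors are stated in terms of this kind of distance.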

60J05 Discrete-time Markov processes on general state spaces
93E10 Estimation and detection in stochastic control theory
90C40 Markov and semi-Markov decision processes
93D99 Stability of control systems