Gordienko, Evgueni; Yushkevich, Alexander Stability estimates in the problem of average optimal switching of a Markov chain. (English) Zbl 1116.90401 Math. Methods Oper. Res. 57, No. 3, 345-365 (2003). Summary: We consider a switching model for a Markov chain \(x_t\) with a transition probability \(p(x|B)\). The goal of a controller is to maximize the average gain by selecting a sequence of stopping times, in which the controller gets rewards and pays costs (depending on \(x_t\)) in an alternating order. We suppose that the exact transition probability function of the original “real” chain\(x_t\) is not available to the controller, and he/she is forced to rely on a given approximation \(\widetilde p\) to the unknown \(p\). The controller finds a switching policy \(\bar\pi\) optimal for the Markov chain with the transition probability \(\widetilde p\), with a view to apply to the original Markov chain \(x_t\). Under certain restrictions on \(p\) we give an upper bound for the difference between the maximal gain attainable in switching of \(x_t\), and the gain made under the policy in the original model. The bound is expressed in terms of the total variation distance \(\sup_x\text{Var}\;(p(x|\cdot)\widetilde p(x|\cdot))\). Cited in 8 Documents MSC: 90C40 Markov and semi-Markov decision processes 60J05 Discrete-time Markov processes on general state spaces Keywords:Positive Harris recurrent chain; average reward; multiple stopping; stability index; total variation metric PDF BibTeX XML Cite \textit{E. Gordienko} and \textit{A. Yushkevich}, Math. Methods Oper. Res. 57, No. 3, 345--365 (2003; Zbl 1116.90401) Full Text: DOI