Average cost Markov control processes with weighted norms: Existence of canonical policies. (English) Zbl 0829.93067
Let \((X,A,Q,c)\) be a discrete-time control model with Borel state space \(X\), Borel action set \(A\), transition law \(Q\), and one-stage cost \(c\). Let \(\Delta\) be a class of control policies and \(\Delta_0\) the subclass of deterministic stationary policies \(f : X \to A\) such that \(f(x) \in A(x)\) for all \(x \in X\), where \(A(x) \subset A\) is the set of admissible actions in state \(x\). The \(n\)-stage cost \(J_n (\vartheta, x)\) for policy \(\vartheta\) and initial state \(x\) is used to define the average cost \(J(\vartheta, x) = \limsup_{n \to \infty} J_n (\vartheta, x)/n\), and \(\vartheta^*\) is average cost optimal (AOC) if and only if \(J(\vartheta^*, x) = J^*(x) = \inf_{\vartheta \in \Delta} J(\vartheta, x)\) for all \(x \in X\). The authors prove the existence of AOC policies and a corresponding result for the stronger concept of canonical policies. To this end, conditions formulated in terms of weighted norms are imposed on the cost function and the transition law, and these yield the existence of solutions to the AOC inequality and the AOC equality.
Reviewer: M.Kohlmann (Bonn)
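
For orientation, the objects named in the review can be written out in their usual textbook form. This is only a sketch under assumptions: the weight function \(w\), the constant \(\rho^*\), the function \(h\), and the notation \(E_x^{\vartheta}\), \(x_t\), \(a_t\) do not appear in the review and are introduced here for illustration, and the paper's own formulation and hypotheses may differ in detail.

```latex
% n-stage expected cost and long-run average cost (as defined in the review)
J_n(\vartheta, x) = E_x^{\vartheta}\Bigl[\sum_{t=0}^{n-1} c(x_t, a_t)\Bigr],
\qquad
J(\vartheta, x) = \limsup_{n \to \infty} \frac{J_n(\vartheta, x)}{n}.

% Weighted norm of a function u on X (assumed notation: w is a weight function, w >= 1)
\|u\|_w = \sup_{x \in X} \frac{|u(x)|}{w(x)}.

% Average cost optimality inequality (assumed notation: constant \rho^*, function h)
\rho^* + h(x) \ge \min_{a \in A(x)} \Bigl[ c(x,a) + \int_X h(y)\, Q(dy \mid x, a) \Bigr],
\qquad x \in X.

% Average cost optimality equality: the same relation with equality
\rho^* + h(x) = \min_{a \in A(x)} \Bigl[ c(x,a) + \int_X h(y)\, Q(dy \mid x, a) \Bigr],
\qquad x \in X.
```

In this standard setting, a stationary policy \(f \in \Delta_0\) that attains the minimum on the right-hand side for every \(x\) is the natural candidate for an AOC policy, with \(\rho^*\) as the optimal average cost.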

93E20 Optimal stochastic control
90C40 Markov and semi-Markov decision processes