First passage problems for nonstationary discrete-time stochastic control systems. (English) Zbl 1291.93328

Summary: This paper concerns first passage problems for nonstationary nonlinear discrete-time stochastic control systems. The state and control spaces are general Borel spaces, the transition probabilities are nonstationary, and the costs/rewards are time-varying and may be unbounded. The optimal control problem is to minimize the expected discounted cost incurred up to the first passage time to a given target set, where the discount factors may depend on both time and state. Under reasonably mild conditions, we establish the so-called first passage optimality equations and prove that the optimal cost/reward functions satisfy them. Furthermore, we use the optimality equations to show the existence of optimal Markov policies.
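
To make the structure of such an optimality equation concrete, the following is a minimal sketch and not the paper's construction. It assumes a finite state and action set and a truncated horizon T with terminal value zero, whereas the paper treats general Borel spaces and costs accumulated until the first passage time. Under these assumptions the recursion reads: V_t(x) = 0 for x in the target set, and otherwise V_t(x) = min over a of [ c_t(x, a) + alpha_t(x) * sum over y of P_t(y | x, a) * V_{t+1}(y) ]. All names (`first_passage_backward_induction`, `cost`, `discount`, `trans`) and the toy data are hypothetical.

```python
from typing import Callable, List, Optional, Set, Tuple


def first_passage_backward_induction(
    n_states: int,
    n_actions: int,
    horizon: int,
    target: Set[int],
    cost: Callable[[int, int, int], float],         # c_t(x, a), time-varying
    discount: Callable[[int, int], float],          # alpha_t(x), time- and state-dependent
    trans: Callable[[int, int, int], List[float]],  # P_t(. | x, a), nonstationary
) -> Tuple[List[List[float]], List[List[Optional[int]]]]:
    """Backward induction for a finite, horizon-truncated first passage problem.

    Assumption (not from the paper): costs after the truncation time `horizon`
    are ignored, so V[horizon][x] = 0 for every state x.
    """
    V = [[0.0] * n_states for _ in range(horizon + 1)]
    policy: List[List[Optional[int]]] = [[None] * n_states for _ in range(horizon)]
    for t in range(horizon - 1, -1, -1):
        for x in range(n_states):
            if x in target:
                continue  # no further cost once the target set is reached
            best_val, best_a = float("inf"), None
            for a in range(n_actions):
                p = trans(t, x, a)
                cont = sum(p[y] * V[t + 1][y] for y in range(n_states))
                q = cost(t, x, a) + discount(t, x) * cont
                if q < best_val:
                    best_val, best_a = q, a
            V[t][x] = best_val
            policy[t][x] = best_a  # a Markov policy: depends on (t, x) only
    return V, policy


# Toy data (hypothetical): states 0, 1, 2 with target set {2}.
# Action 0 steps one state forward at a cost that grows with time;
# action 1 jumps straight to the target at a fixed cost.
def toy_cost(t: int, x: int, a: int) -> float:
    return 1.0 + t if a == 0 else 1.8


def toy_discount(t: int, x: int) -> float:
    return (0.5 + 0.1 * x) / (1 + t)


def toy_trans(t: int, x: int, a: int) -> List[float]:
    nxt = min(x + 1, 2) if a == 0 else 2
    return [1.0 if y == nxt else 0.0 for y in range(3)]


V, policy = first_passage_backward_induction(
    n_states=3, n_actions=2, horizon=2, target={2},
    cost=toy_cost, discount=toy_discount, trans=toy_trans,
)
print(V[0], policy[0])
```

In this toy instance the minimizing action differs across states and times, which is why the resulting policy is indexed by both t and x rather than by the state alone.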


93E20 Optimal stochastic control
93C55 Discrete-time control/observation systems
93C10 Nonlinear systems in control theory

