Wang, Siwei; Chen, Wei The pure exploration problem with general reward functions depending on full distributions. (English) Zbl 07624273 Mach. Learn. 111, No. 9, 3279-3306 (2022). MSC: 68T05
Arora, Saurabh; Doshi, Prashant A survey of inverse reinforcement learning: challenges, methods and progress. (English) Zbl 07418628 Artif. Intell. 297, Article ID 103500, 28 p. (2021). MSC: 68Txx
Ding, Huaibao An SDN routing algorithm based on deep reinforcement learning. (Chinese. English summary) Zbl 1488.68001 J. Shanghai Norm. Univ., Nat. Sci. 50, No. 1, 128-132 (2021). MSC: 68M10 68T07
Chan, Timothy C. Y.; Fernandes, Craig; Puterman, Martin L. Points gained in football: using Markov process-based value functions to assess team performance. (English) Zbl 1472.90151 Oper. Res. 69, No. 3, 877-894 (2021). MSC: 90C40 90C90
Bayramov, V. On the mathematical expectation of the renewal-reward process. (English) Zbl 1463.60110 J. Contemp. Appl. Math. 9, No. 2, 93-98 (2019). MSC: 60K05
He, Liuliu; Yang, Yang; Li, Zheng; Zhao, Ruilian Reward of reinforcement learning of test optimization for continuous integration. (Chinese. English summary) Zbl 1438.68031 J. Softw. 30, No. 5, 1438-1449 (2019). MSC: 68N99 68T05
Grundel, Soesja; Borm, Peter; Hamers, Herbert Resource allocation problems with concave reward functions. (English) Zbl 1410.91311 Top 27, No. 1, 37-54 (2019). MSC: 91B32 91A12
Efrosinin, Dmitry; Sztrik, Janos; Farkhadov, Mais; Stepanova, Natalia Reliability analysis of an aging unit with a controllable repair facility activation. (English) Zbl 1397.62591 Pilz, Jürgen (ed.) et al., Statistics and simulation. Contributions given at the 8th international workshop on simulation, IWS 8, Vienna, Austria, September 21–25, 2015. Cham: Springer (ISBN 978-3-319-76034-6/hbk; 978-3-319-76035-3/ebook). Springer Proceedings in Mathematics & Statistics 231, 403-417 (2018). MSC: 62N05 60J28
Aliyev, Rovshan; Bayramov, Veli On the asymptotic behaviour of the covariance function of the rewards of a multivariate renewal-reward process. (English) Zbl 1377.60080 Stat. Probab. Lett. 127, 138-149 (2017). MSC: 60K05 60F05
Silvestrov, Dmitrii; Li, Yanxiong Stochastic approximation methods for American type options. (English) Zbl 1354.91170 Commun. Stat., Theory Methods 45, No. 6, 1607-1631 (2016). MSC: 91G60 60J22 60G50 62L15 65C40 91G20 60G40
Fotso, Siméon; Fono, Louis Aimé On the rationality of some crisp choice functions based on strongly complete fuzzy pre-orders. (English) Zbl 1376.91050 New Math. Nat. Comput. 11, No. 1, 103-113 (2015). MSC: 91B06 91B08
Li, Quanlin; Ding, Yuanyuan; Yang, Feifei Reward processes and performance optimization in asymmetric supermarket models. (Chinese. English summary) Zbl 1349.60121 Chin. J. Appl. Probab. Stat. 31, No. 4, 411-431 (2015). MSC: 60J20 60K25 60K30
Zou, Xiaolong; Guo, Xianping Another set of verifiable conditions for average Markov decision processes with Borel spaces. (English) Zbl 1340.90255 Kybernetika 51, No. 2, 276-292 (2015). MSC: 90C40 93E20
Cavazos-Cadena, Rolando; Montes-de-Oca, Raúl; Sladký, Karel A counterexample on sample-path optimality in stable Markov decision chains with the average reward criterion. (English) Zbl 1302.90241 J. Optim. Theory Appl. 163, No. 2, 674-684 (2014). MSC: 90C40
Mitov, Kosto V.; Omey, Edward Renewal processes. (English) Zbl 1300.60004 SpringerBriefs in Statistics. Cham: Springer (ISBN 978-3-319-05854-2/pbk; 978-3-319-05855-9/ebook). viii, 122 p. (2014). MSC: 60-01 60K05 60K10 60K30 60K40 26A12
Hao, Chuanchuan; Fang, Zhou; Li, Ping An output feedback reinforcement learning control method based on a reference model. (Chinese. English summary) Zbl 1289.93071 J. Zhejiang Univ., Eng. Sci. 47, No. 3, 409-414, 479 (2013). MSC: 93C40 93B52
Sheu, Yuan-Chung; Tsai, Ming-Yao On optimal stopping problems for matrix-exponential jump-diffusion processes. (English) Zbl 1252.60039 J. Appl. Probab. 49, No. 2, 531-548 (2012). Reviewer: Pavel Gapeev (London) MSC: 60G40 60J75 60G51
Ivanov, R. V. Optimal stopping problem in a model with compensated refusal of reward. (English. Russian original) Zbl 1229.60052 Math. Notes 89, No. 2, 238-244 (2011); translation from Mat. Zametki 89, No. 2, 241-248 (2011). MSC: 60G40 91G80
Diko, Peter; Usábel, Miguel A numerical method for the expected penalty-reward function in a Markov-modulated jump-diffusion process. (English) Zbl 1218.91075 Insur. Math. Econ. 49, No. 1, 126-131 (2011). MSC: 91B30 60J70 60K10
Khaniyev, Tahir; Atalay, Kumru Didem On the weak convergence of the ergodic distribution for an inventory model of type \((s,S)\). (English) Zbl 1220.60049 Hacet. J. Math. Stat. 39, No. 4, 599-611 (2010). MSC: 60K15 60K05 60K20 90B05 60F05
Wang, Zhongwei; Cao, Qixin; Luan, Nan; Zhang, Lei Reactive self-rescue control for autonomous mobile robot based on reinforcement learning. (Chinese. English summary) Zbl 1212.93233 J. Shanghai Jiaotong Univ. (Chin. Ed.) 43, No. 11, 1751-1755 (2009). MSC: 93C85 68T40 68T05
Stefanov, Valeri T.; Ball, Frank Reward distributions associated with some block tridiagonal transition matrices with applications to identity by descent. (English) Zbl 1168.60008 Adv. Appl. Probab. 41, No. 2, 523-545 (2009). MSC: 60E10 92D10 60K15
Tomashyk, V. V.; Mishura, Yu. S. An optimal stopping problem for a random walk with polynomial reward functions. (Ukrainian. English summary) Zbl 1199.60145 Prykl. Stat., Aktuarna Finans. Mat. 2008, No. 1-2, 101-110 (2008). MSC: 60G40 60G50
Vilaseca, Jordi; Meseguer, Antoni; Torrent, Joan; Ferreras, Raquel Reward functions and cooperative games: characterization and economic application. (English) Zbl 1185.91038 Int. Game Theory Rev. 10, No. 2, 165-176 (2008). MSC: 91A12 91A43 91D30
Stringer, S. M.; Rolls, E. T.; Taylor, P. Learning movement sequences with a delayed reward signal in a hierarchical model of motor function. (English) Zbl 1114.68056 Neural Netw. 20, No. 2, 172-181 (2007). MSC: 68T05
Arkin, Vadim I.; Slastnikov, Alexander D. Optimal time to invest under tax exemptions. (English) Zbl 1103.60044 Kabanov, Yuri (ed.) et al., From stochastic calculus to mathematical finance. The Shiryaev Festschrift. Almost all papers based on the presentation at the second Bachelier colloquium on stochastic calculus and probability, Métabief, France, January 9–15, 2005. Berlin: Springer (ISBN 3-540-30782-6/hbk). 17-32 (2006). MSC: 60G40 91B76
Kurano, Masami; Yasuda, Masami; Nakagami, Jun-ichi; Yoshida, Yuji Perceptive evaluation for the optimal discounted reward in Markov decision processes. (English) Zbl 1121.68425 Torra, Vicenç (ed.) et al., Modeling decisions for artificial intelligence. Second international conference, MDAI 2005, Tsukuba, Japan, July 25–27, 2005. Proceedings. Berlin: Springer (ISBN 3-540-27871-0/pbk). Lecture Notes in Computer Science 3558. Lecture Notes in Artificial Intelligence, 283-293 (2005). MSC: 68T37
Bladt, Mogens; Meini, Beatrice; Neuts, Marcel F.; Sericola, Bruno Distributions of reward functions on continuous-time Markov chains. (English) Zbl 1015.60064 Latouche, Guy (ed.) et al., Matrix-analytic methods. Theory and applications. Proceedings of the 4th international conference, Adelaide, Australia, July 14-16, 2002. Singapore: World Scientific. 39-62 (2002). Reviewer: Thomas Simon (Évry) MSC: 60J27
Amir, Rabah Complementarity and diagonal dominance in discounted stochastic games. (English) Zbl 1032.91025 Ann. Oper. Res. 114, 39-56 (2002). Reviewer: Samir Kumar Neogy (New Delhi) MSC: 91A15
Cavazos-Cadena, Rolando; Montes-de-Oca, Raúl Nearly optimal policies in risk-sensitive positive dynamic programming on discrete spaces. (English) Zbl 1038.90087 Math. Methods Oper. Res. 52, No. 1, 133-167 (2000). MSC: 90C39 91B30 91B16
Wu, Congbin; Lin, Yuanlie Markov fuzzy criterion decision models. (English) Zbl 0970.90111 Syst. Sci. Math. Sci. 13, No. 3, 309-320 (2000). Reviewer: Gerhard Hübner (Hamburg) MSC: 90C40 03E72
Daduna, Hans; Knopov, Pavel S. Optimal admission control for \(M/D/1/K\) queueing systems. (English) Zbl 0972.90017 Math. Methods Oper. Res. 50, No. 1, 91-100 (1999). MSC: 90B22 90B15 60K25
Yoshida, Yuji A time-average fuzzy reward criterion in fuzzy decision processes. (English) Zbl 0951.90056 Inf. Sci. 110, No. 1-2, 103-112 (1998). MSC: 90C70
van Dijk, Nico M. Error bounds for arbitrary approximations of “nearly reversible” Markov chains and a communications example. (English) Zbl 0914.60040 Kybernetika 33, No. 2, 171-184 (1997). Reviewer: K.Wickwire (Bedford) MSC: 60J20
Kadota, Yoshinobu Notes on variance in randomized reward Markov decision processes. (English) Zbl 0886.90176 J. Inf. Optim. Sci. 18, No. 1, 121-129 (1997). MSC: 90C40
Mi, Jie Minimizing some cost functions related to both burn-in and field use. (English) Zbl 0864.90053 Oper. Res. 44, No. 3, 497-500 (1996). MSC: 90B25 62P30 90B30
Kurano, Masami; Yasuda, Masami; Nakagami, Jun-ichi; Yoshida, Yuji Fuzzy decision processes with an average reward criterion. (English) Zbl 0965.90500 RIMS Kokyuroku 945, 188-198 (1996). MSC: 90C70 90C40 90B50
Kurano, Masami; Yasuda, Masami; Nakagami, Jun-ichi; Yoshida, Yuji Fuzzy decision processes with an average reward criterion. (English) Zbl 0965.97001 RIMS Kokyuroku 945, 188-198 (1996). MSC: 90C70 90C40 90B50
Mallubhatla, Ranga; Pattipati, Krishna R.; Viswanadham, N. Discrete-time Markov-reward models of production systems. (English) Zbl 0837.90062 Kumar, P. R. (ed.) et al., Discrete event systems, manufacturing systems, and communication networks. Based on the proceedings of a workshop that was an integral part of the 1992-93 IMA program on control theory, held at the University of Minnesota, Minneapolis, MN, USA. New York, NY: Springer-Verlag. IMA Vol. Math. Appl. 73, 149-175 (1995). MSC: 90B30 90B25 93C55 60J10
Collins, E. J.; McNamara, J. M. The job-search problem with competition: An evolutionarily stable dynamic strategy. (English) Zbl 0772.62050 Adv. Appl. Probab. 25, No. 2, 314-333 (1993). MSC: 62L15 91A80 91A06 91A40
Maitra, A.; Sudderth, W. The optimal reward operator in negative dynamic programming. (English) Zbl 0773.90087 Math. Oper. Res. 17, No. 4, 921-931 (1992). MSC: 90C39
Weigel, Karin Possibilities of solution in stochastic decision models with recursive reward functions. (English) Zbl 0717.93065 Optimization 21, No. 6, 1017-1026 (1990). MSC: 93E20 90C39
Menaldi, José Luis; Robin, Maurice On the optimal reward function of the continuous time multiarmed bandit problem. (English) Zbl 0714.90096 SIAM J. Control Optimization 28, No. 1, 97-112 (1990). Reviewer: J.L.Menaldi MSC: 90C40 60J25 93E20 35B37 90C39
Horwood, Joseph W. Near-optimal rewards from multiple species harvested by several fishing fleets. (English) Zbl 0704.92021 IMA J. Math. Appl. Med. Biol. 7, No. 1, 55-68 (1990). MSC: 92D40 93E20 91B76 92-08 90C39 90C90 90B99
Øksendal, Bernt The high contact principle in optimal stopping and stochastic waves. (English) Zbl 0687.60045 Stochastic processes, Semin., San Diego/CA (USA) 1989, Prog. Probab. 18, 177-192 (1990). MSC: 60G40 60J45
Anulova, S. V.; Safonov, M. V. Control of a diffusion process in a region with fixed reflection on the boundary. (English) Zbl 0753.93080 Statistics and control of stochastic processes. Vol. 2, Pap. Steklov Semin., Moscow/USSR 1985-86, Transl. Ser. Math. Eng., 1-15 (1989). MSC: 93E20
Yasuda, Masami The optimal value of Markov stopping problems with one-step look ahead policy. (English) Zbl 0658.60071 J. Appl. Probab. 25, No. 3, 544-552 (1988). Reviewer: T.Bojdecki MSC: 60G40
Cavazos-Cadena, Rolando Necessary and sufficient conditions for a bounded solution to the optimality equation in average reward Markov decision chains. (English) Zbl 0645.90099 Syst. Control Lett. 10, No. 1, 71-78 (1988). Reviewer: J.Preater MSC: 90C40
van Dijk, Nico M.; Puterman, Martin L. Perturbation theory for Markov reward processes with applications to queueing systems. (English) Zbl 0642.60100 Adv. Appl. Probab. 20, No. 1, 79-98 (1988). Reviewer: H.Daduna MSC: 60K30 90C47 90B22 90C31
Hernández-Lerma, Onésimo; Cavazos-Cadena, Rolando Continuous dependence of stochastic control models on the noise distribution. (English) Zbl 0639.93068 Appl. Math. Optimization 17, No. 1, 79-89 (1988). Reviewer: Sv.Gaidov MSC: 93E20 60H99 93C55 93C40
Wu, Jishan Markov decision programming with reward function depending on time. (Chinese. English summary) Zbl 0662.90087 J., Huazhong (Cent. China) Univ. Sci. Technol. 15, No. 1, 115-122 (1987). MSC: 90C40
Ehjdukyavichyus, R. On the existence of the optimal stopping moment in the optimal stopping problem of the Markov chain with discounting. (Russian. English summary) Zbl 0649.60053 Lit. Mat. Sb. 27, No. 4, 789-792 (1987). Reviewer: T.Bojdecki MSC: 60G40 60J10
Hamada, Toshio A two-armed bandit problem with one arm known including switching costs and terminal rewards. (English) Zbl 0617.62083 J. Jpn. Stat. Soc. 17, 21-30 (1987). Reviewer: R.Theodorescu MSC: 62L05 62L15 90C39 90C40 60G40
Hernandez-Lerma, O.; Marcus, S. I. Adaptive control of Markov processes with incomplete state information and unknown parameters. (English) Zbl 0585.90090 J. Optimization Theory Appl. 52, 227-241 (1987). MSC: 90C40
Lin, T. F.; Yao, Y. S. Matrix inequality in distributional sense. (English) Zbl 0631.60044 Soochow J. Math. 12, 51-55 (1986). Reviewer: R.A.Horn MSC: 60G15 60E15 60H05
Horwood, J. W.; Whittle, P. Optimal control in the neighbourhood of an optimal equilibrium with examples from fisheries models. (English) Zbl 0612.92012 IMA J. Math. Appl. Med. Biol. 3, 129-142 (1986). MSC: 92D25 90C39 49L20
White, Chelsea C. III; El-Deib, Hany K. Parameter imprecision in finite state, finite action dynamic programs. (English) Zbl 0605.90129 Oper. Res. 34, 120-129 (1986). Reviewer: K.-H.Waldmann MSC: 90C40 90C39
Mann, Elke Optimality equations and sensitive optimality in bounded Markov decision processes. (English) Zbl 0587.90099 Optimization 16, 767-781 (1985). Reviewer: A.Nowak MSC: 90C40 90C39
Ştefănescu, A.; Ştefănescu, Maria V. General stochastic games. (English) Zbl 0598.90101 Probability theory, Proc. 7th Conf., Braşov/Rom. 1982, 643-647 (1984). Reviewer: Y.Ohtsubo MSC: 91A15 91A10 91A60
van Dawen, Rolf Negative dynamic programming. (English) Zbl 0531.90094 Operations research, Proc. 12th Annu. Meet., Mannheim 1983, 475-478 (1984). MSC: 90C39
Dochviri, V. M. On the convergence of costs in the case of approximation of the continuous Kalman-Bucy scheme by discrete schemes. (Russian. English summary) Zbl 0574.60054 Tr. Tbilis. Univ. 239, Mat. Mekh. Astron. 15, 65-76 (1983). MSC: 60G35 93E11
Assaf, David Extreme-point solutions in Markov decision processes. (English) Zbl 0544.90098 J. Appl. Probab. 20, 835-842 (1983). Reviewer: G.Hübner MSC: 90C40
Dietz, Hans Michael; Nollau, Volker Markov decision problems with countable state spaces. Optimality criteria - algorithms - clustering. (English) Zbl 0543.90078 Mathematical Research, 15. Berlin: Akademie-Verlag. 174 p. DDR M 22.00 (1983). Reviewer: M.Schäl MSC: 90C40 90-02
Bojdecki, Tomasz A method of maximizing probabilities in sequential problems. (Polish) Zbl 0524.60046 Ann. Soc. Math. Pol., Ser. III, Mat. Stosow. 21, 5-37 (1982). MSC: 60G40 62L15 60J10
Kolonko, Michael The average-optimal adaptive control of a Markov renewal model in presence of an unknown parameter. (English) Zbl 0518.90092 Math. Operationsforsch. Stat., Ser. Optimization 13, 567-591 (1982). MSC: 90C40 60K20
Yushkevich, A. A. On semi-Markov controlled models with an average reward criterion. (English) Zbl 0499.60094 Theory Probab. Appl. 26, 796-803 (1982). MSC: 60K15 90C40
McNamara, John Optimal patch use in a stochastic environment. (English) Zbl 0478.92017 Theor. Popul. Biol. 21, 269-288 (1982). MSC: 92D40 92D25 62P10 62L15
Yushkevich, A. A. On the semi-Markov controlled models with the average reward criterion. (Russian) Zbl 0478.60091 Teor. Veroyatn. Primen. 26, 808-815 (1981). MSC: 60K15 90C40
Bismut, Jean-Michel Convex inequalities in stochastic control. (English) Zbl 0474.93071 J. Funct. Anal. 42, 226-270 (1981). MSC: 93E20 60J45 90C25 60G40 60J60
Evans, Richard V. Markov chain design problems. (English) Zbl 0467.90030 Oper. Res. 29, 959-970 (1981). MSC: 90B22 60K25 60J20 60J10 90C90 90C40
Stidham, S. jun. On the convergence of successive approximations in dynamic programming with non-zero terminal reward. (English) Zbl 0454.90087 Z. Oper. Res., Ser. A 25, 57-77 (1981). MSC: 90C39
van der Wal, Johannes Stochastic dynamic programming. Successive approximations and nearly optimal strategies for Markov Decision Processes and Markov Games. (English) Zbl 0443.90055 Proefschrift, Technische Hogeschool Eindhoven. Amsterdam: Mathematisch Centrum. XI, 253 p. (1980). MSC: 90C40 91A05 90C39
Yasuda, Masami Semi-Markov decision processes with countable state space and compact action space. (English) Zbl 0396.62068 Bull. Math. Stat. 18, No. 1-2, 35-54 (1978). MSC: 62M99 90C40