Cowan, Wesley; Katehakis, Michael N. Multi-armed bandits under general depreciation and commitment. (English) Zbl 1414.91104 Probab. Eng. Inf. Sci. 29, No. 1, 51-76 (2015). MSC: 91B06
van der Laan, Dinard Optimal mixing of Markov decision rules for MDP control. (English) Zbl 1228.90149 Probab. Eng. Inf. Sci. 25, No. 3, 307-342 (2011). MSC: 90C40