Whittle, P.
Arm-acquiring bandits. (English) Zbl 0464.90081
Ann. Probab. 9, 284-292 (1981).
Cited in 27 Documents
MSC:
90C39 Dynamic programming
62C99 Statistical decision theory
42C99 Nontrigonometric harmonic analysis
90C40 Markov and semi-Markov decision processes
Keywords: multiarmed bandit; allocation index; bandit problem; optimal expected total discounted reward; Gittins index policy; optimality of policies