Berry, Donald A.; Chen, Robert W.; Zame, Alan; Heath, David C.; Shepp, Larry A. Bandit problems with infinitely many arms. (English) Zbl 0881.62083 Ann. Stat. 25, No. 5, 2103-2116 (1997). Summary: We consider a bandit problem consisting of a sequence of \(n\) choices from an infinite number of Bernoulli arms, with \(n\to\infty\). The objective is to minimize the long-run failure rate. The Bernoulli parameters are independent observations from a distribution \(F\). We first assume \(F\) to be the uniform distribution on (0,1) and consider various extensions. In the uniform case we show that the best lower bound for the expected failure proportion is between \(\sqrt 2/ \sqrt n\) and \(2/ \sqrt n\) and we exhibit classes of strategies that achieve the latter. Cited in 3 ReviewsCited in 5 Documents MSC: 62L05 Sequential statistical design 62C25 Compound decision problems in statistical decision theory 60F99 Limit theorems in probability theory Keywords:sequential experimentation; dynamic allocation of Bernoulli processes; staying with a winner; switching with a loser; bandit problem PDF BibTeX XML Cite \textit{D. A. Berry} et al., Ann. Stat. 25, No. 5, 2103--2116 (1997; Zbl 0881.62083) Full Text: DOI OpenURL