## Bandit problems with infinitely many arms.(English)Zbl 0881.62083

Summary: We consider a bandit problem consisting of a sequence of $$n$$ choices from an infinite number of Bernoulli arms, with $$n\to\infty$$. The objective is to minimize the long-run failure rate. The Bernoulli parameters are independent observations from a distribution $$F$$. We first assume $$F$$ to be the uniform distribution on (0,1) and consider various extensions. In the uniform case we show that the best lower bound for the expected failure proportion is between $$\sqrt 2/ \sqrt n$$ and $$2/ \sqrt n$$ and we exhibit classes of strategies that achieve the latter.

### MSC:

 62L05 Sequential statistical design 62C25 Compound decision problems in statistical decision theory 60F99 Limit theorems in probability theory
Full Text: