Lamberton, Damien; Pagès, Gilles
A penalized bandit algorithm. (English) Zbl 1206.62139
Electron. J. Probab. 13, 341-373 (2008).

Summary: We study a two-armed bandit recursive algorithm with penalty. We show that the algorithm converges towards its “target” although it always has a noiseless “trap”. Then, we elucidate the rate of convergence. For some choices of the parameters, we obtain a central limit theorem in which the limit distribution is characterized as the unique stationary distribution of a Markov process with jumps.

Cited in 6 Documents

MSC:
62L05 Sequential statistical design
62L20 Stochastic approximation
60F05 Central limit and other weak theorems
65C60 Computational problems in statistics (MSC2010)

Keywords: two-armed bandit algorithm; penalization; convergence rate; learning automata