Audibert, Jean-Yves
Aggregated estimators and empirical complexity for least square regression. (English) Zbl 1052.62037
Ann. Inst. Henri Poincaré, Probab. Stat. 40, No. 6, 685-736 (2004).

Summary: Numerous empirical results have shown that combining regression procedures can be a very efficient method. This work provides probably approximately correct (PAC) bounds for the \(L^2\) generalization error of such methods. These bounds are of interest for two reasons. First, they give, for any aggregating procedure, a bound on the expected risk that depends on the empirical risk and on the empirical complexity, the latter measured by the Kullback-Leibler divergence between the aggregating distribution \(\widehat\rho\) and a prior distribution \(\pi\) and by the empirical mean of the variance of the regression functions under the probability \(\widehat\rho\). Second, by structural risk minimization, we derive an aggregating procedure which takes advantage of the unknown properties of the best mixture \(\widetilde f\): when the best convex combination \(\widetilde f\) of \(d\) regression functions is one of the \(d\) initial functions (i.e., when combining does not make the bias decrease), the convergence rate is of order \((\log d)/N\). In the worst case, our combining procedure achieves a convergence rate of order \(\sqrt{(\log d)/N}\), which is known to be optimal in a uniform sense when \(d>\sqrt N\). As in AdaBoost, our aggregating distribution tends to favor functions which disagree with the mixture on mispredicted points. Our algorithm is tested on artificial classification data (which have also been used for testing other boosting methods, such as AdaBoost).
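As an illustration only, the three empirical quantities named in the summary can be computed for a finite family of \(d\) regression functions as in the sketch below. It is not the paper's bound or algorithm: the constants and exact form of the PAC bound are not given in the summary, and the sketch assumes that "empirical risk" means the empirical squared error of the mixture \(\sum_j \widehat\rho_j f_j\). The names `kl_divergence`, `empirical_quantities`, and the layout `preds[j][i]` are hypothetical.

```python
import math


def kl_divergence(rho, pi):
    """Kullback-Leibler divergence KL(rho || pi) between two discrete
    distributions over the same d functions (terms with rho_j = 0 contribute 0)."""
    return sum(r * math.log(r / p) for r, p in zip(rho, pi) if r > 0)


def empirical_quantities(preds, y, rho, pi):
    """Return (empirical risk of the mixture, KL(rho || pi), mean variance).

    Assumed layout (hypothetical): preds[j][i] is the prediction of
    regression function j at sample point i; y[i] is the observed output.
    """
    n = len(y)
    d = len(preds)
    # Prediction of the rho-mixture at each sample point.
    mix = [sum(rho[j] * preds[j][i] for j in range(d)) for i in range(n)]
    # Empirical L^2 risk of the mixture.
    risk = sum((y[i] - mix[i]) ** 2 for i in range(n)) / n
    # Empirical mean, over sample points, of the variance of the
    # predictions f_j(x_i) when j is drawn from rho.
    mean_var = sum(
        sum(rho[j] * (preds[j][i] - mix[i]) ** 2 for j in range(d))
        for i in range(n)
    ) / n
    return risk, kl_divergence(rho, pi), mean_var


# Two constant functions (0 and 2), two observations equal to 1,
# uniform aggregating distribution and uniform prior.
risk, kl, mean_var = empirical_quantities(
    preds=[[0.0, 0.0], [2.0, 2.0]],
    y=[1.0, 1.0],
    rho=[0.5, 0.5],
    pi=[0.5, 0.5],
)
```

In this toy case the mixture predicts 1 everywhere, so its empirical risk is 0 while the two functions disagree with it by 1 at each point; putting all the mass on a single function would instead cost a divergence of \(\log 2\) from the uniform prior.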
Cited in 20 Documents

MSC:
62G08 Nonparametric regression and quantile regression
62H30 Classification and discrimination; cluster analysis (statistical aspects)
94A17 Measures of information, entropy

Keywords: Deviation inequalities; Adaptive estimator; Oracle inequalities; Boosting; Bayesian expected risk bound; minimax bounds; binary classification