zbMATH — the first resource for mathematics

Empirical margin distributions and bounding the generalization error of combined classifiers. (English) Zbl 1012.62004
Summary: We prove new probabilistic upper bounds on generalization errors of complex classifiers that are combinations of simple classifiers. Such combinations could be implemented by neural networks or by voting methods of combining the classifiers, such as boosting and bagging. The bounds are in terms of the empirical distribution of the margins of the combined classifiers. They are based on the methods of the theory of Gaussian and empirical processes (comparison inequalities, symmetrization method, concentration inequalities) and they improve previous results of P.L. Bartlett [IEEE Trans. Inf. Theory 44, No. 2, 525-536 (1998; Zbl 0901.68177)] on bounding the generalization error of neural networks in terms of \(\ell_1\)-norms of the weights of neurons and of R.E. Schapire, Y. Freund, P. Bartlett and W.S. Lee [Ann. Stat. 26, No. 5, 1651-1686 (1998; Zbl 0929.62069)] on bounding the generalization error of boosting. We also obtain rates of convergence in the Lévy distance of empirical margin distributions to the true margin distribution uniformly over the classes of classifiers and prove the optimality of these rates.

62B10 Statistical aspects of information-theoretic topics
62G30 Order statistics; empirical distribution functions
62H30 Classification and discrimination; cluster analysis (statistical aspects)
62M45 Neural nets and related approaches to inference from stochastic processes
62E99 Statistical distribution theory