zbMATH — the first resource for mathematics

On early stopping in gradient descent learning. (English) Zbl 1125.62035
Summary: We study a family of gradient descent algorithms for approximating the regression function in reproducing kernel Hilbert spaces (RKHSs), the family being characterized by a polynomially decaying sequence of step sizes (learning rates). By solving a bias-variance trade-off we obtain an early stopping rule and probabilistic upper bounds on the convergence of the algorithms. We also discuss the implications of these results for classification, where fast convergence rates can be achieved for plug-in classifiers. Connections are addressed with boosting, Landweber iterations, and online learning algorithms viewed as stochastic approximations of the gradient descent method.
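The summarized procedure can be illustrated with a small sketch: kernel gradient descent (a Landweber-type iteration in coefficient form) with a polynomially decaying step size, stopped early when a held-out error stops improving. The kernel choice, the step-size constants `eta0` and `theta`, and the validation-based stopping criterion are illustrative assumptions, not the paper's specific rule, which is derived from a bias-variance trade-off.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic 1-D regression data: noisy sine, split into train/validation.
X = rng.uniform(-3, 3, size=60)
y = np.sin(X) + 0.3 * rng.standard_normal(60)
X_tr, y_tr, X_va, y_va = X[:40], y[:40], X[40:], y[40:]

def gaussian_kernel(a, b, width=1.0):
    # Gaussian (RBF) kernel matrix between two 1-D sample arrays.
    return np.exp(-(a[:, None] - b[None, :]) ** 2 / (2 * width ** 2))

K_tr = gaussian_kernel(X_tr, X_tr)   # train x train Gram matrix
K_va = gaussian_kernel(X_va, X_tr)   # validation x train kernel evaluations
n = len(X_tr)

# Gradient descent on coefficients c, where f = sum_i c_i k(x_i, .).
# Step size decays polynomially: eta_t = eta0 / (t + 1)**theta
# (eta0, theta chosen here for illustration only).
eta0, theta = 0.5, 0.25
c = np.zeros(n)
best_err, best_c, patience = np.inf, c.copy(), 0
for t in range(500):
    eta = eta0 / (t + 1) ** theta
    residual = K_tr @ c - y_tr       # f(x_i) - y_i on the training set
    c = c - eta * residual / n       # Landweber-type gradient step
    val_err = np.mean((K_va @ c - y_va) ** 2)
    if val_err < best_err:
        best_err, best_c, patience = val_err, c.copy(), 0
    else:
        patience += 1
        if patience >= 20:           # early stopping: validation error rose
            break
```

Running fewer iterations acts as implicit regularization: the stopping time, not a penalty term, controls the bias-variance trade-off, which is the mechanism the paper analyzes.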

62G08 Nonparametric regression and quantile regression
68Q32 Computational learning theory
60G40 Stopping times; optimal stopping problems; gambling theory
62H30 Classification and discrimination; cluster analysis (statistical aspects)
62L20 Stochastic approximation
68T05 Learning and adaptive systems in artificial intelligence
Full Text: DOI