zbMATH — the first resource for mathematics

Super learner. (English) Zbl 1166.62387
Summary: When trying to learn a model for the prediction of an outcome given a set of covariates, a statistician has many estimation procedures in their toolbox. A few examples of these candidate learners are: least squares, least angle regression, random forests, and spline regression. Previous articles theoretically validated the use of cross-validation to select an optimal learner among many candidate learners. Motivated by this use of cross-validation, we propose a new prediction method for creating a weighted combination of many candidate learners to build the super learner. This article proposes a fast algorithm for constructing a super learner in prediction which uses V-fold cross-validation to select weights to combine an initial set of candidate learners. In addition, this paper contains a practical demonstration of the adaptivity of this so-called super learner to various true data-generating distributions. This approach for construction of a super learner generalizes to any parameter which can be defined as a minimizer of a loss function.
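
The idea in the summary, using V-fold cross-validation to choose the weights of a combination of candidate learners, can be illustrated with a minimal sketch. Everything below is an assumption made for illustration and is not taken from the article: squared-error loss, weights restricted to be non-negative and to sum to one, a coarse grid search over those weights in place of a proper optimizer, three toy candidate learners (mean, univariate least squares, k-nearest neighbours), synthetic data, and all function names.

```python
import math
import random


def fit_mean(xs, ys):
    """Hypothetical candidate 1: ignore x, always predict the training mean."""
    m = sum(ys) / len(ys)
    return lambda x: m


def fit_line(xs, ys):
    """Hypothetical candidate 2: univariate least squares, y ~ a + b*x."""
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    b = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sxx
    return lambda x: my + b * (x - mx)


def fit_knn(xs, ys, k=5):
    """Hypothetical candidate 3: mean outcome of the k nearest training points."""
    pairs = list(zip(xs, ys))

    def predict(x):
        nearest = sorted(pairs, key=lambda p: abs(p[0] - x))[:k]
        return sum(y for _, y in nearest) / len(nearest)

    return predict


def cv_predictions(xs, ys, learners, v=5):
    """z[i][j] = prediction of learner j for point i, from a fit that
    excluded the fold containing point i (V-fold cross-validation)."""
    n = len(xs)
    z = [[0.0] * len(learners) for _ in range(n)]
    for fold in range(v):
        test = [i for i in range(n) if i % v == fold]
        train = [i for i in range(n) if i % v != fold]
        tx, ty = [xs[i] for i in train], [ys[i] for i in train]
        for j, fit in enumerate(learners):
            model = fit(tx, ty)
            for i in test:
                z[i][j] = model(xs[i])
    return z


def cv_risk(z, ys, w):
    """Cross-validated mean squared error of the w-weighted combination."""
    return sum((y - sum(wj * zj for wj, zj in zip(w, row))) ** 2
               for row, y in zip(z, ys)) / len(ys)


def simplex_grid(k, steps):
    """Weight vectors of length k on a grid: entries j/steps, summing to 1."""
    def rec(k, remaining):
        if k == 1:
            yield (remaining,)
            return
        for a in range(remaining + 1):
            for rest in rec(k - 1, remaining - a):
                yield (a,) + rest
    for counts in rec(k, steps):
        yield tuple(c / steps for c in counts)


def super_learner(xs, ys, learners, v=5, steps=20):
    """Pick weights minimizing cross-validated risk, then refit on all data."""
    z = cv_predictions(xs, ys, learners, v)
    weights = min(simplex_grid(len(learners), steps),
                  key=lambda w: cv_risk(z, ys, w))
    models = [fit(xs, ys) for fit in learners]
    return weights, (lambda x: sum(w * m(x) for w, m in zip(weights, models)))


# Synthetic data (an assumption for the demo): a smooth nonlinear signal plus noise.
random.seed(0)
xs = [random.uniform(-3.0, 3.0) for _ in range(200)]
ys = [math.sin(x) + 0.5 * x + random.gauss(0.0, 0.3) for x in xs]

learners = [fit_mean, fit_line, fit_knn]
weights, predict = super_learner(xs, ys, learners)
print("weights (mean, line, knn):", weights)
print("prediction at x = 1.0:", predict(1.0))
```

Because each single learner corresponds to a corner of the weight grid, the selected combination has cross-validated risk no larger than that of the best single candidate in this sketch. A practical implementation would replace the grid search with a constrained least-squares or other convex solver.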

62P99 Applications of statistics
68T35 Theory of languages and software systems (knowledge-based systems, expert systems, etc.) for artificial intelligence
65C60 Computational problems in statistics (MSC2010)