an:06117415
Zbl 1254.62050
Genuer, Robin
Variance reduction in purely random forests
EN
J. Nonparametric Stat. 24, No. 3, 543-562 (2012).
00312172
2012
j
62G08 68T05 62G20 65C60
nonparametric regression; rates of convergence; randomisation; ensemble methods
Summary: Random forests (RFs), introduced by \textit{L. Breiman} [Mach. Learn. 45, No. 1, 5--32 (2001; Zbl 1007.68152)], are a very effective statistical method. The complex mechanism of the method makes its theoretical analysis difficult. Therefore, simplified versions of RFs, called purely RFs (PRFs), which are easier to handle theoretically, have been considered. We study the variance of such forests. First, we show a general upper bound which emphasises the fact that a forest reduces the variance. We then introduce a simple variant of PRFs, which we call purely uniformly RFs. For this variant, and in the context of regression problems with a one-dimensional predictor space, we show that both random trees and RFs reach the minimax rate of convergence. In addition, we prove that, compared with random trees, RFs improve accuracy by reducing the estimator variance by a factor of three-fourths.
Zbl 1007.68152
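
Illustration: the variance reduction described in the summary can be explored numerically. The following is a minimal Python sketch, not the paper's code. It assumes that each tree partitions [0, 1] with k split points drawn uniformly and independently of the data, and that the tree predicts the mean response in the cell containing the query point; the record itself does not spell out this construction. The names tree_predict, forest_predict and compare_variances, the sine regression function, the noise level, and the convention that an empty cell predicts 0 are all illustrative choices. The sketch only compares Monte Carlo variances at one point and does not attempt to reproduce the three-fourths factor.

```python
import bisect
import math
import random
import statistics


def tree_predict(xs, ys, x0, k, rng):
    """Predict at x0 with one tree whose k split points are i.i.d. uniform on [0, 1)."""
    splits = sorted(rng.random() for _ in range(k))
    cell = bisect.bisect_right(splits, x0)  # index of the interval containing x0
    lo = splits[cell - 1] if cell > 0 else 0.0
    hi = splits[cell] if cell < k else 1.0
    in_cell = [y for x, y in zip(xs, ys) if lo <= x < hi]
    # Assumed convention (not from the source): an empty cell predicts 0.
    return sum(in_cell) / len(in_cell) if in_cell else 0.0


def forest_predict(xs, ys, x0, k, q, rng):
    """Average the predictions of q independently drawn trees."""
    return sum(tree_predict(xs, ys, x0, k, rng) for _ in range(q)) / q


def compare_variances(n=200, k=10, q=20, reps=200, x0=0.5, seed=0):
    """Monte Carlo variance at x0 of a single tree and of a q-tree forest."""
    rng = random.Random(seed)
    tree_preds, forest_preds = [], []
    for _ in range(reps):
        # Fresh sample each replication: uniform design, sine signal, Gaussian noise.
        xs = [rng.random() for _ in range(n)]
        ys = [math.sin(2 * math.pi * x) + rng.gauss(0.0, 0.5) for x in xs]
        tree_preds.append(tree_predict(xs, ys, x0, k, rng))
        forest_preds.append(forest_predict(xs, ys, x0, k, q, rng))
    return statistics.variance(tree_preds), statistics.variance(forest_preds)


if __name__ == "__main__":
    var_tree, var_forest = compare_variances()
    print(f"single tree: {var_tree:.4f}  forest: {var_forest:.4f}")
```

Under these assumptions, averaging over q independently drawn partitions smooths out the dependence of the prediction on where the random cell boundaries fall, so the forest estimate varies less across replications than a single tree.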