
On the layered nearest neighbour estimate, the bagged nearest neighbour estimate and the random forest method in regression and classification. (English) Zbl 1198.62048
Summary: Let \(X_1, \dots , X_n\) be identically distributed random vectors in \(\mathbb R^d\), independently drawn according to some probability density. An observation \(X_i\) is said to be a layered nearest neighbour (LNN) of a point \(x\) if the hyperrectangle defined by \(x\) and \(X_i\) contains no other data points. We first establish consistency results for \(L_n(x)\), the number of LNNs of \(x\). Then, given a sample \((X, Y), (X_1, Y_1), \dots , (X_n, Y_n)\) of independent identically distributed random vectors from \(\mathbb R^d \times \mathbb R\), one may estimate the regression function \(r(x) = \mathbb E[Y \mid X = x]\) by the LNN estimate \(r_n(x)\), defined as an average over the \(Y_i\)’s corresponding to those \(X_i\) which are LNNs of \(x\). Under mild conditions on \(r\), we show that \(\mathbb E|r_n(x) - r(x)|^p \rightarrow 0\) as \(n \rightarrow \infty \), for almost all \(x\) and all \(p\geq 1\), and discuss the links between \(r_n\) and the random forest estimates of L. Breiman [Mach. Learn. 45, No. 1, 5–32 (2001; Zbl 1007.68152)]. We finally show the universal consistency of the bagged (bootstrap-aggregated) nearest neighbour method for regression and classification.
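For concreteness, here is a minimal sketch (in Python; not code from the paper, and the function names, toy data, and number of bootstrap resamples are illustrative assumptions) of the two estimators described above: the LNN regression estimate, which averages the responses over the layered nearest neighbours of the query point, and a bagged 1-nearest-neighbour estimate obtained by averaging 1-NN predictions over bootstrap resamples.

```python
import numpy as np


def is_lnn(x, X, i):
    """True if X[i] is a layered nearest neighbour (LNN) of the query point x,
    i.e. the axis-aligned hyperrectangle spanned by x and X[i] contains no
    other sample point."""
    lo, hi = np.minimum(x, X[i]), np.maximum(x, X[i])
    for j in range(len(X)):
        if j != i and np.all(X[j] >= lo) and np.all(X[j] <= hi):
            return False
    return True


def lnn_estimate(x, X, Y):
    """LNN regression estimate r_n(x): average of the Y_i over the LNNs of x."""
    idx = [i for i in range(len(X)) if is_lnn(x, X, i)]
    return float(np.mean(Y[idx])) if idx else 0.0  # 0 by convention if no LNN


def bagged_1nn_estimate(x, X, Y, n_resamples=100, rng=None):
    """Bagged (bootstrap-aggregated) 1-NN estimate: average the 1-nearest
    neighbour prediction over bootstrap resamples of the data."""
    rng = np.random.default_rng() if rng is None else rng
    preds = []
    for _ in range(n_resamples):
        idx = rng.integers(0, len(X), size=len(X))   # bootstrap resample
        dists = np.linalg.norm(X[idx] - x, axis=1)   # distances to the query
        preds.append(Y[idx][np.argmin(dists)])       # 1-NN response in resample
    return float(np.mean(preds))


# Toy usage: noisy responses around r(x) = x_1 + x_2.
rng = np.random.default_rng(0)
X = rng.uniform(size=(200, 2))
Y = X.sum(axis=1) + 0.1 * rng.standard_normal(200)
x = np.array([0.5, 0.5])
print("L_n(x) =", sum(is_lnn(x, X, i) for i in range(len(X))))
print("LNN estimate r_n(x) =", lnn_estimate(x, X, Y))
print("bagged 1-NN estimate =", bagged_1nn_estimate(x, X, Y, rng=rng))
```

The brute-force LNN check above costs O(n^2) operations per query point; it is meant only to make the definition concrete, not to be an efficient implementation.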

MSC:
62H12 Estimation in multivariate analysis
62G20 Asymptotic properties of nonparametric inference
62G08 Nonparametric regression and quantile regression
62H30 Classification and discrimination; cluster analysis (statistical aspects)
References:
[1] Amit, Y.; Geman, D., Shape quantization and recognition with randomized trees, Neural computation, 9, 1545-1588, (1997)
[2] Bai, Z.-D.; Chao, C.-C.; Hwang, H.-K.; Liang, W.-Q., On the variance of the number of maxima in random vectors and its applications, The annals of applied probability, 8, 886-895, (1998) · Zbl 0941.60021
[3] Bai, Z.-D.; Devroye, L.; Hwang, H.-K.; Tsai, T.-S., Maxima in hypercubes, Random structures and algorithms, 27, 290-309, (2005) · Zbl 1080.60007
[4] Barndorff-Nielsen, O.; Sobel, M., On the distribution of admissible points in a vector random sample, Theory of probability and its applications, 11, 249-269, (1966) · Zbl 0278.60007
[5] Biau, G.; Devroye, L.; Lugosi, G., Consistency of random forests and other averaging classifiers, Journal of machine learning research, 9, 2015-2033, (2008) · Zbl 1225.62081
[6] Breiman, L., Bagging predictors, Machine learning, 24, 123-140, (1996) · Zbl 0858.68080
[7] L. Breiman, Some infinite theory for predictor ensembles, Technical Report 577, Statistics Department, UC Berkeley, 2000. http://www.stat.berkeley.edu/~breiman.
[8] Breiman, L., Random forests, Machine learning, 45, 5-32, (2001) · Zbl 1007.68152
[9] L. Breiman, Consistency for a simple model of random forests, Technical Report 670, Statistics Department, UC Berkeley, 2004. http://www.stat.berkeley.edu/~breiman.
[10] Cover, T.M.; Hart, P.E., Nearest neighbour pattern classification, IEEE transactions on information theory, 13, 21-27, (1967) · Zbl 0154.44505
[11] Cutler, A.; Zhao, G., PERT - perfect random tree ensembles, Computing science and statistics, 33, 490-497, (2001)
[12] Devroye, L., The uniform convergence of nearest neighbour regression function estimators and their application in optimization, IEEE transactions on information theory, 24, 142-151, (1978) · Zbl 0375.62083
[13] Devroye, L.; Györfi, L.; Lugosi, G., A probabilistic theory of pattern recognition, (1996), Springer-Verlag New York · Zbl 0853.68150
[14] Dietterich, T.G., An experimental comparison of three methods for constructing ensembles of decision trees: bagging, boosting, and randomization, Machine learning, 40, 139-157, (2000)
[15] Doeblin, W., Exposé de la théorie des chaînes simples constantes de Markov à un nombre fini d’états, Revue mathématique de l’union interbalkanique, 2, 77-105, (1937)
[16] E. Fix, J. Hodges, Discriminatory analysis. Nonparametric discrimination: Consistency properties, Technical Report 4, Project Number 21-49-004, USAF School of Aviation Medicine, Randolph Field, Texas, 1951. · Zbl 0715.62080
[17] Györfi, L.; Kohler, M.; Krzyżak, A.; Walk, H., A distribution-free theory of nonparametric regression, (2002), Springer-Verlag New York · Zbl 1021.62024
[18] Lin, Y.; Jeon, Y., Random forests and adaptive nearest neighbors, Journal of the American statistical association, 101, 578-590, (2006) · Zbl 1119.62304
[19] Marcinkiewicz, J.; Zygmund, A., Sur les fonctions indépendantes, Fundamenta mathematicae, 29, 60-90, (1937) · JFM 63.0946.02
[20] Petrov, V.V., Sums of independent random variables, (1975), Springer-Verlag Berlin · Zbl 0322.60043
[21] Rachev, S.T.; Rüschendorf, L., Mass transportation problems, volume I: theory, (1998), Springer New York · Zbl 0990.60500
[22] Steele, B.M., Exact bootstrap \(k\)-nearest neighbor learners, Machine learning, 74, 235-255, (2009)
[23] Stone, C.J., Consistent nonparametric regression, The annals of statistics, 5, 595-645, (1977) · Zbl 0366.62051
[24] Wheeden, R.L.; Zygmund, A., Measure and integral. an introduction to real analysis, (1977), Marcel Dekker New York · Zbl 0362.26004