Consistency of random forests. (English) Zbl 1317.62028
Summary: Random forests are a learning algorithm proposed by L. Breiman [Mach. Learn. 45, No. 1, 5–32 (2001; Zbl 1007.68152)] that combines several randomized decision trees and aggregates their predictions by averaging. Despite its wide usage and outstanding practical performance, little is known about the mathematical properties of the procedure. This disparity between theory and practice stems from the difficulty of simultaneously analyzing both the randomization process and the highly data-dependent tree structure. In the present paper, we take a step forward in forest exploration by proving a consistency result for Breiman’s [loc. cit.] original algorithm in the context of additive regression models. Our analysis also sheds an interesting light on how random forests can nicely adapt to sparsity.
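To make the setting concrete, here is a minimal numerical sketch in Python (assuming numpy and scikit-learn are available; scikit-learn's RandomForestRegressor closely follows Breiman's procedure). It fits forests at increasing sample sizes to a sparse additive regression model in which only 2 of 10 coordinates are informative; the particular additive functions and parameters below are illustrative assumptions, not taken from the paper.

    # Illustrative sketch (not from the paper): random forest regression on a
    # sparse additive model. Consistency suggests the test error should shrink
    # as the sample size n grows.
    import numpy as np
    from sklearn.ensemble import RandomForestRegressor

    rng = np.random.default_rng(0)

    def additive_target(X):
        # Sparse additive model (hypothetical choice): only the first two of
        # the d coordinates matter, m(x) = sin(pi*x1) + 2*(x2 - 0.5)^2.
        return np.sin(np.pi * X[:, 0]) + 2.0 * (X[:, 1] - 0.5) ** 2

    def forest_test_error(n, d=10, noise=0.1):
        X = rng.uniform(size=(n, d))
        y = additive_target(X) + noise * rng.standard_normal(n)
        forest = RandomForestRegressor(n_estimators=200, random_state=0)
        forest.fit(X, y)
        # The forest prediction is the average of the individual tree
        # predictions, as in Breiman's original aggregation scheme.
        X_test = rng.uniform(size=(2000, d))
        pred = forest.predict(X_test)
        return np.mean((pred - additive_target(X_test)) ** 2)

    for n in (200, 800, 3200):
        print(n, forest_test_error(n))

On typical runs the printed mean squared error decreases as n grows, in line with the paper's asymptotic result, though a finite-sample experiment of course proves nothing by itself.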

MSC:
62G05 Nonparametric estimation
62G20 Asymptotic properties of nonparametric inference
References:
[1] Amaratunga, D., Cabrera, J. and Lee, Y.-S. (2008). Enriched random forests. Bioinformatics 24 2010-2014.
[2] Bai, Z.-D., Devroye, L., Hwang, H.-K. and Tsai, T.-H. (2005). Maxima in hypercubes. Random Structures Algorithms 27 290-309. · Zbl 1080.60007 · doi:10.1002/rsa.20053
[3] Barndorff-Nielsen, O. and Sobel, M. (1966). On the distribution of the number of admissible points in a vector random sample. Teor. Verojatnost. i Primenen. 11 283-305. · Zbl 0278.60007
[4] Biau, G. (2012). Analysis of a random forests model. J. Mach. Learn. Res. 13 1063-1095. · Zbl 1283.62127 · www.jmlr.org
[5] Biau, G. and Devroye, L. (2010). On the layered nearest neighbour estimate, the bagged nearest neighbour estimate and the random forest method in regression and classification. J. Multivariate Anal. 101 2499-2518. · Zbl 1198.62048 · doi:10.1016/j.jmva.2010.06.019
[6] Biau, G., Devroye, L. and Lugosi, G. (2008). Consistency of random forests and other averaging classifiers. J. Mach. Learn. Res. 9 2015-2033. · Zbl 1225.62081 · www.jmlr.org
[7] Boucheron, S., Lugosi, G. and Massart, P. (2013). Concentration Inequalities: A Nonasymptotic Theory of Independence. Oxford Univ. Press, Oxford. · Zbl 1279.60005
[8] Breiman, L. (1996). Bagging predictors. Mach. Learn. 24 123-140. · Zbl 0867.62055 · doi:10.1007/BF00058655
[9] Breiman, L. (2001). Random forests. Mach. Learn. 45 5-32. · Zbl 1007.68152 · doi:10.1023/A:1010933404324
[10] Breiman, L. (2004). Consistency for a simple model of random forests. Technical Report 670, Univ. California, Berkeley, CA.
[11] Breiman, L., Friedman, J. H., Olshen, R. A. and Stone, C. J. (1984). Classification and Regression Trees. Wadsworth Advanced Books and Software, Belmont, CA. · Zbl 0541.62042
[12] Bühlmann, P. and Yu, B. (2002). Analyzing bagging. Ann. Statist. 30 927-961. · Zbl 1029.62037 · doi:10.1214/aos/1031689014 · euclid:aos/1031689014
[13] Clémençon, S., Depecker, M. and Vayatis, N. (2013). Ranking forests. J. Mach. Learn. Res. 14 39-73. · Zbl 1307.68065 · www.jmlr.org
[14] Cutler, D. R., Edwards, T. C. Jr., Beard, K. H., Cutler, A., Hess, K. T., Gibson, J. and Lawler, J. J. (2007). Random forests for classification in ecology. Ecology 88 2783-2792.
[15] Denil, M., Matheson, D. and de Freitas, N. (2013). Consistency of online random forests. In Proceedings of the ICML Conference. Available at arXiv:1302.4853. · arxiv.org
[16] Devroye, L., Györfi, L. and Lugosi, G. (1996). A Probabilistic Theory of Pattern Recognition. Applications of Mathematics (New York) 31. Springer, New York. · Zbl 0853.68150
[17] Díaz-Uriarte, R. and Alvarez de Andrés, S. (2006). Gene selection and classification of microarray data using random forest. BMC Bioinformatics 7 1-13.
[18] Efron, B. (1982). The Jackknife, the Bootstrap and Other Resampling Plans. CBMS-NSF Regional Conference Series in Applied Mathematics 38. SIAM, Philadelphia. · Zbl 0496.62036
[19] Genuer, R. (2012). Variance reduction in purely random forests. J. Nonparametr. Stat. 24 543-562. · Zbl 1254.62050 · doi:10.1080/10485252.2012.677843
[20] Geurts, P., Ernst, D. and Wehenkel, L. (2006). Extremely randomized trees. Mach. Learn. 63 3-42. · Zbl 1110.68124 · doi:10.1007/s10994-006-6226-1
[21] Györfi, L., Kohler, M., Krzyżak, A. and Walk, H. (2002). A Distribution-Free Theory of Nonparametric Regression. Springer, New York. · Zbl 1021.62024 · doi:10.1007/b97848
[22] Hastie, T. and Tibshirani, R. (1986). Generalized additive models. Statist. Sci. 1 297-318. · Zbl 0645.62068 · doi:10.1214/ss/1177013604
[23] Hastie, T., Tibshirani, R. and Friedman, J. (2009). The Elements of Statistical Learning: Data Mining, Inference, and Prediction, 2nd ed. Springer, New York. · Zbl 1273.62005 · doi:10.1007/978-0-387-84858-7
[24] Ishwaran, H. and Kogalur, U. B. (2010). Consistency of random survival forests. Statist. Probab. Lett. 80 1056-1064. · Zbl 1190.62177 · doi:10.1016/j.spl.2010.02.020 · arxiv:0811.2844
[25] Ishwaran, H., Kogalur, U. B., Blackstone, E. H. and Lauer, M. S. (2008). Random survival forests. Ann. Appl. Stat. 2 841-860. · Zbl 1149.62331 · doi:10.1214/08-AOAS169
[26] Kleiner, A., Talwalkar, A., Sarkar, P. and Jordan, M. I. (2014). A scalable bootstrap for massive data. J. R. Stat. Soc. Ser. B. Stat. Methodol. 76 795-816. · doi:10.1111/rssb.12050
[27] Lin, Y. and Jeon, Y. (2006). Random forests and adaptive nearest neighbors. J. Amer. Statist. Assoc. 101 578-590. · Zbl 1119.62304 · doi:10.1198/016214505000001230
[28] Meinshausen, N. (2006). Quantile regression forests. J. Mach. Learn. Res. 7 983-999. · Zbl 1222.68262 · www.jmlr.org
[29] Mentch, L. and Hooker, G. (2014). Ensemble trees and CLTs: Statistical inference for supervised learning. Available at arXiv:1404.6473. · arxiv.org
[30] Nobel, A. (1996). Histogram regression estimation using data-dependent partitions. Ann. Statist. 24 1084-1105. · Zbl 0862.62038 · doi:10.1214/aos/1032526958
[31] Politis, D. N., Romano, J. P. and Wolf, M. (1999). Subsampling. Springer, New York. · Zbl 0931.62035
[32] Prasad, A. M., Iverson, L. R. and Liaw, A. (2006). Newer classification and regression tree techniques: Bagging and random forests for ecological prediction. Ecosystems 9 181-199.
[33] Scornet, E. (2014). On the asymptotics of random forests. Available at arXiv:1409.2090. · arxiv.org
[34] Scornet, E., Biau, G. and Vert, J.-P. (2015). Supplement to “Consistency of random forests.” · Zbl 1317.62028 · doi:10.1214/15-AOS1321 · euclid:aos/1434546220 · arxiv:1405.2881
[35] Shotton, J., Sharp, T., Kipman, A., Fitzgibbon, A., Finocchio, M., Blake, A., Cook, M. and Moore, R. (2013). Real-time human pose recognition in parts from single depth images. Comm. ACM 56 116-124.
[36] Stone, C. J. (1977). Consistent nonparametric regression. Ann. Statist. 5 595-645. · Zbl 0366.62051 · doi:10.1214/aos/1176343886
[37] Stone, C. J. (1985). Additive regression and other nonparametric models. Ann. Statist. 13 689-705. · Zbl 0605.62065 · doi:10.1214/aos/1176349548
[38] Svetnik, V., Liaw, A., Tong, C., Culberson, J. C., Sheridan, R. P. and Feuston, B. P. (2003). Random forest: A classification and regression tool for compound classification and QSAR modeling. J. Chem. Inf. Comput. Sci. 43 1947-1958.
[39] Wager, S. (2014). Asymptotic theory for random forests. Available at arXiv:1405.0352. · arxiv.org
[40] Wager, S., Hastie, T. and Efron, B. (2014). Confidence intervals for random forests: The jackknife and the infinitesimal jackknife. J. Mach. Learn. Res. 15 1625-1651. · Zbl 1319.62132 · jmlr.csail.mit.edu
[41] Zhu, R., Zeng, D. and Kosorok, M. R. (2012). Reinforcement learning trees. Technical report, Univ. North Carolina. · Zbl 1374.68466
This reference list is based on information provided by the publisher or from digital mathematics libraries. Its items are heuristically matched to zbMATH identifiers and may contain data conversion errors. It attempts to reflect the references listed in the original paper as accurately as possible without claiming the completeness or perfect precision of the matching.