Remembering Leo Breiman. (English) Zbl 1220.62002

Summary: Leo Breiman was a highly creative, influential researcher with a down-to-earth personal style and an insistence on working on important real-world problems and producing useful solutions. This paper is a short review of Breiman’s extensive contributions to the field of applied statistics.

MSC:

62-03 History of statistics
01A70 Biographies, obituaries, personalia, bibliographies

References:

[1] Berk, R. (2008). Statistical Learning from a Regression Perspective. Springer, New York. · Zbl 1258.62047
[2] Biau, G., Devroye, L. and Lugosi, G. (2008). Consistency of random forests and other averaging classifiers. J. Mach. Learn. Res. 9 2039-2057. · Zbl 1225.62081
[3] Breiman, L. (1968). Probability. Addison-Wesley, Reading, MA. [Republished (1992) in Classics in Applied Mathematics. SIAM, Philadelphia, PA.] · Zbl 0174.48801
[4] Breiman, L. (1969). Probability and Stochastic Processes with a View Toward Applications. Houghton Mifflin, Boston, MA. · Zbl 0246.60033
[5] Breiman, L. (1973). Statistics: With a View Toward Applications. Houghton Mifflin, Boston, MA. · Zbl 0289.62001
[6] Breiman, L. (1984). Nail finders, edifices, and Oz. Technical Report 32. Dept. Statistics, Univ. California, Berkeley, CA. Neyman-Kiefer Memorial Volume. · Zbl 1373.62019
[7] Breiman, L. (1991). The Π method for estimating multivariate functions from noisy data. Technometrics 33 125-143. · Zbl 0742.62037
[8] Breiman, L. (1992). The little bootstrap and other methods for dimensionality selection in regression: X-fixed prediction error. J. Amer. Statist. Assoc. 87 738-754. · Zbl 0850.62518
[9] Breiman, L. (1993a). Fitting additive models to regression data: Diagnostics and alternative views. Comput. Statist. Data Anal. 15 13-46. · Zbl 0937.62613
[10] Breiman, L. (1993b). Hinging hyperplanes for regression, classification, and function approximation. IEEE Trans. Inform. Theory 39 999-1013. · Zbl 0793.62031
[11] Breiman, L. (1994). The 1991 census adjustment: Undercount or bad data? Statist. Sci. 9 458-537.
[12] Breiman, L. (1995a). Better subset regression using the nonnegative garrote. Technometrics 37 373-384. · Zbl 0862.62059
[13] Breiman, L. (1995b). Reflections after refereeing papers for NIPS. In The Mathematics of Generalization: Proceedings of the 1992 SFI/CNLS Workshop on Formal Approaches to Supervised Learning (D. H. Wolpert, ed.). Westview Press, Boulder, CO.
[14] Breiman, L. (1996a). Stacked regressions. Mach. Learn. 24 49-64. · Zbl 0849.68104
[15] Breiman, L. (1996b). Heuristics of instability and stabilization in model selection. Ann. Statist. 24 2350-2383. · Zbl 0867.62055
[16] Breiman, L. (1996c). Bagging predictors. Mach. Learn. 24 123-140. · Zbl 0858.68080
[17] Breiman, L. (1997a). Arcing the edge. Technical Report 486. Dept. Statistics, Univ. California, Berkeley, CA.
[18] Breiman, L. (1997b). Out-of-bag estimation. Technical report. Dept. Statistics, Univ. California, Berkeley, CA.
[19] Breiman, L. (1998a). Arcing classifiers. Ann. Statist. 26 801-849. · Zbl 0934.62064
[20] Breiman, L. (1998b). Using convex pseudo-data to increase prediction accuracy. Technical Report 513. Dept. Statistics, Univ. California, Berkeley, CA.
[21] Breiman, L. (1998c). Half & half bagging and hard boundary points. Technical Report 534. Dept. Statistics, Univ. California, Berkeley, CA.
[22] Breiman, L. (1999a). Prediction games and arcing algorithms. Neural Comput. 11 1493-1517.
[23] Breiman, L. (1999b). Pasting small votes for classification in large databases and on-line. Mach. Learn. 36 85-103.
[24] Breiman, L. (2000a). Randomizing outputs to increase prediction accuracy. Mach. Learn. 40 229-242. · Zbl 0962.68143
[25] Breiman, L. (2000b). Some infinity theory for predictor ensembles. Technical Report 577. Dept. Statistics, Univ. California, Berkeley, CA.
[26] Breiman, L. (2001a). Random forests. Mach. Learn. 45 5-32. · Zbl 1007.68152
[27] Breiman, L. (2001b). Using iterated bagging to debias regressions. Mach. Learn. 45 261-277. · Zbl 1052.68109
[28] Breiman, L. (2001c). Statistical modeling: The two cultures. Statist. Sci. 16 199-231. · Zbl 1059.62505
[29] Breiman, L. (2004a). Consistency for a simple model of random forests. Technical Report 670. Dept. Statistics, Univ. California, Berkeley, CA. · Zbl 1105.62308
[30] Breiman, L. (2004b). Population theory for boosting ensembles. Ann. Statist. 32 1-11. · Zbl 1105.62308
[31] Breiman, L. and Cutler, A. (1993). A deterministic algorithm for global optimization. Math. Programming 58 179-199. · Zbl 0807.90103
[32] Breiman, L. and Freedman, D. (1983). How many variables should be entered in a regression equation? J. Amer. Statist. Assoc. 78 131-136. · Zbl 0513.62068
[33] Breiman, L. and Friedman, J. (1985). Estimating optimal transformations for multiple regression and correlation. J. Amer. Statist. Assoc. 80 580-598. · Zbl 0594.62044
[34] Breiman, L. and Friedman, J. (1988). Comment on “Tree-structured classification via generalized discriminant analysis” by W. Y. Loh and N. Vanichsetakul. J. Amer. Statist. Assoc. 83 725-727. · Zbl 0649.62055
[35] Breiman, L. and Friedman, J. (1997). Predicting multivariate responses in multiple linear regression. J. Roy. Statist. Soc. Ser. B 59 3-54. · Zbl 0897.62068
[36] Breiman, L., Friedman, J., Olshen, R. and Stone, C. (1984). Classification and Regression Trees. Wadsworth, Belmont, CA. · Zbl 0541.62042
[37] Breiman, L. and Ihaka, R. (1984). Nonlinear discriminant analysis via scaling and ACE. Technical Report 40. Dept. Statistics, Univ. California, Berkeley, CA.
[38] Breiman, L. and Meisel, W. S. (1976). General estimates of the intrinsic variability of data in nonlinear regression models. J. Amer. Statist. Assoc. 71 301-307. · Zbl 0336.62069
[39] Breiman, L., Meisel, W. S. and Purcell, E. (1977). Variable kernel estimates of multivariate densities. Technometrics 19 135-144. · Zbl 0379.62023
[40] Breiman, L. and Peters, S. (1992). Comparing automatic smoothers (A public service enterprise). Int. Statist. Rev. 60 271-290. · Zbl 0775.62089
[41] Breiman, L. and Spector, P. (1992). Submodel selection and evaluation in regression. The X-random case. Int. Statist. Rev. 60 291-319. · Zbl 0850.62518
[42] Breiman, L., Tsur, Y. and Zemel, A. (1993). On a simple estimation procedure for censored regression models with known error distributions. Ann. Statist. 21 1711-1720. · Zbl 0790.62026
[43] Breiman, L. and Wurtele, Z. S. (1964). Convergence properties of a learning algorithm. Ann. Math. Statist. 35 1819-1822. · Zbl 0125.09904
[44] Bühlmann, P. and Yu, B. (2003). Boosting with the L2 loss: Regression and classification. J. Amer. Statist. Assoc. 98 324-339. · Zbl 1041.62029
[45] Bühlmann, P. and Yu, B. (2006). Sparse boosting. J. Mach. Learn. Res. 7 1001-1024. · Zbl 1222.68155
[46] Buja, A., Hastie, T. and Tibshirani, R. (1989). Linear smoothers and additive models. Ann. Statist. 17 453-510. · Zbl 0689.62029
[47] Chen, C., Liaw, A. and Breiman, L. (2004). Using random forests to learn imbalanced data. Technical Report 666. Dept. Statistics, Univ. California, Berkeley, CA.
[48] Cutler, A. and Breiman, L. (1994). Archetypal analysis. Technometrics 36 338-347. · Zbl 0804.62002
[49] Diaz-Uriarte, R. and Alvarez de Andres, S. (2006). Gene selection and classification of microarray data using random forest. BMC Bioinformatics 7 3.
[50] Efron, B. (1979). Bootstrap methods: Another look at the jackknife. Ann. Statist. 7 1-26. · Zbl 0406.62024
[51] Efron, B. (2001). Comment on “Statistical modeling: The two cultures” by L. Breiman. Statist. Sci. 16 218-219. · Zbl 1059.01542
[52] Eugster, M. J. A. and Leisch, F. (2009). From Spider-Man to Hero: Archetypal analysis in R. J. Stat. Softw. 30 1-23.
[53] Freund, Y. (1995). Boosting a weak learning algorithm by majority. Inform. Comput. 121 256-285. · Zbl 0833.68109
[54] Freund, Y. and Schapire, R. (1996). Experiments with a new boosting algorithm. In Machine Learning: Proceedings of the Thirteenth International Conference 148-156. Morgan Kaufmann, San Francisco, CA.
[55] Hastie, T. and Tibshirani, R. (1986). Generalized additive models. Statist. Sci. 1 297-310. · Zbl 0645.62068
[56] Hastie, T., Tibshirani, R. and Buja, A. (1994). Flexible discriminant analysis by optimal scoring. J. Amer. Statist. Assoc. 89 1255-1270. · Zbl 0812.62067
[57] Hastie, T., Tibshirani, R. and Friedman, J. (2000). Additive logistic regression: A statistical view of boosting (with discussion and a rejoinder by the authors). Ann. Statist. 28 337-407. · Zbl 1106.62323
[58] Hastie, T., Tibshirani, R. and Friedman, J. (2009). The Elements of Statistical Learning: Data Mining, Inference, and Prediction, 2nd ed. Springer, New York. · Zbl 1273.62005
[59] Hothorn, T., Bühlmann, P., Dudoit, S., Molinaro, A. and van der Laan, M. (2006). Survival ensembles. Biostatistics 7 355-373. · Zbl 1170.62385
[60] Ishwaran, H., Kogalur, U. B., Blackstone, E. H. and Lauer, M. S. (2008). Random survival forests. Ann. Appl. Statist. 2 841-860. · Zbl 1149.62331
[61] Izenman, A. (2008). Modern Multivariate Statistical Techniques. Springer, New York. · Zbl 1155.62040
[62] Jiang, W. (2004). Process consistency for AdaBoost. Ann. Statist. 32 13-29. · Zbl 1105.62316
[63] Liaw, A. and Wiener, M. (2002). Classification and regression by randomForest. R News 2 18-22.
[64] Lin, Y. and Jeon, Y. (2006). Random forests and adaptive nearest neighbors. J. Amer. Statist. Assoc. 101 578-590. · Zbl 1119.62304
[65] Meinshausen, N. (2006). Quantile regression forests. J. Mach. Learn. Res. 7 983-999. · Zbl 1222.68262
[66] Minsky, M. and Papert, S. (1969). Perceptrons: An Introduction to Computational Geometry. MIT Press, Cambridge, MA. · Zbl 0197.43702
[67] Ng, V. W. and Breiman, L. (2005). Bivariate variable selection for classification problem. Technical Report 692. Dept. Statistics, Univ. California, Berkeley, CA.
[68] Olshen, R. (2001). A conversation with Leo Breiman. Statist. Sci. 16 184-198. · Zbl 1059.01542
[69] Schapire, R. (1990). The strength of weak learnability. Mach. Learn. 5 197-227.
[70] Schapire, R., Freund, Y., Bartlett, P. and Lee, W. (1998). Boosting the margin: A new explanation for the effectiveness of voting methods. Ann. Statist. 26 1651-1686. · Zbl 0929.62069
[71] Shang, N. and Breiman, L. (1996). Distribution based trees are more accurate. In Proceedings of the Int. Conf. on Neural Information Processing, Hong Kong 133-138. Springer, Singapore.
[72] Smith, P. (1982). Curve fitting and modeling with splines using statistical variable selection techniques. NASA Report 166034. NASA, Langley Research Center, Hampton, VA.
[73] Tibshirani, R. (1996). Regression shrinkage and selection via the lasso. J. Roy. Statist. Soc. Ser. B 58 267-288. · Zbl 0850.62538
[74] Wolpert, D. (1992). Stacked generalization. Neural Networks 5 241-259.
[75] Zhang, T. (2004). Statistical behavior and consistency of classification methods based on convex risk minimization. Ann. Statist. 32 56-134. · Zbl 1105.62323