Statistical modeling: The two cultures. (With comments and a rejoinder). (English) Zbl 1059.62505

Summary: There are two cultures in the use of statistical modeling to reach conclusions from data. One assumes that the data are generated by a given stochastic data model. The other uses algorithmic models and treats the data mechanism as unknown. The statistical community has been committed to the almost exclusive use of data models. This commitment has led to irrelevant theory and questionable conclusions, and has kept statisticians from working on a large range of interesting current problems. Algorithmic modeling, both in theory and practice, has developed rapidly in fields outside statistics. It can be used both on large complex data sets and as a more accurate and informative alternative to data modeling on smaller data sets. If our goal as a field is to use data to solve problems, then we need to move away from exclusive dependence on data models and adopt a more diverse set of tools.


62A01 Foundations and philosophical topics in statistics

