To explain or to predict? (English) Zbl 1329.62045
Summary: Statistical modeling is a powerful tool for developing and testing theories by way of causal explanation, prediction, and description. In many disciplines there is near-exclusive use of statistical modeling for causal explanation and the assumption that models with high explanatory power are inherently of high predictive power. Conflation between explanation and prediction is common, yet the distinction must be understood for progressing scientific knowledge. While this distinction has been recognized in the philosophy of science, the statistical literature lacks a thorough discussion of the many differences that arise in the process of modeling for an explanatory versus a predictive goal. The purpose of this article is to clarify the distinction between explanatory and predictive modeling, to discuss its sources, and to reveal the practical implications of the distinction to each step in the modeling process.

62A01 Foundations and philosophical topics in statistics
[1] Afshartous, D. and de Leeuw, J. (2005). Prediction in multilevel models. J. Educ. Behav. Statist. 30 109-139. · Zbl 1078.62072
[2] Aitchison, J. and Dunsmore, I. R. (1975). Statistical Prediction Analysis. Cambridge Univ. Press. · Zbl 0327.62043
[3] Bajari, P. and Hortacsu, A. (2003). The winner’s curse, reserve prices and endogenous entry: Empirical insights from eBay auctions. Rand J. Econ. 34 329-355.
[4] Bajari, P. and Hortacsu, A. (2004). Economic insights from internet auctions. J. Econ. Liter. 42 457-486.
[5] Bapna, R., Jank, W. and Shmueli, G. (2008). Price formation and its dynamics in online auctions. Decision Support Systems 44 641-656.
[6] Bell, R. M., Koren, Y. and Volinsky, C. (2008). The BellKor 2008 solution to the Netflix Prize.
[7] Bell, R. M., Koren, Y. and Volinsky, C. (2010). All together now: A perspective on the Netflix Prize. Chance 23 24. · Zbl 1060.91045
[8] Berk, R. A. (2008). Statistical Learning from a Regression Perspective. Springer, New York. · Zbl 1258.62047
[9] Bjornstad, J. F. (1990). Predictive likelihood: A review. Statist. Sci. 5 242-265. · Zbl 0955.62517
[10] Bühlmann, P. and Hothorn, T. (2007). Boosting algorithms: Regularization, prediction and model fitting. Statist. Sci. 22 477-505. · Zbl 1246.62163
[11] Breiman, L. (1996). Bagging predictors. Mach. Learn. 24 123-140. · Zbl 0867.62055
[12] Breiman, L. (2001a). Random forests. Mach. Learn. 45 5-32. · Zbl 1007.68152
[13] Breiman, L. (2001b). Statistical modeling: The two cultures. Statist. Sci. 16 199-215. · Zbl 1059.62505
[14] Brown, P. J., Vannucci, M. and Fearn, T. (2002). Bayes model averaging with selection of regressors. J. R. Stat. Soc. Ser. B Stat. Methodol. 64 519-536. · Zbl 1073.62004
[15] Campbell, J. Y. and Thompson, S. B. (2005). Predicting excess stock returns out of sample: Can anything beat the historical average? Harvard Institute of Economic Research Working Paper 2084.
[16] Carte, T. A. and Craig, J. R. (2003). In pursuit of moderation: Nine common errors and their solutions. MIS Quart. 27 479-501.
[17] Chakraborty, S. and Sharma, S. K. (2007). Prediction of corporate financial health by artificial neural network. Int. J. Electron. Fin. 1 442-459.
[18] Chen, S.-H., Ed. (2002). Genetic Algorithms and Genetic Programming in Computational Finance. Kluwer, Dordrecht.
[19] Collopy, F., Adya, M. and Armstrong, J. (1994). Principles for examining predictive validity: The case of information systems spending forecasts. Inform. Syst. Res. 5 170-179.
[20] Dalkey, N. and Helmer, O. (1963). An experimental application of the Delphi method to the use of experts. Manag. Sci. 9 458-467.
[21] Dawid, A. P. (1984). Present position and potential developments: Some personal views: Statistical theory: The prequential approach. J. Roy. Statist. Soc. Ser. A 147 278-292. · Zbl 0557.62080
[22] Ding, Y. and Simonoff, J. (2010). An investigation of missing data methods for classification trees applied to binary response data. J. Mach. Learn. Res. 11 131-170. · Zbl 1242.62052
[23] Domingos, P. (2000). A unified bias-variance decomposition for zero-one and squared loss. In Proceedings of the Seventeenth National Conference on Artificial Intelligence 564-569. AAAI Press, Austin, TX.
[24] Dowe, D. L., Gardner, S. and Oppy, G. R. (2007). Bayes not bust! Why simplicity is no problem for Bayesians. Br. J. Philos. Sci. 58 709-754. · Zbl 1136.03301
[25] Dubin, R. (1969). Theory Building. The Free Press, New York.
[26] Edwards, J. R. and Bagozzi, R. P. (2000). On the nature and direction of relationships between constructs. Psychological Methods 5 155-174.
[27] Ehrenberg, A. and Bound, J. (1993). Predictability and prediction. J. Roy. Statist. Soc. Ser. A 156 167-206.
[28] Fama, E. F. and French, K. R. (1993). Common risk factors in stock and bond returns. J. Fin. Econ. 33 3-56. · Zbl 1131.91335
[29] Farmer, J. D., Patelli, P. and Zovko, I. I. (2005). The predictive power of zero intelligence in financial markets. Proc. Natl. Acad. Sci. USA 102 2254-2259.
[30] Fayyad, U. M., Grinstein, G. G. and Wierse, A. (2002). Information Visualization in Data Mining and Knowledge Discovery. Morgan Kaufmann, San Francisco, CA.
[31] Feelders, A. (2002). Data mining in economic science. In Dealing with the Data Flood 166-175. STT/Beweton, Den Haag, The Netherlands.
[32] Findley, D. F. and Parzen, E. (1998). A conversation with Hirotugu Akaike. In Selected Papers of Hirotugu Akaike 3-16. Springer, New York. · Zbl 1148.01309
[33] Forster, M. (2002). Predictive accuracy as an achievable goal of science. Philos. Sci. 69 S124-S134.
[34] Forster, M. and Sober, E. (1994). How to tell when simpler, more unified, or less ad-hoc theories will provide more accurate predictions. Br. J. Philos. Sci. 45 1-35. · Zbl 1135.03310
[35] Friedman, J. H. (1997). On bias, variance, 0/1-loss, and the curse-of-dimensionality. Data Mining and Knowledge Discovery 1 55-77.
[36] Gefen, D., Karahanna, E. and Straub, D. (2003). Trust and TAM in online shopping: An integrated model. MIS Quart. 27 51-90.
[37] Geisser, S. (1975). The predictive sample reuse method with applications. J. Amer. Statist. Assoc. 70 320-328. · Zbl 0321.62077
[38] Geisser, S. (1993). Predictive Inference: An Introduction. Chapman and Hall, London. · Zbl 0824.62001
[39] Gelman, A., Carlin, J. B., Stern, H. S. and Rubin, D. B. (2003). Bayesian Data Analysis, 2nd ed. Chapman & Hall/CRC New York/Boca Raton, FL. · Zbl 1279.62004
[40] Ghani, R. and Simmons, H. (2004). Predicting the end-price of online auctions. In International Workshop on Data Mining and Adaptive Modelling Methods for Economics and Management, Pisa, Italy.
[41] Goyal, A. and Welch, I. (2007). A comprehensive look at the empirical performance of equity premium prediction. Rev. Fin. Stud. 21 1455-1508.
[42] Granger, C. (1969). Investigating causal relations by econometric models and cross-spectral methods. Econometrica 37 424-438. · Zbl 1366.91115
[43] Greenberg, E. and Parks, R. P. (1997). A predictive approach to model selection and multicollinearity. J. Appl. Econom. 12 67-75.
[44] Gurbaxani, V. and Mendelson, H. (1990). An integrative model of information systems spending growth. Inform. Syst. Res. 1 23-46.
[45] Gurbaxani, V. and Mendelson, H. (1994). Modeling vs. forecasting: The case of information systems spending. Inform. Syst. Res. 5 180-190.
[46] Hagerty, M. R. and Srinivasan, S. (1991). Comparing the predictive powers of alternative multiple regression models. Psychometrika 56 77-85. · Zbl 04503083
[47] Hastie, T., Tibshirani, R. and Friedman, J. H. (2009). The Elements of Statistical Learning: Data Mining, Inference, and Prediction, 2nd ed. Springer, New York. · Zbl 1273.62005
[48] Hausman, J. A. (1978). Specification tests in econometrics. Econometrica 46 1251-1271. · Zbl 0397.62043
[49] Helmer, O. and Rescher, N. (1959). On the epistemology of the inexact sciences. Manag. Sci. 5 25-52.
[50] Hempel, C. and Oppenheim, P. (1948). Studies in the logic of explanation. Philos. Sci. 15 135-175.
[51] Hitchcock, C. and Sober, E. (2004). Prediction versus accommodation and the risk of overfitting. Br. J. Philos. Sci. 55 1-34.
[52] Jaccard, J. (2001). Interaction Effects in Logistic Regression. SAGE Publications, Thousand Oaks, CA. · Zbl 0972.62046
[53] Jank, W. and Shmueli, G. (2010). Modeling Online Auctions. Wiley, New York. · Zbl 1198.91007
[54] Jank, W., Shmueli, G. and Wang, S. (2008). Modeling price dynamics in online auctions via regression trees. In Statistical Methods in eCommerce Research. Wiley, New York.
[55] Jap, S. and Naik, P. (2008). Bidanalyzer: A method for estimation and selection of dynamic bidding models. Marketing Sci. 27 949-960.
[56] Johnson, W. and Geisser, S. (1983). A predictive view of the detection and characterization of influential observations in regression analysis. J. Amer. Statist. Assoc. 78 137-144. · Zbl 0509.62055
[57] Kadane, J. B. and Lazar, N. A. (2004). Methods and criteria for model selection. J. Amer. Statist. Assoc. 99 279-290. · Zbl 1089.62501
[58] Kendall, M. and Stuart, A. (1977). The Advanced Theory of Statistics 1, 4th ed. Griffin, London. · Zbl 0353.62013
[59] Konishi, S. and Kitagawa, G. (2007). Information Criteria and Statistical Modeling. Springer, New York. · Zbl 1172.62003
[60] Krishna, V. (2002). Auction Theory. Academic Press, San Diego, CA.
[61] Little, R. J. A. (2007). Should we use the survey weights to weight? JPSM Distinguished Lecture, Univ. Maryland.
[62] Little, R. J. A. and Rubin, D. B. (2002). Statistical Analysis with Missing Data. Wiley, New York. · Zbl 1011.62004
[63] Lucking-Reiley, D., Bryan, D., Prasad, N. and Reeves, D. (2007). Pennies from eBay: The determinants of price in online auctions. J. Indust. Econ. 55 223-233.
[64] Mackay, R. J. and Oldford, R. W. (2000). Scientific method, statistical method, and the speed of light. Working Paper 2000-02, Dept. Statistics and Actuarial Science, Univ. Waterloo. · Zbl 1059.62507
[65] Makridakis, S. G., Wheelwright, S. C. and Hyndman, R. J. (1998). Forecasting: Methods and Applications, 3rd ed. Wiley, New York.
[66] Montgomery, D., Peck, E. A. and Vining, G. G. (2001). Introduction to Linear Regression Analysis. Wiley, New York. · Zbl 0980.62051
[67] Mosteller, F. and Tukey, J. W. (1977). Data Analysis and Regression. Addison-Wesley, Reading, MA.
[68] Muller, J. and Brandl, R. (2009). Assessing biodiversity by remote sensing in mountainous terrain: The potential of lidar to predict forest beetle assemblages. J. Appl. Ecol. 46 897-905.
[69] Nabi, J., Kivimäki, M., Suominen, S., Koskenvuo, M. and Vahtera, J. (2010). Does depression predict coronary heart disease and cerebrovascular disease equally well? The health and social support prospective cohort study. Int. J. Epidemiol. 39 1016-1024.
[70] Palmgren, B. (1999). The need for financial models. ERCIM News 38 8-9.
[71] Parzen, E. (2001). Comment on statistical modeling: The two cultures. Statist. Sci. 16 224-226. · Zbl 1059.62505
[72] Patzer, G. L. (1995). Using Secondary Data in Marketing Research: United States and Worldwide. Greenwood Publishing, Westport, CT.
[73] Pavlou, P. and Fygenson, M. (2006). Understanding and predicting electronic commerce adoption: An extension of the theory of planned behavior. MIS Quart. 30 115-143.
[74] Pearl, J. (1995). Causal diagrams for empirical research. Biometrika 82 669-709. · Zbl 0860.62045
[75] Rosenbaum, P. and Rubin, D. B. (1983). The central role of the propensity score in observational studies for causal effects. Biometrika 70 41-55. · Zbl 0522.62091
[76] Rubin, D. B. (1997). Estimating causal effects from large data sets using propensity scores. Ann. Intern. Med. 127 757-763.
[77] Saar-Tsechansky, M. and Provost, F. (2007). Handling missing features when applying classification models. J. Mach. Learn. Res. 8 1625-1657. · Zbl 1222.68295
[78] Sarle, W. S. (1998). Prediction with missing inputs. In JCIS 98 Proceedings (P. Wang, ed.) II 399-402. Research Triangle Park, Durham, NC.
[79] Seni, G. and Elder, J. F. (2010). Ensemble Methods in Data Mining: Improving Accuracy Through Combining Predictions (Synthesis Lectures on Data Mining and Knowledge Discovery). Morgan and Claypool, San Rafael, CA.
[80] Shafer, G. (1996). The Art of Causal Conjecture. MIT Press, Cambridge, MA. · Zbl 0874.60003
[81] Schapire, R. E. (1999). A brief introduction to boosting. In Proceedings of the Sixteenth International Joint Conference on Artificial Intelligence 1401-1406. Stockholm, Sweden.
[82] Shmueli, G. and Koppius, O. R. (2010). Predictive analytics in information systems research. MIS Quart.
[83] Simon, H. A. (2001). Science seeks parsimony, not simplicity: Searching for pattern in phenomena. In Simplicity, Inference and Modelling: Keeping it Sophisticatedly Simple 32-72. Cambridge Univ. Press.
[84] Sober, E. (2002). Instrumentalism, parsimony, and the Akaike framework. Philos. Sci. 69 S112-S123.
[85] Song, H. and Witt, S. F. (2000). Tourism Demand Modelling and Forecasting: Modern Econometric Approaches. Pergamon Press, Oxford.
[86] Spirtes, P., Glymour, C. and Scheines, R. (2000). Causation, Prediction, and Search, 2nd ed. MIT Press, Cambridge, MA. · Zbl 0806.62001
[87] Stone, M. (1974). Cross-validatory choice and assessment of statistical predictions (with discussion). J. Roy. Statist. Soc. Ser. B 36 111-147. · Zbl 0308.62063
[88] Taleb, N. (2007). The Black Swan. Penguin Books, London.
[89] Van Maanen, J., Sorensen, J. and Mitchell, T. (2007). The interplay between theory and method. Acad. Manag. Rev. 32 1145-1154.
[90] Vaughan, T. S. and Berry, K. E. (2005). Using Monte Carlo techniques to demonstrate the meaning and implications of multicollinearity. J. Statist. Educ. 13 online.
[91] Wallis, W. A. (1980). The statistical research group, 1942-1945. J. Amer. Statist. Assoc. 75 320-330. · Zbl 0466.62001
[92] Wang, S., Jank, W. and Shmueli, G. (2008). Explaining and forecasting online auction prices and their dynamics using functional data analysis. J. Business Econ. Statist. 26 144-160.
[93] Winkelmann, R. (2008). Econometric Analysis of Count Data, 5th ed. Springer, New York. · Zbl 1032.62108
[94] Woit, P. (2006). Not Even Wrong: The Failure of String Theory and the Search for Unity in Physical Law. Jonathan Cape, London. · Zbl 1128.81025
[95] Wu, S., Harris, T. and McAuley, K. (2007). The use of simplified or misspecified models: Linear case. Canad. J. Chem. Eng. 85 386-398.
[96] Zellner, A. (1962). An efficient method of estimating seemingly unrelated regression equations and tests for aggregation bias. J. Amer. Statist. Assoc. 57 348-368. · Zbl 0113.34902
[97] Zellner, A. (2001). Keep it sophisticatedly simple. In Simplicity, Inference and Modelling: Keeping It Sophisticatedly Simple 242-261. Cambridge Univ. Press. · Zbl 1019.62003
[98] Zhang, S., Jank, W. and Shmueli, G. (2010). Real-time forecasting of online auctions via functional k-nearest neighbors. Int. J. Forecast. 26 666-683.