Rejoinder: Models as approximations. (English) Zbl 1440.62022

Summary: We respond to the discussants of our articles, emphasizing the importance of inference under misspecification in the context of the reproducibility/replicability crisis. Along the way, we discuss the roles of diagnostics and model building in regression as well as connections between our well-specification framework and semiparametric theory.
Reply to the comments [Zbl 1440.62026; Zbl 1440.62024; Zbl 1440.62029; Zbl 1440.62027; Zbl 1440.62023; Zbl 1440.62031; Zbl 1440.62028; Zbl 1440.62025] on the authors’ papers [ibid. 34, No. 4, 523–544 (2019; Zbl 1440.62020); ibid. 34, No. 4, 545–565 (2019; Zbl 1440.62021)].

MSC:

62A01 Foundations and philosophical topics in statistics
62J05 Linear regression; mixed models
62J20 Diagnostics, and linear inference and regression
62D20 Causal inference from observational studies

References:

[1] Adam, D. (2019). Psychology’s reproducibility solution fails first test. Science 364 813. doi:10.1126/science.364.6443.813
[2] Aronow, P. M. and Miller, B. T. (2019). Foundations of Agnostic Statistics. Cambridge Univ. Press, Cambridge.
[3] Athey, S. and Imbens, G. (2017). The econometrics of randomized experiments. In Handbook of Economic Field Experiments 1 73-140. Elsevier, Amsterdam.
[4] Azriel, D., Brown, L. D., Sklar, M., Berk, R., Buja, A. and Zhao, L. (2016). Semi-supervised linear regression. Available at arXiv:1612.02391.
[5] Berk, R., Buja, A., Brown, L., George, E., Kuchibhotla, A. K., Su, W. and Zhao, L. (2019). Assumption lean regression. Amer. Statist. doi:10.1080/00031305.2019.1592781
[6] Berk, R., Olson, M., Buja, A. and Ouss, A. (2020). Using recursive partitioning to find and estimate heterogeneous treatment effects in randomized clinical trials. J. Exp. Criminol. To appear. Available at arXiv.org/abs/1807.04164.
[7] Boos, D. D. (1992). On generalized score tests. Amer. Statist. 46 327-333.
[8] Breiman, L. and Friedman, J. H. (1985). Estimating optimal transformations for multiple regression and correlation. J. Amer. Statist. Assoc. 80 580-619. · Zbl 0594.62044 · doi:10.1080/01621459.1985.10478157
[9] Buja, A., Stuetzle, W. and Yi, S. (2005). Loss Functions for Binary Class Probability Estimation and Classification: Structure and Applications. Unpublished manuscript. Available at www-stat.wharton.upenn.edu/~buja.
[10] Cantoni, E. and Ronchetti, E. (2001). Robust inference for generalized linear models. J. Amer. Statist. Assoc. 96 1022-1030. · Zbl 1072.62610 · doi:10.1198/016214501753209004
[11] Davies, L. (2014). Data Analysis and Approximate Models: Model Choice, Location-Scale, Analysis of Variance, Nonparametric Regression and Image Analysis. Monographs on Statistics and Applied Probability 133. CRC Press, Boca Raton, FL. · Zbl 1360.62007
[12] Elliott, G., Ghanem, D. and Krüger, F. (2016). Forecasting conditional probabilities of binary outcomes under misspecification. The Review of Economics and Statistics 98 742-755.
[13] EMA, FDA (2017). ICH E9(R1) Addendum on Estimands and Sensitivity Analysis in Clinical Trials. www.ema.europa.eu/en/ich-e9-statistical-principles-clinical-trials, www.regulations.gov/docket?D=FDA-2017-D-6113.
[14] Gneiting, T. and Raftery, A. E. (2007). Strictly proper scoring rules, prediction, and estimation. J. Amer. Statist. Assoc. 102 359-378. · Zbl 1284.62093 · doi:10.1198/016214506000001437
[15] Godambe, V. P. and Thompson, M. E. (1984). Robust estimation through estimating equations. Biometrika 71 115-125. · Zbl 0554.62030 · doi:10.1093/biomet/71.1.115
[16] Hartman, N. (2014). Who really found the Higgs Boson. https://getpocket.com/explore/item/who-really-found-the-higgs-boson.
[17] Huber, P. J. (1967). The behavior of maximum likelihood estimates under nonstandard conditions. In Proc. Fifth Berkeley Sympos. Math. Statist. and Probability (Berkeley, Calif., 1965/66), Vol. I: Statistics 221-233. Univ. California Press, Berkeley, CA. · Zbl 0212.21504
[18] Ioannidis, J. P. A. (2005). Why most published research findings are false. Chance 18 40-47.
[19] Koller, M. and Stahel, W. A. (2017). Nonsingular subsampling for regression S estimators with categorical predictors. Comput. Statist. 32 631-646. · Zbl 1417.65043 · doi:10.1007/s00180-016-0679-x
[20] Kuchibhotla, A. K., Brown, L. D. and Buja, A. (2018a). Model-free study of ordinary least squares linear regression. Available at arXiv:1809.10538.
[21] Kuchibhotla, A. K., Brown, L. D., Buja, A., George, E. I. and Zhao, L. (2018b). A model free perspective for linear regression: Uniform-in-model bounds for post selection inference. Available at arXiv:1802.05801.
[22] Lei, J., G’Sell, M., Rinaldo, A., Tibshirani, R. J. and Wasserman, L. (2018). Distribution-free predictive inference for regression. J. Amer. Statist. Assoc. 113 1094-1111. · Zbl 1402.62155 · doi:10.1080/01621459.2017.1307116
[23] McCarthy, D., Zhang, K., Brown, L. D., Berk, R., Buja, A., George, E. I. and Zhao, L. (2018). Calibrated percentile double bootstrap for robust linear regression inference. Statist. Sinica 28 2565-2589. · Zbl 1406.62076
[24] McCullagh, P. and Nelder, J. A. (1983). Generalized linear models. Chapman and Hall, London. · Zbl 0588.62104
[25] Newey, W. K. (1994). The asymptotic variance of semiparametric estimators. Econometrica 62 1349-1382. · Zbl 0816.62034 · doi:10.2307/2951752
[26] Newey, W. K., Hsieh, F. and Robins, J. M. (2004). Twicing kernels and a small bias property of semiparametric estimators. Econometrica 72 947-962. · Zbl 1091.62024 · doi:10.1111/j.1468-0262.2004.00518.x
[27] Pearl, J. (2009). Causality: Models, Reasoning, and Inference, 2nd ed. Cambridge Univ. Press, Cambridge. · Zbl 1188.68291
[28] Peters, J., Bühlmann, P. and Meinshausen, N. (2016). Causal inference by using invariant prediction: Identification and confidence intervals. J. R. Stat. Soc. Ser. B. Stat. Methodol. 78 947-1012. · Zbl 1414.62297 · doi:10.1111/rssb.12167
[29] Pitkin, E., Berk, R., Brown, L., Buja, A., George, E., Zhang, K. and Zhao, L. (2013). Improved precision in estimating average treatment effects. Available at arXiv:1311.0291.
[30] Shah, R. and Peters, J. (2018). The hardness of conditional independence testing and the generalised covariance measure. Available at arXiv:1804.07203.
[31] Simmons, J. P., Nelson, L. D. and Simonsohn, U. (2011). False-positive psychology: Undisclosed flexibility in data collection and analysis allows presenting anything as significant. Psychol. Sci. 22 1359-1366.
[32] Szpiro, A. A., Rice, K. M. and Lumley, T. (2010). Model-robust regression and a Bayesian “sandwich” estimator. Ann. Appl. Stat. 4 2099-2113. · Zbl 1220.62025 · doi:10.1214/10-AOAS362
[33] Steinberger, L. and Leeb, H. (2018). Conditional predictive inference for high-dimensional stable algorithms. Available at arXiv:1809.01412v1. · Zbl 1390.60067 · doi:10.3150/16-BEJ888
[34] Stoker, T. M. (1986). Consistent estimation of scaled coefficients. Econometrica 54 1461-1481. · Zbl 0628.62105 · doi:10.2307/1914309
[35] White, H. (1980a). Using least squares to approximate unknown regression functions. Internat. Econom. Rev. 21 149-170. · Zbl 0444.62119 · doi:10.2307/2526245
[36] White, H. · Zbl 0459.62051 · doi:10.2307/1912934