A simulation based method for assessing the statistical significance of logistic regression models after common variable selection procedures. (English) Zbl 1385.62022

Summary: Classification models can show apparent prediction accuracy even when there is no underlying relationship between the predictors and the response. Variable selection procedures can produce false-positive variable selections and overestimate true model performance. A simulation study was conducted using logistic regression with forward stepwise, best subsets, and LASSO variable selection, varying the total sample size (20, 50, 100, 200) and the number of random noise predictor variables (3, 5, 10, 15, 20, 50). The critical values obtained from this simulation can help reduce needless follow-up on variables having no true association with the outcome.
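The idea of a simulation-based critical value can be illustrated with a minimal sketch; this is not the authors' procedure. It makes several simplifying assumptions that are not in the summary: the outcome is balanced, the noise predictors are independent standard normal, and "selection" means keeping the single noise predictor with the largest rank-based AUC (folded at 0.5) rather than fitting logistic regression with stepwise, best subsets, or LASSO selection. The function names `auc`, `best_null_auc`, and `critical_value` are hypothetical.

```python
import random

def auc(scores, labels):
    """Mann-Whitney estimate of the area under the ROC curve."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum((a > b) + 0.5 * (a == b) for a in pos for b in neg)
    return wins / (len(pos) * len(neg))

def best_null_auc(n, p, rng):
    """Largest apparent AUC among p pure-noise predictors in one data set."""
    # Assumption: balanced binary outcome with n // 2 events.
    labels = [1] * (n // 2) + [0] * (n - n // 2)
    best = 0.5
    for _ in range(p):
        x = [rng.gauss(0.0, 1.0) for _ in range(n)]  # noise, unrelated to labels
        a = auc(x, labels)
        best = max(best, a, 1.0 - a)  # either direction of association counts
    return best

def critical_value(n, p, reps=500, alpha=0.05, seed=1):
    """Empirical (1 - alpha) quantile of the selected AUC under pure noise."""
    rng = random.Random(seed)
    draws = sorted(best_null_auc(n, p, rng) for _ in range(reps))
    k = min(reps - 1, round((1 - alpha) * reps))
    return draws[k]

if __name__ == "__main__":
    # Grid taken from the summary; a small subset keeps the run short.
    for n in (20, 50, 100):
        for p in (3, 10, 20):
            print(n, p, round(critical_value(n, p, reps=300), 3))
```

Under these assumptions, an observed AUC would be treated as noteworthy only if it exceeds the simulated threshold for the matching sample size and number of candidate predictors. The threshold is expected to rise as predictors are added and to fall as the sample grows.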

MSC:

62J12 Generalized linear models (logistic models)
62J07 Ridge regression; shrinkage estimators (Lasso)

Software:

covTest
