zbMATH — the first resource for mathematics

Computationally efficient confidence intervals for cross-validated area under the ROC curve estimates. (English) Zbl 1327.62298
Summary: In binary classification problems, the area under the ROC curve (AUC) is commonly used to evaluate the performance of a prediction model. Often, it is combined with cross-validation in order to assess how the results will generalize to an independent data set. In order to evaluate the quality of an estimate for cross-validated AUC, we obtain an estimate of its variance. For massive data sets, the process of generating a single performance estimate can be computationally expensive. Additionally, when using a complex prediction method, the process of cross-validating a predictive model on even a relatively small data set can still require a large amount of computation time. Thus, in many practical settings, the bootstrap is a computationally intractable approach to variance estimation. As an alternative to the bootstrap, we demonstrate a computationally efficient influence curve based approach to obtaining a variance estimate for cross-validated AUC.
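The influence curve based approach named in the summary can be illustrated with a short sketch. The summary itself gives no formula, so the form of the influence curve used below is an assumption: for a positive observation it is the fraction of negatives in the same validation fold with a smaller score, minus the fold AUC, scaled by 1/P(Y=1), and symmetrically for a negative observation. The function names `fold_auc` and `cv_auc_ci` are hypothetical, and every fold is assumed to contain both classes. This is not the authors' implementation, only a minimal illustration of why one pass over the held-out predictions can replace repeated bootstrap refits.

```python
import numpy as np


def fold_auc(scores, labels):
    """Empirical AUC on one validation fold (ties count as one half)."""
    pos = scores[labels == 1]
    neg = scores[labels == 0]
    diff = pos[:, None] - neg[None, :]  # every (positive, negative) pair
    return np.mean((diff > 0) + 0.5 * (diff == 0))


def cv_auc_ci(scores, labels, folds, z=1.96):
    """Cross-validated AUC with an influence curve based standard error.

    scores: held-out predictions, one per observation
    labels: 0/1 outcomes
    folds:  validation fold id of each observation
    Assumes each fold contains at least one positive and one negative.
    """
    scores = np.asarray(scores, dtype=float)
    labels = np.asarray(labels, dtype=int)
    folds = np.asarray(folds)
    n = labels.size
    w1 = n / np.sum(labels == 1)  # 1 / P(Y = 1), estimated on all data
    w0 = n / np.sum(labels == 0)  # 1 / P(Y = 0), estimated on all data

    aucs, ic_sq = [], []
    for v in np.unique(folds):
        s, y = scores[folds == v], labels[folds == v]
        pos, neg = s[y == 1], s[y == 0]
        auc = fold_auc(s, y)
        # For each positive: fraction of negatives scored strictly lower.
        frac_neg_below = np.mean(neg[None, :] < pos[:, None], axis=1)
        # For each negative: fraction of positives scored strictly higher.
        frac_pos_above = np.mean(pos[None, :] > neg[:, None], axis=1)
        ic = np.concatenate([w1 * (frac_neg_below - auc),
                             w0 * (frac_pos_above - auc)])
        aucs.append(auc)
        ic_sq.append(np.mean(ic ** 2))

    cv_auc = float(np.mean(aucs))
    se = float(np.sqrt(np.mean(ic_sq) / n))  # variance estimate divided by n
    ci = (max(cv_auc - z * se, 0.0), min(cv_auc + z * se, 1.0))
    return cv_auc, se, ci
```

Calling `cv_auc_ci(scores, labels, folds)` on the held-out predictions of a cross-validated model returns the point estimate, its standard error, and a Wald-type interval truncated to [0, 1]. The model is fit only once per fold, whereas a bootstrap would repeat the whole cross-validation for every resample.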
MSC:
62G15 Nonparametric tolerance and confidence regions
62G05 Nonparametric estimation
62G20 Asymptotic properties of nonparametric inference
References:
[1] Ling, C., Huang, J., and Zhang, H. (2003). AUC: a statistically consistent and more discriminating measure than accuracy. Proceedings of IJCAI 2003.
[2] Bradley, A. P. (1997). The use of the area under the ROC curve in the evaluation of machine learning algorithms. Pattern Recognition 30, 1145-1159.
[3] Geisser, S. (1975). The predictive sample reuse method with applications. J. Amer. Statist. Assoc. 70, 320-328. · Zbl 0321.62077 · doi:10.2307/2285815
[4] Kleiner, A., Talwalkar, A., Sarkar, P., and Jordan, M. (2013). A scalable bootstrap for massive data. Journal of the Royal Statistical Society, Series B.
[5] Sing, T., Sander, O., Beerenwinkel, N., and Lengauer, T. (2005). ROCR: Visualizing classifier performance in R. Bioinformatics 21, 20, 3940-3941.
[6] Venables, W. N. and Ripley, B. D. (2002). Modern Applied Statistics with S, Fourth ed. Springer, New York. · Zbl 1006.62003 · doi:10.1007/b97626
[7] Allen, D. M. (1974). The relationship between variable selection and data augmentation and a method for prediction. Technometrics 16, 125-127. · Zbl 0286.62044 · doi:10.2307/1267500
[8] Bezanson, J., Karpinski, S., Shah, V. B., and Edelman, A. (2012). Julia: A fast dynamic language for technical computing. CoRR abs/1209.5145. http://arxiv.org/abs/1209.5145. · Zbl 1356.68030
[9] Bickel, P. J., Götze, F., and van Zwet, W. R. (1997). Resampling fewer than \(n\) observations: gains, losses, and remedies for losses. Statist. Sinica 7, 1, 1-31. Empirical Bayes, sequential analysis and related topics in statistics and probability (New Brunswick, NJ, 1995). · Zbl 0927.62043
[10] Bickel, P. J., Klaassen, C. A. J., Ritov, Y., and Wellner, J. A. (1993). Efficient and adaptive estimation for semiparametric models. Johns Hopkins Series in the Mathematical Sciences. Johns Hopkins University Press, Baltimore, MD. · Zbl 0786.62001
[11] Efron, B. (1979). Bootstrap methods: another look at the jackknife. Ann. Statist. 7, 1, 1-26. · Zbl 0406.62024 · doi:10.1214/aos/1176344552
[12] Efron, B. and Tibshirani, R. J. (1993). An introduction to the bootstrap. Monographs on Statistics and Applied Probability, Vol. 57. Chapman and Hall, New York. · Zbl 0835.62038
[13] Friedman, J., Hastie, T., and Tibshirani, R. (2010). Regularization paths for generalized linear models via coordinate descent. Journal of Statistical Software 33, 1, 1-22. http://www.jstatsoft.org/v33/i01/.
[14] Gill, R. D. (1989). Non- and semi-parametric maximum likelihood estimators and the von Mises method. I. Scand. J. Statist. 16, 2, 97-128. With a discussion by J. A. Wellner and J. Præstgaard and a reply by the author. · Zbl 0688.62026
[15] Kornblith, S. (2014). GLMNet.jl: Julia wrapper for fitting Lasso/ElasticNet GLM models using glmnet. Commit version 0526df8455, https://github.com/simonster/GLMNet.jl.
[16] LeDell, E., Petersen, M., and van der Laan, M. (2013). cvAUC: Cross-Validated Area Under the ROC Curve Confidence Intervals. R package version 1.0-0, http://CRAN.R-project.org/package=cvAUC.
[17] Lin, D. (2014). A set of functions to support the development of machine learning algorithms. v0.4.2, https://github.com/JuliaStats/MLBase.jl.
[18] Lin, D. and White, J. M. (2014). A Julia package for probability distributions and associated functions. v0.5.4, https://github.com/JuliaStats/Distributions.jl.
[19] Politis, D. N., Romano, J. P., and Wolf, M. (1999). Subsampling. Springer Series in Statistics. Springer-Verlag, New York. http://dx.doi.org/10.1007/978-1-4612-1554-7. · Zbl 0931.62035
[20] Shao, J. (1993). Linear model selection by cross-validation. J. Amer. Statist. Assoc. 88, 422, 486-494. · Zbl 0773.62051 · doi:10.2307/2290328
[21] Stone, M. (1974). Cross-validatory choice and assessment of statistical predictions. J. Roy. Statist. Soc. Ser. B 36, 111-147. With discussion by G. A. Barnard, A. C. Atkinson, L. K. Chan, A. P. Dawid, F. Downton, J. Dickey, A. G. Baker, O. Barndorff-Nielsen, D. R. Cox, S. Geisser, D. Hinkley, R. R. Hocking, and A. S. Young, and with a reply by the authors. · Zbl 0308.62063
[22] van der Vaart, A. W. and Wellner, J. A. (1996). Weak convergence and empirical processes. Springer Series in Statistics. Springer-Verlag, New York. With applications to statistics. · Zbl 0862.60002
[23] Zheng, W. and van der Laan, M. J. (2011). Targeted maximum likelihood estimation of natural direct effect. Tech. Rep. 288, U.C. Berkeley Division of Biostatistics Working Paper Series.
This reference list is based on information provided by the publisher or from digital mathematics libraries. Its items are heuristically matched to zbMATH identifiers and may contain data conversion errors. It attempts to reflect the references listed in the original paper as accurately as possible without claiming the completeness or perfect precision of the matching.