Conditional validity of inductive conformal predictors. (English) Zbl 1273.68307

Summary: Conformal predictors are set predictors that are automatically valid in the sense of having coverage probability equal to or exceeding a given confidence level. Inductive conformal predictors are a computationally efficient version of conformal predictors satisfying the same property of validity. However, inductive conformal predictors have only been known to control unconditional coverage probability. This paper explores various versions of conditional validity and various ways to achieve them using inductive conformal predictors and their modifications. In particular, it discusses a convenient expression of one of the modifications in terms of ROC curves.
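As an illustration of the unconditional validity property described in the summary, the following is a minimal sketch of an inductive conformal predictor for classification. It is not taken from the paper: the class-mean model, the distance-based nonconformity score, the synthetic Gaussian data, and all function names are assumptions chosen only to keep the example self-contained. The sketch splits the data into a proper training set and a calibration set, computes a p-value for each candidate label, and outputs the set of labels whose p-value exceeds the significance level epsilon.

```python
import random
from statistics import mean


def fit_class_means(proper_training):
    """Fit a deliberately simple underlying model: one mean per label (assumption)."""
    by_label = {}
    for x, y in proper_training:
        by_label.setdefault(y, []).append(x)
    return {y: mean(xs) for y, xs in by_label.items()}


def nonconformity(means, x, y):
    """Hypothetical nonconformity score: distance from x to the mean of class y."""
    return abs(x - means[y])


def icp_p_value(calibration_scores, test_score):
    """Share of scores (calibration plus the test example itself) that are
    at least as large as the test score."""
    at_least = sum(1 for a in calibration_scores if a >= test_score)
    return (at_least + 1) / (len(calibration_scores) + 1)


def predict_set(means, calibration_scores, x, epsilon):
    """Prediction set: every label whose p-value exceeds epsilon."""
    return {
        y
        for y in means
        if icp_p_value(calibration_scores, nonconformity(means, x, y)) > epsilon
    }


def sample(n, rng):
    """Synthetic two-class data: label y in {0, 1}, object x ~ N(2y, 1) (assumption)."""
    data = []
    for _ in range(n):
        y = rng.choice([0, 1])
        data.append((rng.gauss(2.0 * y, 1.0), y))
    return data


rng = random.Random(0)
proper, calibration, test = sample(200, rng), sample(200, rng), sample(1000, rng)

means = fit_class_means(proper)
calibration_scores = [nonconformity(means, x, y) for x, y in calibration]

epsilon = 0.1
hits = sum(1 for x, y in test if y in predict_set(means, calibration_scores, x, epsilon))
coverage = hits / len(test)
print(f"empirical coverage at epsilon={epsilon}: {coverage:.3f}")
```

The printed coverage is for one fixed training and calibration split. The guarantee of at least 1 - epsilon holds on average over such splits, so the value for a single split can fall on either side of that level; this gap is one motivation for studying conditional versions of validity.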


68T05 Learning and adaptive systems in artificial intelligence
62H30 Classification and discrimination; cluster analysis (statistical aspects)

