×

zbMATH — the first resource for mathematics

Empirical risk minimization for heavy-tailed losses. (English) Zbl 1326.62066
Summary: The purpose of this paper is to discuss empirical risk minimization when the losses are not necessarily bounded and may have a distribution with heavy tails. In such situations, usual empirical averages may fail to provide reliable estimates and empirical risk minimization may provide large excess risk. However, some robust mean estimators proposed in the literature may be used to replace empirical means. In this paper, we investigate empirical risk minimization based on a robust estimate proposed by O. Catoni [Ann. Inst. Henri Poincaré, Probab. Stat. 48, No. 4, 1148–1185 (2012; Zbl 1282.62070)]. We develop performance bounds based on chaining arguments tailored to Catoni’s mean estimator.

MSC:
62F35 Robustness and adaptive procedures (parametric inference)
62F12 Asymptotic properties of parametric estimators
62G20 Asymptotic properties of nonparametric inference
62G32 Statistics of extreme values; tail inference
62H30 Classification and discrimination; cluster analysis (statistical aspects)
PDF BibTeX Cite
Full Text: DOI Euclid
References:
[1] Abaya, E. F. and Wise, G. L. (1984). Convergence of vector quantizers with applications to optimal quantization. SIAM J. Appl. Math. 44 183-189. · Zbl 0532.60038
[2] Alon, N., Matias, Y. and Szegedy, M. (1999). The space complexity of approximating the frequency moments. J. Comput. System Sci. 58 137-147. · Zbl 0938.68153
[3] Antos, A. (2005). Improved minimax bounds on the test and training distortion of empirically designed vector quantizers. IEEE Trans. Inform. Theory 51 4022-4032. · Zbl 1284.94038
[4] Antos, A., Györfi, L. and György, A. (2005). Individual convergence rates in empirical vector quantizer design. IEEE Trans. Inform. Theory 51 4013-4022. · Zbl 1284.94039
[5] Audibert, J.-Y. and Catoni, O. (2011). Robust linear least squares regression. Ann. Statist. 39 2766-2794. · Zbl 1231.62126
[6] Bartlett, P. L., Linder, T. and Lugosi, G. (1998). The minimax distortion redundancy in empirical quantizer design. IEEE Trans. Inform. Theory 44 1802-1813. · Zbl 0964.94015
[7] Bartlett, P. L. and Mendelson, S. (2006). Empirical minimization. Probab. Theory Related Fields 135 311-334. · Zbl 1142.62348
[8] Biau, G., Devroye, L. and Lugosi, G. (2008). On the performance of clustering in Hilbert spaces. IEEE Trans. Inform. Theory 54 781-790. · Zbl 1304.62088
[9] Boucheron, S., Bousquet, O. and Lugosi, G. (2005). Theory of classification: A survey of some recent advances. ESAIM Probab. Stat. 9 323-375. · Zbl 1136.62355
[10] Boucheron, S., Lugosi, G. and Massart, P. (2013). Concentration Inequalities : A Nonasymptotic Theory of Independence , with a Foreword by Michel Ledoux . Oxford Univ. Press, Oxford. · Zbl 1279.60005
[11] Bubeck, S., Cesa-Bianchi, N. and Lugosi, G. (2013). Bandits with heavy tail. IEEE Trans. Inform. Theory 59 7711-7717. · Zbl 1364.62213
[12] Catoni, O. (2012). Challenging the empirical mean and empirical variance: A deviation study. Ann. Inst. Henri Poincaré Probab. Stat. 48 1148-1185. · Zbl 1282.62070
[13] Dudley, R. M. (1978). Central limit theorems for empirical measures. Ann. Probab. 6 899-929. · Zbl 0404.60016
[14] Embrechts, P., Klüppelberg, C. and Mikosch, T. (1997). Modelling Extremal Events for Insurance and Finance . Springer, Berlin. · Zbl 0873.62116
[15] Fama, E. F. (1963). Mandelbrot and the stable Paretian hypothesis. The Journal of Business 36 420-429.
[16] Finkenstadt, B. and Rootzén, H. (2003). Extreme Values in Finance , Telecommunications and the Enviroment . Chapman & Hall, New York.
[17] Hsu, D. and Sabato, S. (2013). Approximate loss minimization with heavy tails. Preprint. Available at . arXiv:1307.1827
[18] Koltchinskii, V. (2006). Local Rademacher complexities and oracle inequalities in risk minimization. Ann. Statist. 34 2593-2656. · Zbl 1118.62065
[19] Lerasle, M. and Oliveira, R. I. (2012). Robust empirical mean estimators. Manuscript.
[20] Levrard, C. (2013). Fast rates for empirical vector quantization. Electron. J. Stat. 7 1716-1746. · Zbl 1349.62038
[21] Linder, T. (2002). Learning-theoretic methods in vector quantization. In Principles of Nonparametric Learning ( Udine , 2001) (L. Györfi, ed.). CISM Courses and Lectures 434 163-210. Springer, Vienna.
[22] Mandelbrot, B. (1963). The variation of certain speculative prices. The Journal of Business 36 394-419.
[23] Massart, P. (2007). Concentration Inequalities and Model Selection. Lecture Notes in Math. 1896 . Springer, Berlin. · Zbl 1170.60006
[24] Matoušek, J. (2002). Lectures on Discrete Geometry. Graduate Texts in Mathematics 212 . Springer, New York. · Zbl 0999.52006
[25] Maurer, A. and Pontil, M. (2010). \(K\)-dimensional coding schemes in Hilbert spaces. IEEE Trans. Inform. Theory 56 5839-5846. · Zbl 1366.94305
[26] Mendelson, S. (2014). Learning without concentration. Preprint. Available at . arXiv:1401.0304 · Zbl 1333.68232
[27] Minsker, S. (2015). Geometric median and robust estimation in Banach spaces. Bernoulli 21 2308-2335. · Zbl 1348.60041
[28] Nemirovsky, A. S. and Yudin, D. B. (1983). Problem Complexity and Method Efficiency in Optimization . Wiley, New York. · Zbl 0501.90062
[29] Pollard, D. (1981). Strong consistency of \(k\)-means clustering. Ann. Statist. 9 135-140. · Zbl 0451.62048
[30] Pollard, D. (1982). A central limit theorem for \(k\)-means clustering. Ann. Probab. 10 919-926. · Zbl 0502.62055
[31] Pollard, D. (1982). Quantization and the method of \(k\)-means. IEEE Trans. Inform. Theory 28 199-205. · Zbl 0476.94010
[32] Talagrand, M. (2005). The Generic Chaining . Springer, Berlin. · Zbl 1075.60001
[33] Telgarsky, M. and Dasgupta, S. (2013). Moment-based uniform deviation bounds for \(k\)-means and friends. Preprint. Available at . arXiv:1311.1903
[34] van de Geer, S. A. (2000). Applications of Empirical Process Theory . Cambridge Univ. Press, Cambridge. · Zbl 0953.62049
[35] van der Vaart, A. W. and Wellner, J. A. (1996). Weak Convergence and Empirical Processes . Springer, New York. · Zbl 0862.60002
This reference list is based on information provided by the publisher or from digital mathematics libraries. Its items are heuristically matched to zbMATH identifiers and may contain data conversion errors. It attempts to reflect the references listed in the original paper as accurately as possible without claiming the completeness or perfect precision of the matching.