Concentration of weakly dependent Banach-valued sums and applications to statistical learning methods. (English) Zbl 1428.62185

Summary: We obtain a Bernstein-type inequality for sums of Banach-valued random variables that satisfy a general weak dependence assumption, under certain smoothness assumptions on the underlying Banach norm. We use this inequality to investigate, in the asymptotic regime, upper bounds on the error of the broad family of spectral regularization methods for reproducing kernel decision rules trained on a sample from a \(\tau\)-mixing process.
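
For orientation, the classical scalar Bernstein inequality, which results of this type generalize, can be written as below. This is the standard textbook form for independent real variables, not the Banach-valued, weakly dependent statement of the paper; the symbols \(X_i\), \(M\), \(n\) and \(t\) are introduced here purely for illustration. Let \(X_1,\dots,X_n\) be independent real random variables with \(\mathbb{E}[X_i]=0\) and \(|X_i|\le M\) almost surely. Then, for every \(t>0\),
\[
\mathbb{P}\Bigl(\sum_{i=1}^{n} X_i \ge t\Bigr) \le \exp\Biggl(-\frac{t^{2}/2}{\sum_{i=1}^{n}\mathbb{E}[X_i^{2}] + Mt/3}\Biggr).
\]
The bound is sub-Gaussian for small \(t\), where the variance term dominates, and sub-exponential for large \(t\), where the term \(Mt/3\) dominates.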


60E15 Inequalities; stochastic orderings
60B11 Probability theory on linear topological spaces
68T05 Learning and adaptive systems in artificial intelligence

