High-dimensional CLT: improvements, non-uniform extensions and large deviations. (English) Zbl 1475.60054

Summary: Central limit theorems (CLTs) for high-dimensional random vectors, with dimension possibly growing with the sample size, have received considerable attention in recent years. V. Chernozhukov et al. [Ann. Probab. 45, No. 4, 2309–2352 (2017; Zbl 1377.60040)] proved a Berry-Esseen type result for high-dimensional averages over the class of sparsely convex sets, which includes hyperrectangles as a special case. They showed that the rate of convergence is bounded above by \(n^{-1/6}\) up to a polynomial factor of \(\log p\) (where \(n\) is the sample size and \(p\) the dimension), so that convergence of the bound to zero requires \(\log^7p=o(n)\). We improve upon their result for hyperrectangles: our bound requires only \(\log^4p=o(n)\) (in the best case). This improvement is made possible by a sharper dimension-free anti-concentration inequality for a Gaussian process on a compact metric space. In addition, we prove two non-uniform variants of the high-dimensional CLT based on the large deviation and non-uniform CLT results for random variables in a Banach space by Bentkus, Rackauskas, and Paulauskas. We apply our results to post-selection inference in linear regression and to empirical processes.
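
To make the two growth conditions in the summary concrete, the following minimal sketch evaluates the ratios \(\log^7p/n\) and \(\log^4p/n\) for a few sample sizes and dimensions. It is illustrative only: the function name `growth_ratio` and the chosen \((n, p)\) pairs are assumptions made here, and the ratios ignore the constants and the exact form of the bounds in the cited papers.

```python
import math


def growth_ratio(n: int, p: int, power: int) -> float:
    """Return log(p)**power / n, the quantity that a condition of the
    form log^power(p) = o(n) requires to tend to zero."""
    if n <= 0 or p <= 1:
        raise ValueError("need n >= 1 and p >= 2")
    return math.log(p) ** power / n


if __name__ == "__main__":
    # Hypothetical (n, p) pairs, chosen only for illustration.
    for n, p in [(10**3, 10**2), (10**6, 10**3), (10**9, 10**4)]:
        r7 = growth_ratio(n, p, 7)  # condition log^7 p = o(n)
        r4 = growth_ratio(n, p, 4)  # weaker condition log^4 p = o(n)
        print(f"n={n:>10} p={p:>6}  log^7 p/n={r7:.3e}  log^4 p/n={r4:.3e}")
```

Whenever \(\log p>1\), the fourth-power ratio is smaller than the seventh-power ratio, which is why the condition \(\log^4p=o(n)\) permits faster growth of \(p\) relative to \(n\).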

MSC:

60F05 Central limit and other weak theorems
60F10 Large deviations
60B05 Probability measures on topological spaces
60G15 Gaussian processes

Citations:

Zbl 1377.60040

References:

[1] Bachoc, F., Blanchard, G. and Neuvial, P. (2018). On the post selection inference constant under restricted isometry properties. Electron. J. Stat. 12 3736-3757. · Zbl 1406.62074 · doi:10.1214/18-EJS1490
[2] Banerjee, D., Kuchibhotla, A.K. and Mukherjee, S. (2018). Cramér-type large deviation and non-uniform central limit theorems in high dimensions. Preprint. Available at arXiv:1806.06153v1.
[3] Baraud, Y. (2016). Bounding the expectation of the supremum of an empirical process over a (weak) VC-major class. Electron. J. Stat. 10 1709-1728. · Zbl 1385.60038 · doi:10.1214/15-EJS1055
[4] Belloni, A., Chernozhukov, V., Chetverikov, D., Hansen, C. and Kato, K. (2018). High-dimensional econometrics and regularized GMM. Preprint. Available at arXiv:1806.01888.
[5] Bentkus, V. (1990). Smooth approximations of the norm and differentiable functions with bounded support in Banach space \(l_{\infty}^k\). Lith. Math. J. 30 223-230. · Zbl 0725.46009 · doi:10.1007/BF00970805
[6] Bentkus, V. (2004). A Lyapunov type bound in \(\mathbf{R}^d \). Teor. Veroyatn. Primen. 49 400-410. · Zbl 1090.60019
[7] Bentkus, V., Götze, F., Paulauskas, V. and Rackauskas, A. (2000). The accuracy of Gaussian approximation in Banach spaces. In Limit Theorems of Probability Theory 25-111. Berlin: Springer. · Zbl 0960.60008
[8] Bentkus, V. and Rackauskas, A. (1990). On probabilities of large deviations in Banach spaces. Probab. Theory Related Fields 86 131-154. · Zbl 0678.60005 · doi:10.1007/BF01474639
[9] Bentkus, V.Y. (1987). Large deviations in Banach spaces. Theory Probab. Appl. 31 627-632. · Zbl 0657.60014 · doi:10.1137/1131085
[10] Bentkus, V.Yu. (1985). Lower bounds for the rate of convergence in the central limit theorem in Banach spaces. Liet. Mat. Rink. 25 10-21. · Zbl 0588.60010
[11] Berk, R., Brown, L., Buja, A., Zhang, K. and Zhao, L. (2013). Valid post-selection inference. Ann. Statist. 41 802-837. · Zbl 1267.62080 · doi:10.1214/12-AOS1077 · Project Euclid: euclid.aos/1369836961
[12] Chernozhukov, V., Chetverikov, D. and Kato, K. (2013). Gaussian approximations and multiplier bootstrap for maxima of sums of high-dimensional random vectors. Ann. Statist. 41 2786-2819. · Zbl 1292.62030 · doi:10.1214/13-AOS1161 · Project Euclid: euclid.aos/1387313390
[13] Chernozhukov, V., Chetverikov, D. and Kato, K. (2014). Gaussian approximation of suprema of empirical processes. Ann. Statist. 42 1564-1597. · Zbl 1317.60038 · doi:10.1214/14-AOS1230 · Project Euclid: euclid.aos/1407420009
[14] Chernozhukov, V., Chetverikov, D. and Kato, K. (2015). Comparison and anti-concentration bounds for maxima of Gaussian random vectors. Probab. Theory Related Fields 162 47-70. · Zbl 1319.60072 · doi:10.1007/s00440-014-0565-9
[15] Chernozhukov, V., Chetverikov, D. and Kato, K. (2017). Detailed proof of Nazarov’s inequality. Preprint. Available at arXiv:1711.10696.
[16] Chernozhukov, V., Chetverikov, D. and Kato, K. (2017). Central limit theorems and bootstrap in high dimensions. Ann. Probab. 45 2309-2352. · Zbl 1377.60040 · doi:10.1214/16-AOP1113 · Project Euclid: euclid.aop/1502438428
[17] Chernozhukov, V., Chetverikov, D., Kato, K. and Koike, Y. (2019). Improved central limit theorem and bootstrap approximations in high dimensions. Preprint. Available at arXiv:1912.10529.
[18] Giné, E. (1976). Bounds for the speed of convergence in the central limit theorem in \(C(S)\). Z. Wahrsch. Verw. Gebiete 36 317-331. · Zbl 0351.60029
[19] Han, Q. (2019). Global empirical risk minimizers with “shape constraints” are rate optimal in general dimensions. Preprint. Available at arXiv:1905.12823.
[20] Han, Q. and Wellner, J.A. (2019). Convergence rates of least squares regression estimators with heavy-tailed errors. Ann. Statist. 47 2286-2319. · Zbl 1466.60033 · doi:10.1214/18-AOS1748 · Project Euclid: euclid.aos/1558425646
[21] Koike, Y. (2019). High-dimensional central limit theorems for homogeneous sums. Preprint. Available at arXiv:1902.03809.
[22] Koike, Y. (2019). Notes on the dimension dependence in high-dimensional central limit theorems for hyperrectangles. Preprint. Available at arXiv:1911.00160.
[23] Kuchibhotla, A.K., Brown, L.D., Buja, A., George, E.I. and Zhao, L. (2018). A model free perspective for linear regression: Uniform-in-model bounds for post selection inference. Preprint. Available at arXiv:1802.05801.
[24] Kuchibhotla, A.K., Brown, L.D., Buja, A., George, E.I. and Zhao, L. (2018). Valid post-selection inference in assumption-lean linear regression. Preprint. Available at arXiv:1806.04119.
[25] Kuchibhotla, A.K., Mukherjee, S. and Banerjee, D. (2020). Supplement to “High-dimensional CLT: Improvements, non-uniform extensions and large deviations.” https://doi.org/10.3150/20-BEJ1233SUPP
[26] Ledoux, M. and Talagrand, M. (2011). Probability in Banach Spaces: Isoperimetry and Processes. Classics in Mathematics. Berlin: Springer. · Zbl 1226.60003
[27] Lopes, M.E., Lin, Z. and Müller, H.-G. (2020). Bootstrapping max statistics in high dimensions: Near-parametric rates under weak variance decay and application to functional and multinomial data. Ann. Statist. 48 1214-1229. · Zbl 1464.62266 · doi:10.1214/19-AOS1844 · Project Euclid: euclid.aos/1590480052
[28] Nazarov, F. (2003). On the maximal perimeter of a convex set in \(\mathbb{R}^n\) with respect to a Gaussian measure. In Geometric Aspects of Functional Analysis. Lecture Notes in Math. 1807 169-187. Berlin: Springer. · Zbl 1036.52014
[29] Norvaiša, R. and Paulauskas, V. (1991). Rate of convergence in the central limit theorem for empirical processes. J. Theoret. Probab. 4 511-534. · Zbl 0734.60020 · doi:10.1007/BF01210322
[30] Paulauskas, V. and Rackauskas, A. (1989). Approximation Theory in the Central Limit Theorem: Exact Results in Banach Spaces. Mathematics and Its Applications (Soviet Series) 32. Dordrecht: Kluwer Academic. · Zbl 0715.60023
[31] Paulauskas, V. and Rackauskas, A. (1991). Nonuniform estimates in the central limit theorem in Banach spaces. Liet. Mat. Rink. 31 483-496. · Zbl 0769.60005
[32] Paulauskas, V. and Rackauskas, A. (2012). Approximation Theory in the Central Limit Theorem: Exact Results in Banach Spaces. Mathematics and Its Applications 32. Berlin: Springer. · Zbl 0715.60023
[33] Petrov, V.V. (1995). Limit Theorems of Probability Theory: Sequences of Independent Random Variables. Oxford Studies in Probability 4. New York: The Clarendon Press. · Zbl 0826.60001
[34] Saulis, L. and Statulevicius, V.A. (1991). Limit Theorems for Large Deviations. Mathematics and Its Applications (Soviet Series) 73. Dordrecht: Kluwer Academic. · Zbl 0744.60028
[35] Sazonov, V.V. (1981). Normal Approximation - Some Recent Advances. Lecture Notes in Math. 879. Berlin: Springer. · Zbl 0462.60006
[36] Sazonov, V.V. and Ul’yanov, V.V. (1982). On the accuracy of normal approximation. J. Multivariate Anal. 12 371-384. · Zbl 0499.60022 · doi:10.1016/0047-259X(82)90072-0
[37] Sun, Q. (2020). Gaussian approximations for maxima of random vectors under \((2+\iota)\)-th moments. Statist. Probab. Lett. 158 Art. ID 108523. · Zbl 1459.62025
[38] Talagrand, M. (2014). Upper and Lower Bounds for Stochastic Processes: Modern Methods and Classical Problems. Ergebnisse der Mathematik und Ihrer Grenzgebiete. 3. Folge. A Series of Modern Surveys in Mathematics [Results in Mathematics and Related Areas. 3rd Series. A Series of Modern Surveys in Mathematics] 60. Heidelberg: Springer. · Zbl 1293.60001
[39] van de Geer, S. and Wainwright, M.J. (2017). On concentration for (regularized) empirical risk minimization. Sankhya A 79 159-200. · Zbl 1380.62085 · doi:10.1007/s13171-017-0111-9
[40] van der Vaart, A. · Zbl 0862.60002