
High-dimensional asymptotics of likelihood ratio tests in the Gaussian sequence model under convex constraints. (English) Zbl 1485.60038

Summary: In the Gaussian sequence model \(Y=\mu +\xi \), we study the likelihood ratio test (LRT) for testing \({H_0}:\mu ={\mu_0}\) versus \({H_1}:\mu \in K\), where \({\mu_0}\in K\), and \(K\) is a closed convex set in \({\mathbb{R}^n}\). In particular, we show that under the null hypothesis, normal approximation holds for the log-likelihood ratio statistic for a general pair \(({\mu_0},K)\), in the high-dimensional regime where the estimation error of the associated least squares estimator diverges in an appropriate sense. The normal approximation further leads to a precise characterization of the power behavior of the LRT in the high-dimensional regime. These characterizations show that the power behavior of the LRT is in general nonuniform with respect to the Euclidean metric, and illustrate the conservative nature of existing minimax optimality and suboptimality results for the LRT. A variety of examples, including testing in the orthant/circular cone, isotonic regression, Lasso and testing parametric assumptions versus shape-constrained alternatives, are worked out to demonstrate the versatility of the developed theory.
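As an illustration of the null normal approximation described in the summary, the following is a minimal simulation sketch for one of the listed examples, the orthant cone. It is not taken from the paper. It assumes \({\mu_0}=0\), \(K\) equal to the nonnegative orthant in \({\mathbb{R}^n}\), and unit noise variance, and it takes the statistic to be \(\|Y-{\mu_0}\|^2-\|Y-\Pi_K(Y)\|^2\), which in this case reduces to \(\sum_i \max(Y_i,0)^2\). The function names are hypothetical, and the centering \(n/2\) and scaling \(\sqrt{5n/4}\) follow from elementary moments of a truncated standard normal, not from the paper's general theorem.

```python
import math
import random
import statistics


def orthant_lrt_statistic(y):
    """Statistic ||y - 0||^2 - ||y - Pi_K(y)||^2 for K = nonnegative orthant.

    The projection onto the orthant keeps the positive parts of y, so the
    statistic equals the squared norm of the positive part of y.
    Assumes mu_0 = 0 and unit noise variance (illustrative assumptions).
    """
    return sum(max(v, 0.0) ** 2 for v in y)


def standardize(t, n):
    """Center and scale the statistic using its exact null mean and variance.

    For xi ~ N(0, 1): E[max(xi, 0)^2] = 1/2 and E[max(xi, 0)^4] = 3/2,
    so each coordinate contributes variance 3/2 - 1/4 = 5/4.
    """
    return (t - n / 2.0) / math.sqrt(1.25 * n)


def simulate_null(n, reps, seed=0):
    """Draw `reps` standardized statistics under H0: mu = 0."""
    rng = random.Random(seed)
    out = []
    for _ in range(reps):
        y = [rng.gauss(0.0, 1.0) for _ in range(n)]
        out.append(standardize(orthant_lrt_statistic(y), n))
    return out


if __name__ == "__main__":
    z = simulate_null(n=400, reps=1000)
    # For large n the sample mean and standard deviation should be
    # close to 0 and 1, consistent with a normal limit.
    print(statistics.mean(z), statistics.pstdev(z))
```

In this special case the statistic is a sum of independent terms, so the normal limit is just the classical central limit theorem; the summary's claim concerns general pairs \(({\mu_0},K)\), where no such independent decomposition is available.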

MSC:

60F17 Functional limit theorems; invariance principles
62E17 Approximations to statistical distributions (nonasymptotic)

References:

[1] Amelunxen, D., Lotz, M., McCoy, M. B. and Tropp, J. A. (2014). Living on the edge: Phase transitions in convex programs with random data. Inf. Inference 3 224-294. · Zbl 1339.90251 · doi:10.1093/imaiai/iau005
[2] ARIAS-CASTRO, E., CANDÈS, E. J. and PLAN, Y. (2011). Global testing under sparse alternatives: ANOVA, multiple comparisons and the higher criticism. Ann. Statist. 39 2533-2556. · Zbl 1231.62136 · doi:10.1214/11-AOS910
[3] AZZALINI, A. and BOWMAN, A. (1993). On the use of nonparametric regression for checking linear relationships. J. Roy. Statist. Soc. Ser. B 55 549-557. · Zbl 0800.62222
[4] Baraud, Y. (2002). Non-asymptotic minimax rates of testing in signal detection. Bernoulli 8 577-606. · Zbl 1007.62042
[5] Baraud, Y., Huet, S. and Laurent, B. (2005). Testing convex hypotheses on the mean of a Gaussian vector. Application to testing qualitative hypotheses on a regression function. Ann. Statist. 33 214-257. · Zbl 1065.62109 · doi:10.1214/009053604000000896
[6] BARLOW, R. E., BARTHOLOMEW, D. J., BREMNER, J. M. and BRUNK, H. D. (1972). Statistical Inference Under Order Restrictions. The Theory and Application of Isotonic Regression. Wiley Series in Probability and Mathematical Statistics. Wiley, London-Sydney. · Zbl 0246.62038
[7] BARTHOLOMEW, D. J. (1959). A test of homogeneity for ordered alternatives. Biometrika 46 36-48. · Zbl 0087.14202 · doi:10.1093/biomet/46.1-2.36
[8] BARTHOLOMEW, D. J. (1959). A test of homogeneity for ordered alternatives. II. Biometrika 46 328-335. · Zbl 0090.36002 · doi:10.1093/biomet/46.3-4.328
[9] BARTHOLOMEW, D. J. (1961). Ordered tests in the analysis of variance. Biometrika 48 325-332. · Zbl 0103.36904 · doi:10.1093/biomet/48.3-4.325
[10] BARTHOLOMEW, D. J. (1961). A test of homogeneity of means under restricted alternatives. J. Roy. Statist. Soc. Ser. B 23 239-281. · Zbl 0209.50303
[11] BELLEC, P. C. and ZHANG, C.-H. (2021). Second order Stein: SURE for SURE and other applications in high-dimensional inference. Ann. Statist. (to appear). Available at arXiv:1811.04121. · Zbl 1486.62209
[12] BESSON, O. (2006). Adaptive detection of a signal whose signature belongs to a cone. In Fourth IEEE Workshop on Sensor Array and Multichannel Processing, 2006 409-413. IEEE, Los Alamitos.
[13] Boucheron, S., Lugosi, G. and Massart, P. (2013). Concentration Inequalities: A Nonasymptotic Theory of Independence. Oxford Univ. Press, Oxford. · Zbl 1337.60003 · doi:10.1093/acprof:oso/9780199535255.001.0001
[14] BOUSQUET, O. (2003). Concentration inequalities for sub-additive functions using the entropy method. In Stochastic Inequalities and Applications. Progress in Probability 56 213-247. Birkhäuser, Basel. · Zbl 1037.60015
[15] CARPENTIER, A. (2015). Testing the regularity of a smooth signal. Bernoulli 21 465-488. · Zbl 1320.94021 · doi:10.3150/13-BEJ575
[16] Carpentier, A., Collier, O., Comminges, L., Tsybakov, A. B. and Wang, Y. (2019). Minimax rate of testing in sparse linear regression. Autom. Remote Control 80 1817-1834. · Zbl 1456.62083
[17] CARPENTIER, A., COLLIER, O., COMMINGES, L., TSYBAKOV, A. B. and WANG, Y. (2020). Estimation of the \(\ell_2\)-norm and testing in sparse linear regression with unknown variance. arXiv preprint. Available at arXiv:2010.13679.
[18] CARPENTIER, A. and VERZELEN, N. (2021). Optimal sparsity testing in linear regression model. Bernoulli 27 727-750. · Zbl 1478.62191 · doi:10.3150/20-bej1224
[19] Chatterjee, A. and Lahiri, S. N. (2011). Bootstrapping lasso estimators. J. Amer. Statist. Assoc. 106 608-625. · Zbl 1232.62088 · doi:10.1198/jasa.2011.tm10159
[20] Chatterjee, S. (2009). Fluctuations of eigenvalues and second order Poincaré inequalities. Probab. Theory Related Fields 143 1-40. · Zbl 1152.60024 · doi:10.1007/s00440-007-0118-6
[21] Chatterjee, S. (2014). A new perspective on least squares under convex constraint. Ann. Statist. 42 2340-2381. · Zbl 1302.62053 · doi:10.1214/14-AOS1254
[22] Chatterjee, S., Guntuboyina, A. and Sen, B. (2015). On risk bounds in isotonic and other shape restricted regression problems. Ann. Statist. 43 1774-1800. · Zbl 1317.62032 · doi:10.1214/15-AOS1324
[23] CHERNOFF, H. (1954). On the distribution of the likelihood ratio. Ann. Math. Stat. 25 573-578. · Zbl 0056.37102 · doi:10.1214/aoms/1177728725
[24] CHRISTENSEN, R. and SUN, S. K. (2010). Alternative goodness-of-fit tests for linear models. J. Amer. Statist. Assoc. 105 291-301. · Zbl 1397.62248 · doi:10.1198/jasa.2009.tm08697
[25] Collier, O., Comminges, L. and Tsybakov, A. B. (2017). Minimax estimation of linear and quadratic functionals on sparsity classes. Ann. Statist. 45 923-958. · Zbl 1368.62191 · doi:10.1214/15-AOS1432
[26] COMMINGES, L. and DALALYAN, A. S. (2013). Minimax testing of a composite null hypothesis defined via a quadratic functional in the model of regression. Electron. J. Stat. 7 146-190. · Zbl 1337.62090 · doi:10.1214/13-EJS766
[27] COX, D., KOH, E., WAHBA, G. and YANDELL, B. S. (1988). Testing the (parametric) null model hypothesis in (semiparametric) partial and generalized spline models. Ann. Statist. 16 113-119. · Zbl 0673.62017 · doi:10.1214/aos/1176350693
[28] Donoho, D. and Jin, J. (2004). Higher criticism for detecting sparse heterogeneous mixtures. Ann. Statist. 32 962-994. · Zbl 1092.62051 · doi:10.1214/009053604000000265
[29] DUROT, C. (2007). On the \({\mathbb{L}_p}\)-error of monotonicity constrained estimators. Ann. Statist. 35 1080-1104. · Zbl 1129.62024 · doi:10.1214/009053606000001497
[30] DUROT, C. and TOCQUET, A.-S. (2001). Goodness of fit test for isotonic regression. ESAIM Probab. Stat. 5 119-140. · Zbl 0990.62041 · doi:10.1051/ps:2001105
[31] DYKSTRA, R. (1991). Asymptotic normality for chi-bar-square distributions. Canad. J. Statist. 19 297-306. · Zbl 0736.62016 · doi:10.2307/3315395
[32] EUBANK, R. L. and SPIEGELMAN, C. H. (1990). Testing the goodness of fit of a linear model via nonparametric regression techniques. J. Amer. Statist. Assoc. 85 387-392. · Zbl 0702.62037
[33] FAN, J. and HUANG, L.-S. (2001). Goodness-of-fit tests for parametric regression models. J. Amer. Statist. Assoc. 96 640-652. · Zbl 1017.62014 · doi:10.1198/016214501753168316
[34] Giné, E. and Nickl, R. (2016). Mathematical Foundations of Infinite-Dimensional Statistical Models. Cambridge Series in Statistical and Probabilistic Mathematics. Cambridge Univ. Press, New York. · Zbl 1358.62014 · doi:10.1017/CBO9781107337862
[35] GOLDSTEIN, L., NOURDIN, I. and PECCATI, G. (2017). Gaussian phase transitions and conic intrinsic volumes: Steining the Steiner formula. Ann. Appl. Probab. 27 1-47. · Zbl 1379.60011 · doi:10.1214/16-AAP1195
[36] GRECO, M., GINI, F. and FARINA, A. (2008). Radar detection and classification of jamming signals belonging to a cone class. IEEE Trans. Signal Process. 56 1984-1993. · Zbl 1390.94550 · doi:10.1109/TSP.2007.909326
[37] Groeneboom, P. (1985). Estimating a monotone density. In Proceedings of the Berkeley Conference in Honor of Jerzy Neyman and Jack Kiefer, Vol. II (Berkeley, Calif., 1983). Wadsworth Statist./Probab. Ser. 539-555. Wadsworth, Belmont, CA. · Zbl 1373.62144
[38] GROENEBOOM, P., HOOGHIEMSTRA, G. and LOPUHAÄ, H. P. (1999). Asymptotic normality of the \({L_1}\) error of the Grenander estimator. Ann. Statist. 27 1316-1347. · Zbl 1105.62342 · doi:10.1214/aos/1017938928
[39] GUERRE, E. and LAVERGNE, P. (2005). Data-driven rate-optimal specification testing in regression models. Ann. Statist. 33 840-870. · Zbl 1068.62055 · doi:10.1214/009053604000001200
[40] Guntuboyina, A. and Sen, B. (2018). Nonparametric shape-restricted regression. Statist. Sci. 33 568-594. · Zbl 1407.62135 · doi:10.1214/18-STS665
[41] HAN, Q., JIANG, T. and SHEN, Y. (2021). A general method for power analysis in testing high dimensional covariance matrices. arXiv preprint. Available at arXiv:2101.11086.
[42] HAN, Q., SEN, B. and SHEN, Y. (2022). Supplement to “High dimensional asymptotics of likelihood ratio tests in the Gaussian sequence model under convex constraints.” https://doi.org/10.1214/21-AOS2111SUPP
[43] Han, Q., Wang, T., Chatterjee, S. and Samworth, R. J. (2019). Isotonic regression in general dimensions. Ann. Statist. 47 2440-2471. · Zbl 1437.62124 · doi:10.1214/18-AOS1753
[44] HÄRDLE, W. and MAMMEN, E. (1993). Comparing nonparametric versus parametric regression fits. Ann. Statist. 21 1926-1947. · Zbl 0795.62036 · doi:10.1214/aos/1176349403
[45] Ingster, Y. I. and Suslina, I. A. (2003). Nonparametric Goodness-of-Fit Testing Under Gaussian Models. Lecture Notes in Statistics 169. Springer, New York. · Zbl 1013.62049 · doi:10.1007/978-0-387-21580-8
[46] Ingster, Y. I., Tsybakov, A. B. and Verzelen, N. (2010). Detection boundary in sparse regression. Electron. J. Stat. 4 1476-1526. · Zbl 1329.62314 · doi:10.1214/10-EJS589
[47] JUDITSKY, A. and NEMIROVSKI, A. (2002). On nonparametric tests of positivity/monotonicity/convexity. Ann. Statist. 30 498-527. · Zbl 1012.62048 · doi:10.1214/aos/1021379863
[48] KATO, K. (2009). On the degrees of freedom in shrinkage estimation. J. Multivariate Anal. 100 1338-1352. · Zbl 1162.62067 · doi:10.1016/j.jmva.2008.12.002
[49] KUDÔ, A. (1963). A multivariate analogue of the one-sided test. Biometrika 50 403-418. · Zbl 0121.13906 · doi:10.1093/biomet/50.3-4.403
[50] KUDÔ, A. and CHOI, J. R. (1975). A generalized multivariate analogue of the one sided test. Mem. Fac. Sci., Kyushu Univ., Ser. A 29 303-328. · Zbl 0329.62041 · doi:10.2206/kyushumfs.29.303
[51] KUR, G., GAO, F., GUNTUBOYINA, A. and SEN, B. (2020). Convex regression in multidimensions: Suboptimality of least squares estimators. arXiv preprint. Available at arXiv:2006.02044.
[52] MCCOY, M. B. and TROPP, J. A. (2014). From Steiner formulas for cones to concentration of intrinsic volumes. Discrete Comput. Geom. 51 926-963. · Zbl 1317.52010 · doi:10.1007/s00454-014-9595-4
[53] MENÉNDEZ, J. A., RUEDA, C. and SALVADOR, B. (1992). Dominance of likelihood ratio tests under cone constraints. Ann. Statist. 20 2087-2099. · Zbl 0774.62057 · doi:10.1214/aos/1176348904
[54] MENÉNDEZ, J. A., RUEDA, C. and SALVADOR, B. (1992). Testing nonoblique hypotheses. Comm. Statist. Theory Methods 21 471-484. · Zbl 0800.62288 · doi:10.1080/03610929208830789
[55] MENÉNDEZ, J. A. and SALVADOR, B. (1991). Anomalies of the likelihood ratio tests for testing restricted hypotheses. Ann. Statist. 19 889-898. · Zbl 0734.62058 · doi:10.1214/aos/1176348126
[56] Meyer, M. and Woodroofe, M. (2000). On the degrees of freedom in shape-restricted regression. Ann. Statist. 28 1083-1104. · Zbl 1105.62340 · doi:10.1214/aos/1015956708
[57] MEYER, M. C. (2003). A test for linear versus convex regression function using shape-restricted regression. Biometrika 90 223-232. · Zbl 1034.62057 · doi:10.1093/biomet/90.1.223
[58] MUKHERJEE, R. and SEN, S. (2020). On minimax exponents of sparse testing. arXiv preprint. Available at arXiv:2003.00570.
[59] NEUMEYER, N. and VAN KEILEGOM, I. (2010). Estimating the error distribution in nonparametric multiple regression with applications to model testing. J. Multivariate Anal. 101 1067-1078. · Zbl 1185.62078 · doi:10.1016/j.jmva.2010.01.007
[60] Nickl, R. and van de Geer, S. (2013). Confidence sets in sparse regression. Ann. Statist. 41 2852-2876. · Zbl 1288.62108 · doi:10.1214/13-AOS1170
[61] NOURDIN, I. and PECCATI, G. (2012). Normal Approximations with Malliavin Calculus: From Stein’s Method to Universality. Cambridge Tracts in Mathematics 192. Cambridge Univ. Press, Cambridge. · Zbl 1266.60001 · doi:10.1017/CBO9781139084659
[62] RAUBERTAS, R. F., LEE, C.-I. C. and NORDHEIM, E. V. (1986). Hypothesis tests for normal means constrained by linear inequalities. Comm. Statist. Theory Methods 15 2809-2833. · Zbl 0607.62058 · doi:10.1080/03610928608829280
[63] ROBERTSON, T. and WEGMAN, E. J. (1978). Likelihood ratio tests for order restrictions in exponential families. Ann. Statist. 6 485-505. · Zbl 0391.62016
[64] Robertson, T., Wright, F. T. and Dykstra, R. L. (1988). Order Restricted Statistical Inference. Wiley Series in Probability and Mathematical Statistics: Probability and Mathematical Statistics. Wiley, Chichester. · Zbl 0645.62028
[65] ROCKAFELLAR, R. T. (1997). Convex Analysis. Princeton Landmarks in Mathematics. Princeton Univ. Press, Princeton, NJ. Reprint of the 1970 original, Princeton Paperbacks. · Zbl 0932.90001
[66] SEN, B. and MEYER, M. (2017). Testing against a linear regression model using ideas from shape-restricted estimation. J. R. Stat. Soc. Ser. B. Stat. Methodol. 79 423-448. · Zbl 1414.62148 · doi:10.1111/rssb.12178
[67] Shah, R. D. and Bühlmann, P. (2018). Goodness-of-fit tests for high dimensional linear models. J. R. Stat. Soc. Ser. B. Stat. Methodol. 80 113-135. · Zbl 06840459 · doi:10.1111/rssb.12234
[68] SHAPIRO, A. (1985). Asymptotic distribution of test statistics in the analysis of moment structures under inequality constraints. Biometrika 72 133-144. · Zbl 0596.62019 · doi:10.1093/biomet/72.1.133
[69] SHAPIRO, A. (1988). Towards a unified theory of inequality constrained testing in multivariate analysis. Int. Stat. Rev. 56 49-62. · Zbl 0661.62042 · doi:10.2307/1403361
[70] Stute, W. (1997). Nonparametric model checks for regression. Ann. Statist. 25 613-641. · Zbl 0926.62035 · doi:10.1214/aos/1031833666
[71] Tibshirani, R. (1996). Regression shrinkage and selection via the lasso. J. Roy. Statist. Soc. Ser. B 58 267-288. · Zbl 0850.62538
[72] Tibshirani, R. J. and Taylor, J. (2012). Degrees of freedom in Lasso problems. Ann. Statist. 40 1198-1232. · Zbl 1274.62469 · doi:10.1214/12-AOS1003
[73] van der Vaart, A. (2002). Semiparametric statistics. In Lectures on Probability Theory and Statistics (Saint-Flour, 1999). Lecture Notes in Math. 1781 331-457. Springer, Berlin. · Zbl 1013.62031
[74] van der Vaart, A. W. (1998). Asymptotic Statistics. Cambridge Series in Statistical and Probabilistic Mathematics 3. Cambridge Univ. Press, Cambridge. · Zbl 0943.62002 · doi:10.1017/CBO9780511802256
[75] Verzelen, N. (2012). Minimax risks for sparse regressions: Ultra-high dimensional phenomenons. Electron. J. Stat. 6 38-90. · Zbl 1334.62120 · doi:10.1214/12-EJS666
[76] VERZELEN, N. and VILLERS, F. (2010). Goodness-of-fit tests for high-dimensional Gaussian linear models. Ann. Statist. 38 704-752. · Zbl 1183.62074 · doi:10.1214/08-AOS629
[77] WARRACK, G. and ROBERTSON, T. (1984). A likelihood ratio test regarding two nested but oblique order-restricted hypotheses. J. Amer. Statist. Assoc. 79 881-886. · Zbl 0549.62021
[78] WEI, Y., WAINWRIGHT, M. J. and GUNTUBOYINA, A. (2019). The geometry of hypothesis testing over convex cones: Generalized likelihood ratio tests and minimax radii. Ann. Statist. 47 994-1024. · Zbl 1415.62006 · doi:10.1214/18-AOS1701
[79] Zou, H., Hastie, T. and Tibshirani, R. (2007). On the “degrees of freedom” of the Lasso. Ann. Statist. 35 2173-2192. · Zbl 1126.62061 · doi:10.1214/009053607000000127
This reference list is based on information provided by the publisher or from digital mathematics libraries. Its items are heuristically matched to zbMATH identifiers and may contain data conversion errors. In some cases that data have been complemented/enhanced by data from zbMATH Open. This attempts to reflect the references listed in the original paper as accurately as possible without claiming completeness or a perfect matching.