Variable selection, monotone likelihood ratio and group sparsity. (English) Zbl 07684014

Summary: In the pivotal variable selection problem, we derive the exact nonasymptotic minimax selector over the class of all \(s\)-sparse vectors, which is also the Bayes selector with respect to the uniform prior. While this optimal selector is, in general, not realizable in polynomial time, we show that its tractable counterpart (the scan selector) attains the minimax expected Hamming risk to within a factor of 2 and is exactly minimax with respect to the probability of wrong recovery. As a consequence, we establish explicit lower bounds under the monotone likelihood ratio property and obtain a tight characterization of the minimax risk in terms of the best separable selector risk. We apply these general results to derive necessary and sufficient conditions for exact and almost full recovery in the location model with light-tailed distributions and in the problem of group variable selection under Gaussian noise, as well as under more general anisotropic sub-Gaussian noise. Numerical results illustrate our theoretical findings.
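The setting above can be illustrated numerically. The sketch below, which does not reproduce the paper's exact procedures or tuning, simulates the sparse Gaussian location model and applies a simple separable (coordinate-wise) thresholding selector; the signal strength and the universal threshold \(\sigma\sqrt{2\log(d/s)}\) are illustrative choices, not the authors' calibration.

```python
import numpy as np

rng = np.random.default_rng(0)

d, s, sigma = 1000, 10, 1.0                    # dimension, sparsity, noise level
a = 2.0 * sigma * np.sqrt(2 * np.log(d / s))   # signal magnitude (illustrative choice)

# s-sparse mean vector with active entries of magnitude a
theta = np.zeros(d)
support = rng.choice(d, size=s, replace=False)
theta[support] = a

# Gaussian location model: y = theta + noise
y = theta + sigma * rng.standard_normal(d)

# Separable thresholding selector; this universal threshold is a common
# illustrative rule and not the paper's exact minimax tuning.
t = sigma * np.sqrt(2 * np.log(d / s))
selected = np.abs(y) > t

# Hamming risk: number of coordinates whose support status is misclassified
hamming_risk = int(np.sum(selected != (theta != 0)))
print(hamming_risk)
```

With the signal strength well above the threshold, the realized Hamming risk is typically a small fraction of the trivial risk \(s\), consistent with the almost-full-recovery regime discussed in the summary.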


62J05 Linear regression; mixed models
62J07 Ridge regression; shrinkage estimators (Lasso)
62C20 Minimax procedures in statistical decision theory
94A12 Signal theory (characterization, reconstruction, filtering, etc.)

