Needles and straw in a haystack: posterior concentration for possibly sparse sequences. (English) Zbl 1257.62025

Summary: We consider full Bayesian inference in the multivariate normal mean model when the mean vector is sparse. The prior distribution on the vector of means is constructed hierarchically: first a collection of nonzero means is chosen, and next a prior is placed on the nonzero values. We study the posterior distribution in the frequentist set-up in which the observations are generated according to a fixed mean vector, and are interested in the posterior distribution of the number of nonzero components and in the contraction of the posterior distribution to the true mean vector. We find various combinations of priors on the number of nonzero coefficients and on these coefficients that give desirable performance. We also find priors that give suboptimal convergence, for instance, Gaussian priors on the nonzero coefficients. We illustrate the results by simulations.
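The model described above can be illustrated with a minimal numerical sketch. The paper's prior is hierarchical over the number of nonzero coordinates; the simplification below instead uses independent spike-and-slab weights with an oracle inclusion probability alpha = s/n, a Laplace slab for the nonzero values (a heavy-tailed choice of the kind the summary contrasts with Gaussian slabs), and illustrative values n = 200, s = 10, signal height 5. All names and numbers are assumptions for demonstration, not the paper's construction.

```python
import numpy as np

rng = np.random.default_rng(0)

# Sparse normal-means data: a few "needles" among zeros, plus N(0,1) noise.
n, s = 200, 10                       # illustrative dimension and sparsity
theta = np.zeros(n)
theta[:s] = 5.0                      # the nonzero means
x = theta + rng.standard_normal(n)   # X_i = theta_i + eps_i

def phi(z):
    """Standard normal density."""
    return np.exp(-0.5 * z ** 2) / np.sqrt(2.0 * np.pi)

def slab(t):
    """Laplace(0, 1) density used as the slab on nonzero means."""
    return 0.5 * np.exp(-np.abs(t))

# Marginal density of an observation under the slab, (slab * phi)(x),
# approximated by a Riemann sum on a fine grid.
grid = np.linspace(-30.0, 30.0, 4001)
dx = grid[1] - grid[0]

def slab_marginal(xi):
    return np.sum(slab(grid) * phi(xi - grid)) * dx

# Coordinatewise posterior inclusion probabilities under independent
# Bernoulli(alpha) spike-and-slab weights (oracle alpha = s / n).
alpha = s / n
m = np.array([slab_marginal(xi) for xi in x])
post_incl = alpha * m / (alpha * m + (1.0 - alpha) * phi(x))

print(post_incl[:s].mean())   # high for the true needles
print(post_incl[s:].mean())   # low for the zero coordinates
```

With a large signal height the posterior inclusion probabilities separate sharply, mirroring the contraction phenomenon the summary describes for well-chosen priors.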


62F15 Bayesian inference
62H10 Multivariate distribution of statistics
62G05 Nonparametric estimation
65C60 Computational problems in statistics (MSC2010)
62G20 Asymptotic properties of nonparametric inference




[1] Abramovich, F., Benjamini, Y., Donoho, D. L. and Johnstone, I. M. (2006). Adapting to unknown sparsity by controlling the false discovery rate. Ann. Statist. 34 584-653. · Zbl 1092.62005 · doi:10.1214/009053606000000074
[2] Abramovich, F., Grinshtein, V. and Pensky, M. (2007). On optimality of Bayesian testimation in the normal means problem. Ann. Statist. 35 2261-2286. · Zbl 1126.62003 · doi:10.1214/009053607000000226
[3] Bickel, P. J., Ritov, Y. and Tsybakov, A. B. (2009). Simultaneous analysis of lasso and Dantzig selector. Ann. Statist. 37 1705-1732. · Zbl 1173.62022 · doi:10.1214/08-AOS620
[4] Birgé, L. and Massart, P. (2001). Gaussian model selection. J. Eur. Math. Soc. (JEMS) 3 203-268. · Zbl 1037.62001 · doi:10.1007/s100970100031
[5] Brown, L. D. and Greenshtein, E. (2009). Nonparametric empirical Bayes and compound decision approaches to estimation of a high-dimensional vector of normal means. Ann. Statist. 37 1685-1704. · Zbl 1166.62005 · doi:10.1214/08-AOS630
[6] Cai, T. T., Jin, J. and Low, M. G. (2007). Estimation and confidence sets for sparse normal mixtures. Ann. Statist. 35 2421-2449. · Zbl 1360.62113 · doi:10.1214/009053607000000334
[7] Candès, E. and Tao, T. (2007). The Dantzig selector: Statistical estimation when \(p\) is much larger than \(n\). Ann. Statist. 35 2313-2351. · Zbl 1139.62019 · doi:10.1214/009053606000001523
[8] Castillo, I. (2008). Lower bounds for posterior rates with Gaussian process priors. Electron. J. Stat. 2 1281-1299. · Zbl 1320.62067 · doi:10.1214/08-EJS273
[9] Castillo, I. and van der Vaart, A. W. (2012). Supplement to “Needles and straw in a haystack: Posterior concentration for possibly sparse sequences.” · Zbl 1257.62025
[10] Donoho, D. L. and Johnstone, I. M. (1994). Minimax risk over \(l_p\)-balls for \(l_q\)-error. Probab. Theory Related Fields 99 277-303. · Zbl 0802.62006 · doi:10.1007/BF01199026
[11] Donoho, D. L., Johnstone, I. M., Hoch, J. C. and Stern, A. S. (1992). Maximum entropy and the nearly black object. J. R. Stat. Soc. Ser. B Stat. Methodol. 54 41-81. With discussion and a reply by the authors. · Zbl 0788.62103
[12] George, E. I. and Foster, D. P. (2000). Calibration and empirical Bayes variable selection. Biometrika 87 731-747. · Zbl 1029.62008 · doi:10.1093/biomet/87.4.731
[13] Golubev, G. K. (2002). Reconstruction of sparse vectors in white Gaussian noise. Problemy Peredachi Informatsii 38 75-91. · Zbl 1024.62003 · doi:10.1023/A:1020098307781
[14] Huang, J., Ma, S. and Zhang, C.-H. (2008). Adaptive Lasso for sparse high-dimensional regression models. Statist. Sinica 18 1603-1618. · Zbl 1255.62198
[15] Jiang, W. and Zhang, C.-H. (2009). General maximum likelihood empirical Bayes estimation of normal means. Ann. Statist. 37 1647-1684. · Zbl 1168.62005 · doi:10.1214/08-AOS638
[16] Johnstone, I. M. and Silverman, B. W. (2004). Needles and straw in haystacks: Empirical Bayes estimates of possibly sparse sequences. Ann. Statist. 32 1594-1649. · Zbl 1047.62008 · doi:10.1214/009053604000000030
[17] Johnstone, I. M. and Silverman, B. W. (2005). Empirical Bayes selection of wavelet thresholds. Ann. Statist. 33 1700-1752. · Zbl 1078.62005 · doi:10.1214/009053605000000345
[18] Scott, J. G. and Berger, J. O. (2010). Bayes and empirical-Bayes multiplicity adjustment in the variable-selection problem. Ann. Statist. 38 2587-2619. · Zbl 1200.62020 · doi:10.1214/10-AOS792
[19] Yuan, M. and Lin, Y. (2005). Efficient empirical Bayes variable selection and estimation in linear models. J. Amer. Statist. Assoc. 100 1215-1225. · Zbl 1117.62453 · doi:10.1198/016214505000000367
[20] Zhang, C.-H. (2005). General empirical Bayes wavelet methods and exactly adaptive minimax estimation. Ann. Statist. 33 54-100. · Zbl 1064.62009 · doi:10.1214/009053604000000995
[21] Zhang, C.-H. and Huang, J. (2008). The sparsity and bias of the LASSO selection in high-dimensional linear regression. Ann. Statist. 36 1567-1594. · Zbl 1142.62044 · doi:10.1214/07-AOS520
[22] Zou, H. (2006). The adaptive lasso and its oracle properties. J. Amer. Statist. Assoc. 101 1418-1429. · Zbl 1171.62326 · doi:10.1198/016214506000000735