Asymptotically minimax empirical Bayes estimation of a sparse normal mean vector. (English) Zbl 1302.62015

Summary: For the important classical problem of inference on a sparse high-dimensional normal mean vector, we propose a novel empirical Bayes model that admits a posterior distribution with desirable properties under mild conditions. In particular, our empirical Bayes posterior distribution concentrates on balls, centered at the true mean vector, with squared radius proportional to the minimax rate, and its posterior mean is an asymptotically minimax estimator. We also show that, asymptotically, the support of our empirical Bayes posterior has roughly the same effective dimension as the true sparse mean vector. Simulation from our empirical Bayes posterior is straightforward, and our numerical results demonstrate the quality of our method compared to others having similar large-sample properties.
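
In the standard notation for this problem (an assumption here, since the summary itself fixes no symbols), one observes \(X_i = \theta_i + \varepsilon_i\), \(i = 1, \dots, n\), with independent standard normal errors \(\varepsilon_i\), and the true mean vector \(\theta^\star\) has \(s_n = o(n)\) nonzero entries. The minimax rate under squared \(\ell_2\) error is then of order \(s_n \log(n/s_n)\), and a concentration statement of the kind described would typically take the following form, where \(\Pi^n\) denotes the posterior distribution and \(M\) is a sufficiently large constant (both symbols are illustrative, not taken from the paper):
\[
E_{\theta^\star}\, \Pi^n\bigl(\{\theta : \|\theta - \theta^\star\|^2 > M\, s_n \log(n/s_n)\}\bigr) \to 0, \qquad n \to \infty.
\]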


62C12 Empirical decision procedures; empirical Bayes procedures
62C20 Minimax procedures in statistical decision theory
62F12 Asymptotic properties of parametric estimators



