zbMATH — the first resource for mathematics

Hierarchical Bayes, maximum a posteriori estimators, and minimax concave penalized likelihood estimation. (English) Zbl 1337.62172
Summary: Priors constructed from scale mixtures of normal distributions have long played an important role in decision theory and shrinkage estimation. This paper demonstrates equivalence between the maximum a posteriori estimator constructed under one such prior and Zhang’s minimax concave penalization estimator. This equivalence and related multivariate generalizations stem directly from an intriguing representation of the minimax concave penalty function as the Moreau envelope of a simple convex function. Maximum a posteriori estimation under the corresponding marginal prior distribution, a generalization of the quasi-Cauchy distribution proposed by Johnstone and Silverman, leads to thresholding estimators having excellent frequentist risk properties.
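
The objects named in the summary can be checked numerically. The sketch below is illustrative only and is not taken from the paper: it assumes the standard closed form of Zhang's penalty, with hypothetical function names, grids, and parameter values (lam, gamma). It checks two things by brute force. First, for fixed t the penalty equals min over u of |t||u| + (gamma/2)(lam - u)^2, that is, a Moreau envelope, taken in the argument lam, of the convex function u -> |t||u|. This is one reading consistent with the summary, and the paper's own parametrization may differ. Second, in the scalar problem with gamma > 1, the penalized least-squares minimizer is the familiar firm-thresholding rule.

```python
import numpy as np


def mcp_penalty(t, lam, gamma):
    """Minimax concave penalty in its usual closed form (assumed here)."""
    a = np.abs(t)
    return np.where(a <= gamma * lam,
                    lam * a - a ** 2 / (2.0 * gamma),
                    0.5 * gamma * lam ** 2)


def mcp_via_envelope(t, lam, gamma, u_grid):
    """Brute-force min over u of |t||u| + (gamma/2)(lam - u)^2."""
    a = abs(t)
    return np.min(a * np.abs(u_grid) + 0.5 * gamma * (lam - u_grid) ** 2)


def firm_threshold(z, lam, gamma):
    """Closed-form minimizer of 0.5*(z - t)^2 + mcp_penalty(t), gamma > 1."""
    a = np.abs(z)
    shrunk = np.sign(z) * (a - lam) / (1.0 - 1.0 / gamma)
    return np.where(a <= lam, 0.0, np.where(a <= gamma * lam, shrunk, z))


def brute_force_minimizer(z, lam, gamma, t_grid):
    """Grid search for the minimizer of the scalar penalized objective."""
    objective = 0.5 * (z - t_grid) ** 2 + mcp_penalty(t_grid, lam, gamma)
    return t_grid[np.argmin(objective)]


# Illustrative parameter values (assumptions, not from the paper).
lam, gamma = 1.0, 3.0
u_grid = np.linspace(-2.0, 4.0, 60001)
t_grid = np.linspace(-6.0, 6.0, 120001)
points = [-5.0, -2.0, -0.5, 0.0, 0.5, 2.0, 3.0, 5.0]

# Largest gap between the closed form and the envelope representation.
envelope_gap = max(
    abs(float(mcp_penalty(t, lam, gamma)) - mcp_via_envelope(t, lam, gamma, u_grid))
    for t in points
)

# Largest gap between firm thresholding and the grid-search minimizer.
threshold_gap = max(
    abs(float(firm_threshold(z, lam, gamma)) - brute_force_minimizer(z, lam, gamma, t_grid))
    for z in points
)
```

Both gaps should be at the level of the grid resolution. Taking gamma > 1 keeps the scalar objective strictly convex, so the grid search has a unique minimizer to compare against.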

MSC:
62J07 Ridge regression; shrinkage estimators (Lasso)
62C20 Minimax procedures in statistical decision theory
65C60 Computational problems in statistics (MSC2010)
62F15 Bayesian inference
References:
[1] Abramowitz, M. and Stegun, I. (1970). Handbook of mathematical functions. Dover Publications Inc., New York. · Zbl 0171.38503
[2] Antoniadis, A. and Fan, J. (2001). Regularization of Wavelet Approximations. J. Am. Statist. Assoc. 96 939-955. · Zbl 1072.62561
[3] Armagan, A., Dunson, D. and Lee, J. (2011). Generalized double Pareto shrinkage. ArXiv e-prints. · Zbl 1259.62061
[4] Baricz, A. (2008). Mills’ ratio: Monotonicity patterns and functional inequalities. J. Math. Anal. Applic. 340 1362-1370. · Zbl 1138.60022
[5] Berger, J. O. and Robert, C. (1990). Subjective hierarchical Bayes estimation of a multivariate normal mean: on the frequentist interface. Ann. Statist. 18 617-651. · Zbl 0719.62043
[6] Berger, J. O. and Strawderman, W. E. (1996). Choice of hierarchical priors: admissibility in estimation of normal means. Ann. Statist. 24 931-951. · Zbl 0865.62004
[7] Berger, J. O., Strawderman, W. E. and Tang, D. (2005). Posterior Propriety and Admissibility of Hyperpriors in Normal Hierarchical Models. Ann. Statist. 33 606-646. · Zbl 1068.62005
[8] Box, G. E. P. and Tiao, G. C. (1992). Bayesian Inference in Statistical Analysis (1973 ed., Wiley Classics Library). John Wiley and Sons, New York. · Zbl 0271.62044
[9] Breheny, P. and Huang, J. (2009). Penalized methods for bi-level variable selection. Stat. Interface 2 369-380. · Zbl 1245.62034
[10] Breheny, P. and Huang, J. (2011). Coordinate descent algorithms for nonconvex penalized regression, with applications to biological feature selection. Ann. Appl. Stat. 5 232-253. · Zbl 1220.62095
[11] Bruce, A. G. and Gao, H. Y. (1996). Applied Wavelet Analysis with S-Plus. Springer, New York. · Zbl 0857.65147
[12] Carvalho, C. M., Polson, N. G. and Scott, J. G. (2010). The horseshoe estimator for sparse signals. Biometrika 97 465-480. · Zbl 1406.62021
[13] Chen, M.-H., Ibrahim, J. G. and Shao, Q.-M. (2006). Posterior Propriety and Computation for the Cox Regression Model with Applications to Missing Covariates. Biometrika 93 791-807. · Zbl 1436.62476
[14] Chen, M.-H. and Shao, Q.-M. (2001). Propriety of Posterior Distribution for Dichotomous Quantal Response Models. Proceedings of the American Mathematical Society 129 293-302. · Zbl 1008.62027
[15] Cox, D. R. (1972). Regression Models and Life-Tables. Journal of the Royal Statistical Society. Series B (Methodological) 34 187-220. · Zbl 0243.62041
[16] Fan, J. and Li, R. (2001). Variable Selection via Nonconcave Penalized Likelihood and its Oracle Properties. J. Am. Statist. Assoc. 96 1348-1360. · Zbl 1073.62547
[17] Fourdrinier, D., Strawderman, W. E. and Wells, M. T. (1998). On the construction of Bayes minimax estimators. Ann. Statist. 26 660-671. · Zbl 0929.62004
[18] Gao, H. and Bruce, A. G. (1997). Waveshrink with firm shrinkage. Statist. Sinica 7 855-874. · Zbl 1067.62529
[19] Gomez-Sanchez-Manzano, E., Gomez-Villegas, M. A. and Marin, J. M. (2008). Multivariate exponential power distributions as mixtures of normal distributions with Bayesian applications. Comm. Stat. Thry. Meth. 37 972-985. · Zbl 1135.62041
[20] Griffin, J. E. and Brown, P. J. (2007). Bayesian adaptive Lassos with non-convex penalization. Technical Report, Dept. of Statistics, University of Warwick. · Zbl 1335.62047
[21] Griffin, J. E. and Brown, P. J. (2010). Inference with normal-gamma prior distributions in regression problems. Bayesian Analysis 6 171-188. · Zbl 1330.62128
[22] Hans, C. (2009). Bayesian Lasso regression. Biometrika 96 835-845. · Zbl 1179.62038
[23] Johnstone, I. M. and Silverman, B. W. (2004). Needles and straw in haystacks: empirical Bayes estimates of possibly sparse sequences. Ann. Statist. 32 1594-1649. · Zbl 1047.62008
[24] Kass, R. E. and Wasserman, L. (1996). The Selection of Prior Distributions by Formal Rules. Journal of the American Statistical Association 91 1343-1370. · Zbl 0884.62007
[25] Mazumder, R., Friedman, J. H. and Hastie, T. (2011). SparseNet: Coordinate Descent With Nonconvex Penalties. Journal of the American Statistical Association 106 1125-1138. · Zbl 1229.62091
[26] Park, T. and Casella, G. (2008). The Bayesian Lasso. J. Am. Statist. Assoc. 103 681-686. · Zbl 1330.62292
[27] Polson, N. G. and Scott, J. G. (2011). Shrink Globally, Act Locally: Sparse Bayesian Regularization and Prediction (with discussion). In Bayesian Statistics 9 (J. M. Bernardo, M. J. Bayarri, J. O. Berger, A. P. Dawid, D. Heckerman and M. West, eds.) 501-525. Oxford University Press.
[28] Robert, C. P. (2007). The Bayesian Choice. Springer-Verlag, New York. · Zbl 1129.62003
[29] Rockafellar, R. T. and Wets, R. J. B. (2004). Variational Analysis. Springer-Verlag, Berlin. · Zbl 0888.49001
[30] Sampford, M. R. (1953). Some Inequalities on Mill’s Ratio and Related Functions. Ann. Math. Statist. 24 130-132. · Zbl 0050.13503
[31] Schifano, E. D. (2010). Topics in Penalized Estimation. PhD thesis, Cornell University.
[32] Schifano, E. D., Strawderman, R. L. and Wells, M. T. (2010). Majorization-minimization algorithms for nonsmoothly penalized objective functions. Electron. J. Stat. 4 1258-1299. · Zbl 1267.65009
[33] Strawderman, W. E. (1971). Proper Bayes minimax estimators of the multivariate normal mean. Ann. Math. Statist. 42 385-388. · Zbl 0222.62006
[34] Strawderman, R. L. and Wells, M. T. (2012). On Hierarchical Prior Specifications and Penalized Likelihood. In Contemporary Developments in Bayesian Analysis and Statistical Decision Theory: A Festschrift for William E. Strawderman (D. Fourdrinier, E. Marchand and A. Rukhin, eds.) 8 154-180. Institute of Mathematical Statistics, Hayward, CA. · Zbl 1326.62159
[35] Takada, Y. (1979). Stein’s positive part estimator and Bayes estimator. Ann. Inst. Statist. Math. 31 177-183. · Zbl 0447.62010
[36] Tibshirani, R. (1996). Regression Shrinkage and Selection via the Lasso. J. R. Statist. Soc. B 58 267-288. · Zbl 0850.62538
[37] Tipping, M. E. (2001). Sparse Bayesian learning and the relevance vector machine. J. Mach. Learn. Res. 1 211-244. · Zbl 0997.68109
[38] Yuan, M. and Lin, Y. (2006). Model selection and estimation in regression with grouped variables. J. R. Statist. Soc. B 68 49-67. · Zbl 1141.62030
[39] Zhang, C.-H. (2010). Nearly Unbiased Variable Selection Under Minimax Concave Penalty. Ann. Statist. 38 894-942. · Zbl 1183.62120
[40] Zlobec, S. (2003). Estimating convexifiers in continuous optimization. Math. Comm. 8 129-137. · Zbl 1053.90123
[41] Zou, H. and Li, R. (2008). One-step sparse estimates in nonconcave penalized likelihood models. Ann. Statist. 36 1509-1533. · Zbl 1142.62027
This reference list is based on information provided by the publisher or from digital mathematics libraries. Its items are heuristically matched to zbMATH identifiers and may contain data conversion errors. It attempts to reflect the references listed in the original paper as accurately as possible without claiming the completeness or perfect precision of the matching.