
MAP model selection in Gaussian regression. (English) Zbl 1329.62051

Summary: We consider a Bayesian approach to model selection in Gaussian linear regression, where the number of predictors may be much larger than the number of observations. From a frequentist viewpoint, the proposed procedure amounts to penalized least squares estimation with a complexity penalty associated with a prior on the model size. We investigate the optimality properties of the resulting model selector: we establish an oracle inequality and specify conditions on the prior under which the selector is asymptotically minimax within a wide range of sparse and dense settings, for both “nearly orthogonal” and “multicollinear” designs.
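
The frequentist formulation above amounts to choosing the predictor subset that minimizes a residual sum of squares plus a complexity penalty determined by the prior on the model size. The following minimal Python sketch (not from the paper) illustrates such a penalized least squares selector by exhaustive subset search; the helper names and the particular penalty pen(k) are illustrative assumptions, not the penalty derived in the paper.

```python
# Minimal sketch of complexity-penalized least squares model selection.
# Assumption: pen(k) below is an illustrative "2k log(ep/k)"-type penalty;
# in the paper the penalty is tied to a prior pi(k) on the model size.
import itertools
import numpy as np

def rss(y, X, subset):
    """Residual sum of squares of the least squares fit on the given columns."""
    if len(subset) == 0:
        return float(np.sum(y ** 2))
    Xs = X[:, list(subset)]
    beta, *_ = np.linalg.lstsq(Xs, y, rcond=None)
    return float(np.sum((y - Xs @ beta) ** 2))

def map_model_select(y, X, pen, kmax):
    """Exhaustive search over predictor subsets of size <= kmax,
    minimizing RSS(subset) + pen(|subset|)."""
    p = X.shape[1]
    best, best_crit = (), rss(y, X, ()) + pen(0)
    for k in range(1, kmax + 1):
        for subset in itertools.combinations(range(p), k):
            crit = rss(y, X, subset) + pen(k)
            if crit < best_crit:
                best, best_crit = subset, crit
    return best, best_crit

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    n, p, sigma2 = 50, 10, 1.0
    X = rng.standard_normal((n, p))
    beta = np.zeros(p)
    beta[:2] = [3.0, -2.0]
    y = X @ beta + np.sqrt(sigma2) * rng.standard_normal(n)
    # Illustrative complexity penalty (an assumption, not the paper's choice).
    pen = lambda k: 2.0 * sigma2 * k * np.log(np.e * p / max(k, 1))
    print(map_model_select(y, X, pen, kmax=3))
```

Exhaustive search is used here only to make the criterion explicit; for large p the combinatorial search is infeasible and one would resort to the approximations discussed in the model selection literature cited below.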

MSC:

62C99 Statistical decision theory
62C10 Bayesian problems; characterization of Bayes procedures
62C20 Minimax procedures in statistical decision theory
62G05 Nonparametric estimation

References:

[1] Abramovich, F., Benjamini, Y., Donoho, D.L. and Johnstone, I.M. (2006). Adapting to unknown sparsity by controlling the false discovery rate. Ann. Statist. 34, 584-653. · Zbl 1092.62005 · doi:10.1214/009053606000000074
[2] Abramovich, F., Grinshtein, V. and Pensky, M. (2007). On optimality of Bayesian testimation in the normal means problem. Ann. Statist. 35, 2261-2286. · Zbl 1126.62003 · doi:10.1214/009053607000000226
[3] Abramovich, F., Grinshtein, V., Petsa, A. and Sapatinas, T. (2010). On Bayesian testimation and its application to wavelet thresholding. Biometrika 97, 181-198. · Zbl 1183.62042 · doi:10.1093/biomet/asp080
[4] Akaike, H. (1973). Information theory and an extension of the maximum likelihood principle. In Second International Symposium on Information Theory (eds. B.N. Petrov and F. Csáki). Akadémiai Kiadó, Budapest, 267-281. · Zbl 0283.62006
[5] Bickel, P., Ritov, Y. and Tsybakov, A. (2009). Simultaneous analysis of Lasso and Dantzig selector. Ann. Statist. 37, 1705-1732. · Zbl 1173.62022 · doi:10.1214/08-AOS620
[6] Birgé, L. and Massart, P. (2001). Gaussian model selection. J. Eur. Math. Soc. 3, 203-268. · Zbl 1037.62001 · doi:10.1007/s100970100031
[7] Birgé, L. and Massart, P. (2007). Minimal penalties for Gaussian model selection. Probab. Theory Relat. Fields 138, 33-73. · Zbl 1112.62082 · doi:10.1007/s00440-006-0011-8
[8] Bunea, F., Tsybakov, A. and Wegkamp, M.H. (2007). Aggregation for Gaussian regression. Ann. Statist. 35, 1674-1697. · Zbl 1209.62065 · doi:10.1214/009053606000001587
[9] Candès, E.J. (2006). Modern statistical estimation via oracle inequalities. Acta Numerica 15, 257-325. · Zbl 1141.62001 · doi:10.1017/S0962492906230010
[10] Candès, E.J. and Tao, T. (2007). The Dantzig selector: statistical estimation when p is much larger than n. Ann. Statist. 35, 2313-2351. · Zbl 1139.62019 · doi:10.1214/009053606000001523
[11] Chipman, H., George, E.I. and McCulloch, R.E. (2001). The Practical Implementation of Bayesian Model Selection. IMS Lecture Notes - Monograph Series 38. · doi:10.1214/lnms/1215540964
[12] Donoho, D.L. and Johnstone, I.M. (1994). Ideal spatial adaptation via wavelet shrinkage. Biometrika 81, 425-455. · Zbl 0815.62019 · doi:10.1093/biomet/81.3.425
[13] Donoho, D.L. and Johnstone, I.M. (1995). Empirical atomic decomposition. Unpublished manuscript.
[14] Foster, D.P. and George, E.I. (1994). The risk inflation criterion for multiple regression. Ann. Statist. 22, 1947-1975. · Zbl 0829.62066 · doi:10.1214/aos/1176325766
[15] George, E.I. and McCulloch, R.E. (1993). Variable selection via Gibbs sampling. J. Am. Statist. Assoc. 88, 881-889.
[16] George, E.I. and McCulloch, R.E. (1997). Approaches to Bayesian variable selection. Statistica Sinica 7, 339-373. · Zbl 0884.62031
[17] Greenshtein, E. and Ritov, Y. (2004). Persistence in high-dimensional linear predictor selection and the virtue of overparametrization. Bernoulli 10, 971-988. · Zbl 1055.62078 · doi:10.3150/bj/1106314846
[18] Hall, P. and Jin, J. (2010). Innovated higher criticism for detecting sparse signals in correlated noise. Ann. Statist. 38, 1681-1732. · Zbl 1189.62080 · doi:10.1214/09-AOS764
[19] Johnstone, I.M. (2002). Function Estimation and Gaussian Sequence Models. Unpublished manuscript. · Zbl 1037.91527
[20] Liang, F., Paulo, R., Molina, G., Clyde, M. and Berger, J.O. (2008). Mixtures of g priors for Bayesian variable selection. J. Am. Statist. Assoc. 103, 410-423. · Zbl 1335.62026 · doi:10.1198/016214507000001337
[21] Meinshausen, N. and Yu, B. (2009). Lasso-type recovery of sparse representations for high-dimensional data. Ann. Statist. 37, 246-270. · Zbl 1155.62050 · doi:10.1214/07-AOS582
[22] Raskutti, G., Wainwright, M.J. and Yu, B. (2009). Minimax rates of estimation for high-dimensional regression over l_q balls. Technical Report, UC Berkeley, http://arxiv.org/abs/0910.2042.
[23] Rigollet, P. and Tsybakov, A. (2010). Exponential screening and optimal rates of sparse estimation. http://arxiv.org/pdf/1003.2654. · Zbl 1215.62043 · doi:10.1214/10-AOS854
[24] Schwarz, G. (1978). Estimating the dimension of a model. Ann. Statist. 6, 461-464. · Zbl 0379.62005 · doi:10.1214/aos/1176344136
[25] Tibshirani, R. (1996). Regression shrinkage and selection via the Lasso. J. Roy. Statist. Soc. Ser. B 58, 267-288. · Zbl 0850.62538
[26] Tropp, J.A. and Wright, S.J. (2010). Computational methods for sparse solution of linear inverse problems. Proc. IEEE, special issue “Applications of sparse representation and compressive sensing”.
[27] Tsybakov, A. (2009). Introduction to Nonparametric Estimation. Springer. · Zbl 1176.62032
[28] Zellner, A. (1986). On assessing prior distributions and Bayesian regression analysis with g-prior distributions. In Bayesian Inference and Decision Techniques: Essays in Honor of Bruno de Finetti (eds. Goel, P.K. and Zellner, A.), North-Holland, Amsterdam, 233-243. · Zbl 0655.62071