Mean field variational Bayes for continuous sparse signal shrinkage: pitfalls and remedies. (English) Zbl 1298.62050

Summary: We investigate mean field variational approximate Bayesian inference for models that use continuous distributions, Horseshoe, Negative-Exponential-Gamma and Generalized Double Pareto, for sparse signal shrinkage. Our principal finding is that the most natural, and simplest, mean field variational Bayes algorithm can perform quite poorly due to posterior dependence among auxiliary variables. More sophisticated algorithms, based on special functions, are shown to be superior. Continued fraction approximations via Lentz’s Algorithm are developed to make the algorithms practical.
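As an illustration of the continued fraction technique named in the summary, the following is a minimal sketch of the modified Lentz algorithm in its standard textbook form. It is not taken from the paper: the function names `lentz` and `exp_e1`, the tolerance, and the iteration cap are assumptions made for this example, and the continued fraction used for e^x E_1(x) is the classical one for the exponential integral, which may differ from the fractions the authors actually develop.

```python
import math


def lentz(b0, a, b, tol=1e-12, max_iter=10_000, tiny=1e-30):
    """Evaluate b0 + a(1)/(b(1) + a(2)/(b(2) + ...)) by the modified Lentz method.

    `a` and `b` are callables returning the j-th partial numerator and
    denominator (j = 1, 2, ...). `tiny` replaces exact zeros so that the
    recurrences never divide by zero.
    """
    f = b0 if b0 != 0.0 else tiny
    c = f
    d = 0.0
    for j in range(1, max_iter + 1):
        aj, bj = a(j), b(j)
        d = bj + aj * d
        if d == 0.0:
            d = tiny
        c = bj + aj / c
        if c == 0.0:
            c = tiny
        d = 1.0 / d
        delta = c * d
        f *= delta
        # Stop once successive convergents agree to the requested tolerance.
        if abs(delta - 1.0) < tol:
            return f
    raise RuntimeError("continued fraction did not converge")


def exp_e1(x):
    """Illustrative use: e^x * E_1(x) for x > 0 via the classical fraction
    1/(x+1 - 1/(x+3 - 4/(x+5 - 9/(x+7 - ...))))."""
    return lentz(
        0.0,
        lambda j: 1.0 if j == 1 else -float((j - 1) ** 2),
        lambda j: x + 2.0 * j - 1.0,
    )


# Simple sanity check: 1 + 1/(1 + 1/(1 + ...)) is the golden ratio.
golden = lentz(1.0, lambda j: 1.0, lambda j: 1.0)
```

Working with the ratio form keeps every intermediate quantity at a moderate scale, which is the usual reason for preferring Lentz-type recurrences over direct evaluation of special functions whose values overflow or underflow.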

MSC:

62F15 Bayesian inference
62J07 Ridge regression; shrinkage estimators (Lasso)
