Monte Carlo co-ordinate ascent variational inference. (English) Zbl 1447.62030

Summary: In variational inference (VI), co-ordinate ascent and gradient-based approaches are the two major types of algorithms for approximating difficult-to-compute probability densities. In real-world implementations of complex models, Monte Carlo methods are widely used to estimate expectations in co-ordinate ascent approaches and gradients in derivative-driven ones. We discuss a Monte Carlo co-ordinate ascent VI (MC-CAVI) algorithm that makes use of Markov chain Monte Carlo (MCMC) methods in the calculation of expectations required within co-ordinate ascent VI (CAVI). We show that, under regularity conditions, an MC-CAVI recursion will get arbitrarily close to a maximiser of the evidence lower bound with any given high probability. In numerical examples, the performance of the MC-CAVI algorithm is compared with that of MCMC and, as a representative of derivative-based VI methods, of Black Box VI (BBVI). We discuss and demonstrate MC-CAVI's suitability for models with hard constraints in simulated and real examples. We compare MC-CAVI's performance with that of MCMC in an important complex model used in nuclear magnetic resonance spectroscopy data analysis; BBVI is nearly impossible to employ in this setting due to the hard constraints involved in the model.
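The idea of MC-CAVI — running the usual CAVI co-ordinate updates, but replacing each intractable expectation with a Monte Carlo estimate drawn from the current variational factor — can be illustrated on a toy model not taken from the paper. The sketch below assumes a simple Normal–Gamma setup (data x_i ~ N(mu, tau⁻¹) with independent priors mu ~ N(0, 1), tau ~ Gamma(a0, b0)) and a mean-field factorisation q(mu)q(tau); it uses direct Monte Carlo draws from the current factors where the paper employs MCMC, so it is only a schematic of the recursion, not the authors' algorithm:

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic data: x_i ~ N(2, 1), so the true precision is tau = 1.
x = rng.normal(2.0, 1.0, size=500)
n, sx = len(x), x.sum()

# Priors (assumed for this toy sketch): mu ~ N(0, 1), tau ~ Gamma(a0, b0).
a0, b0 = 1.0, 1.0

# Variational factors: q(mu) = N(m, s2), q(tau) = Gamma(a, b).
m, s2 = 0.0, 1.0
a, b = a0, b0

S = 2000  # Monte Carlo sample size per co-ordinate update
for _ in range(50):
    # Update q(mu): needs E_q[tau]; MC-CAVI estimates it from samples
    # of the current q(tau) (exact CAVI would use the closed form a / b).
    tau_samples = rng.gamma(a, 1.0 / b, size=S)
    E_tau = tau_samples.mean()
    prec = n * E_tau + 1.0          # variational precision of mu
    m, s2 = E_tau * sx / prec, 1.0 / prec

    # Update q(tau): needs E_q[sum_i (x_i - mu)^2], again estimated
    # by averaging over samples of the current q(mu).
    mu_samples = rng.normal(m, np.sqrt(s2), size=S)
    E_ss = ((x[None, :] - mu_samples[:, None]) ** 2).mean(axis=0).sum()
    a, b = a0 + n / 2.0, b0 + 0.5 * E_ss

print(m, a / b)  # approximate posterior mean of mu and posterior mean of tau
```

Because each expectation is only estimated, the iterates fluctuate around the exact CAVI fixed point; the paper's convergence result formalises when such a recursion gets arbitrarily close to an ELBO maximiser with high probability.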


62F15 Bayesian inference
62F12 Asymptotic properties of parametric estimators
65C05 Monte Carlo methods
62P35 Applications of statistics to physics


Full Text: DOI arXiv


[1] Astle, W.; De Iorio, M.; Richardson, S.; Stephens, D.; Ebbels, T., A Bayesian model of NMR spectra for the deconvolution and quantification of metabolites in complex biological mixtures, J. Am. Stat. Assoc., 107, 500, 1259-1271 (2012) · Zbl 1284.62163
[2] Beaumont, MA; Zhang, W.; Balding, DJ, Approximate Bayesian computation in population genetics, Genetics, 162, 4, 2025-2035 (2002)
[3] Bertsekas, DP, Nonlinear Programming (1999), Belmont: Athena Scientific, Belmont
[4] Bishop, CM, Pattern Recognition and Machine Learning (2006), Berlin: Springer, Berlin
[5] Blei, DM; Kucukelbir, A.; McAuliffe, JD, Variational inference: a review for statisticians, J. Am. Stat. Assoc., 112, 518, 859-877 (2017)
[6] Booth, JG; Hobert, JP, Maximizing generalized linear mixed model likelihoods with an automated Monte Carlo EM algorithm, J. R. Stat. Soc. Ser. B (Statistical Methodology), 61, 1, 265-285 (1999) · Zbl 0917.62058
[7] Bottou, L., Stochastic Gradient Descent Tricks, 421-436 (2012), Berlin: Springer, Berlin
[8] Casella, G.; Robert, CP, Rao-Blackwellisation of sampling schemes, Biometrika, 83, 1, 81-94 (1996) · Zbl 0866.62024
[9] Chan, K.; Ledolter, J., Monte Carlo EM estimation for time series models involving counts, J. Am. Stat. Assoc., 90, 429, 242-252 (1995) · Zbl 0819.62069
[10] Duchi, J.; Hazan, E.; Singer, Y., Adaptive subgradient methods for online learning and stochastic optimization, J. Mach. Learn. Res., 12, Jul, 2121-2159 (2011) · Zbl 1280.68164
[11] Forbes, F.; Fort, G., Combining Monte Carlo and mean-field-like methods for inference in hidden Markov random fields, IEEE Trans. Image Process., 16, 3, 824-837 (2007)
[12] Fort, G.; Moulines, E., Convergence of the Monte Carlo expectation maximization for curved exponential families, Ann. Stat., 31, 4, 1220-1259 (2003) · Zbl 1043.62015
[13] Hao, J.; Astle, W.; De Iorio, M.; Ebbels, TM, BATMAN—an R package for the automated quantification of metabolites from nuclear magnetic resonance spectra using a Bayesian model, Bioinformatics, 28, 15, 2088-2090 (2012)
[14] Hoffman, MD; Blei, DM; Wang, C.; Paisley, J., Stochastic variational inference, J. Mach. Learn. Res., 14, 1, 1303-1347 (2013) · Zbl 1317.68163
[15] Hore, PJ, Nuclear Magnetic Resonance (2015), Oxford: Oxford University Press, Oxford
[16] Jordan, MI; Ghahramani, Z.; Jaakkola, TS; Saul, LK, An introduction to variational methods for graphical models, Mach. Learn., 37, 2, 183-233 (1999) · Zbl 0945.68164
[17] Kucukelbir, A.; Tran, D.; Ranganath, R.; Gelman, A.; Blei, DM, Automatic differentiation variational inference, J. Mach. Learn. Res., 18, 1, 430-474 (2017)
[18] Levine, RA; Casella, G., Implementations of the Monte Carlo EM algorithm, J. Comput. Graph. Stat., 10, 3, 422-439 (2001)
[19] Ranganath, R.; Gerrish, S.; Blei, D., Black box variational inference, Artif. Intell. Stat., 33, 814-822 (2014)
[20] Robbins, H.; Monro, S., A stochastic approximation method, Ann. Math. Stat., 22, 3, 400-407 (1951) · Zbl 0054.05901
[21] Ross, SM, Simulation (2002), Amsterdam: Elsevier, Amsterdam
[22] Sisson, SA; Fan, Y.; Tanaka, MM, Sequential Monte Carlo without likelihoods, Proc. Nat. Acad. Sci., 104, 6, 1760-1765 (2007) · Zbl 1160.65005
[23] Tran, M-N; Nott, DJ; Kuk, AY; Kohn, R., Parallel variational Bayes for large datasets with an application to generalized linear mixed models, J. Comput. Graph. Stat., 25, 2, 626-646 (2016)
[24] Tran, M.-N., Nguyen, D.H., Nguyen, D.: Variational Bayes on Manifolds (2019). arXiv:1908.03097
[25] Wainwright, MJ; Jordan, MI, Graphical Models, Exponential Families, and Variational Inference (2008), Hanover: Now Publishers, Inc., Hanover
[26] Wei, GC; Tanner, MA, A Monte Carlo implementation of the EM algorithm and the poor man’s data augmentation algorithms, J. Am. Stat. Assoc., 85, 411, 699-704 (1990)
[27] Wishart, DS; Tzur, D.; Knox, C.; Eisner, R.; Guo, AC; Young, N.; Cheng, D.; Jewell, K.; Arndt, D.; Sawhney, S., HMDB: the human metabolome database, Nucl. Acids Res., 35, suppl. 1, D521-D526 (2007)
[28] Wishart, DS; Knox, C.; Guo, AC; Eisner, R.; Young, N.; Gautam, B.; Hau, DD; Psychogios, N.; Dong, E.; Bouatra, S., HMDB: a knowledgebase for the human metabolome, Nucl. Acids Res., 37, suppl. 1, D603-D610 (2008)
[29] Wishart, DS; Jewison, T.; Guo, AC; Wilson, M.; Knox, C.; Liu, Y.; Djoumbou, Y.; Mandal, R.; Aziat, F.; Dong, E., HMDB 3.0—the human metabolome database in 2013, Nucl. Acids Res., 41, 1, D801-D807 (2012)
[30] Wishart, DS; Feunang, YD; Marcu, A.; Guo, AC; Liang, K.; Vázquez-Fresno, R.; Sajed, T.; Johnson, D.; Li, C.; Karu, N., HMDB 4.0: the human metabolome database for 2018, Nucl. Acids Res., 46, 1, D608-D617 (2017)