
Fast and accurate variational inference for models with many latent variables. (English) Zbl 07585119

Summary: Models with a large number of latent variables are often used to exploit the information in big or complex data, but can be difficult to estimate. Variational inference methods provide an attractive solution. These methods use an approximation to the posterior density, yet for large latent variable models existing choices can be inaccurate or slow to calibrate. Here, we propose a family of tractable variational approximations that are more accurate and faster to calibrate for this case. Each member of the family combines a parsimonious approximation for the parameter posterior with the exact conditional posterior of the latent variables. We derive a simplified expression for the re-parameterization gradient of the variational lower bound, which is the main ingredient of the optimization algorithms used for calibration. Implementation only requires exact or approximate generation from the conditional posterior of the latent variables, rather than computation of their density. In effect, our method provides a new way to employ Markov chain Monte Carlo (MCMC) within variational inference. We illustrate the method using two complex contemporary econometric examples. The first is a nonlinear multivariate state space model for U.S. macroeconomic variables. The second is a random coefficients tobit model applied to two million sales by 20,000 individuals in a consumer panel. In both cases, our approximating family is considerably more accurate than mean field or structured Gaussian approximations, and faster than MCMC. Last, we show how to implement data sub-sampling in variational inference for our approximation, further reducing computation time. MATLAB code implementing the method is provided.
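To make the construction in the summary concrete, the following is a minimal Python sketch on a toy model. It is not the authors' MATLAB code. Everything specific in it is an assumption chosen for illustration: the toy hierarchical model (y_i | z_i ~ N(z_i, 1), z_i | theta ~ N(theta, 1), theta ~ N(0, 10^2)), the Gaussian form of the parameter approximation, the function names `fit_hybrid_vi` and `exact_posterior`, and the use of plain fixed-step stochastic gradient ascent in place of an adaptive optimizer. The approximation is q(theta, z) = q0(theta) p(z | theta, y), and each step forms a re-parameterization gradient estimate from one draw of theta and one draw of z from its exact conditional posterior.

```python
import math
import random


def exact_posterior(y, prior_sd=10.0):
    """Closed-form posterior of theta in the toy model (z integrated out).

    Marginally y_i | theta ~ N(theta, 2), so the posterior is Gaussian.
    """
    precision = len(y) / 2.0 + 1.0 / prior_sd**2
    mean = (sum(y) / 2.0) / precision
    return mean, 1.0 / math.sqrt(precision)


def fit_hybrid_vi(y, steps=5000, lr=0.002, prior_sd=10.0, seed=0):
    """Fit q0(theta) = N(mu, sigma^2) with latents drawn from p(z | theta, y).

    Toy model (an assumption for illustration, not taken from the paper):
      y_i | z_i ~ N(z_i, 1),  z_i | theta ~ N(theta, 1),  theta ~ N(0, prior_sd^2).
    """
    rng = random.Random(seed)
    mu, log_sigma = 0.0, 0.0
    for _ in range(steps):
        sigma = math.exp(log_sigma)
        # Re-parameterization: theta = mu + sigma * eps with eps ~ N(0, 1).
        eps = rng.gauss(0.0, 1.0)
        theta = mu + sigma * eps
        # Exact draw from the conditional posterior of the latents:
        # z_i | theta, y_i ~ N((y_i + theta) / 2, 1/2). Only sampling is
        # needed; the density of this conditional is never evaluated.
        z = [rng.gauss(0.5 * (yi + theta), math.sqrt(0.5)) for yi in y]
        # Gradient in theta of log g(theta, z) = log p(y|z) p(z|theta) p(theta).
        grad_log_g = sum(zi - theta for zi in z) - theta / prior_sd**2
        # Gradient in theta of log q0(theta): -(theta - mu) / sigma^2 = -eps / sigma.
        grad_log_q0 = -eps / sigma
        diff = grad_log_g - grad_log_q0
        # Chain rule: d theta / d mu = 1 and d theta / d log_sigma = sigma * eps.
        mu += lr * diff
        log_sigma += lr * diff * sigma * eps
    return mu, math.exp(log_sigma)


if __name__ == "__main__":
    data_rng = random.Random(1)
    y = [data_rng.gauss(1.0, math.sqrt(2.0)) for _ in range(20)]
    print("variational (mean, sd):", fit_hybrid_vi(y))
    print("exact       (mean, sd):", exact_posterior(y))
```

In this toy case the exact posterior of theta is itself Gaussian, so the fitted mean and standard deviation can be checked against `exact_posterior`. In models where the conditional posterior of the latent variables cannot be sampled exactly, the summary indicates that approximate generation (for example, a few MCMC steps) can take the place of the exact draw of `z`.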

MSC:

62-XX Statistics
91-XX Game theory, economics, finance, and other social and behavioral sciences

Software:

ADVI; ADADELTA; bfa
