zbMATH — the first resource for mathematics

Deep variational inference. (English) Zbl 07225681
Grohs, Philipp (ed.) et al., Handbook of variational methods for nonlinear geometric data. Cham: Springer. 361-376 (2020).
Summary: This chapter begins with a review of variational inference (VI), a fast alternative to Markov chain Monte Carlo (MCMC) methods that approximates the posterior by solving an optimization problem. VI is then scaled up via stochastic variational inference and generalized to black-box variational inference (BBVI). Amortized VI leads to the variational auto-encoder (VAE) framework, which is introduced in terms of deep neural networks and graphical models and applied to representation learning and generative modeling. Finally, the chapter explores generative flows, the latent-space manifold, and the Riemannian geometry of generative models.
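The core idea summarized above (VI as an optimization problem, solved with reparameterization gradients as in the VAE framework) can be illustrated with a minimal sketch. This is not code from the chapter: it fits a Gaussian variational family q(z) = N(mu, sigma²) to a known (unnormalized) Gaussian target by stochastic gradient ascent on a Monte Carlo ELBO, with the target parameters `m`, `s` and all hyperparameters chosen purely for illustration.

```python
import numpy as np

# Illustrative sketch: fit q(z) = N(mu, sigma^2) to a Gaussian target
# p(z) ~ N(2.0, 0.5^2) by ascending a Monte Carlo ELBO estimate, using the
# reparameterization z = mu + sigma * eps with eps ~ N(0, 1).
rng = np.random.default_rng(0)
m, s = 2.0, 0.5           # target mean and std (stands in for the posterior)
mu, log_sigma = 0.0, 0.0  # variational parameters
lr, n_samples = 0.05, 64

for step in range(2000):
    sigma = np.exp(log_sigma)
    eps = rng.standard_normal(n_samples)
    z = mu + sigma * eps                  # reparameterized samples from q
    dlogp_dz = -(z - m) / s**2            # d/dz log p(z) for the Gaussian target
    grad_mu = dlogp_dz.mean()             # pathwise gradient, dz/dmu = 1
    # dz/dlog_sigma = sigma * eps; the entropy of q adds +1 per log_sigma
    grad_log_sigma = (dlogp_dz * sigma * eps).mean() + 1.0
    mu += lr * grad_mu
    log_sigma += lr * grad_log_sigma

print(f"mu ~ {mu:.2f}, sigma ~ {np.exp(log_sigma):.2f}")
```

After training, the variational parameters approach the target's mean 2.0 and standard deviation 0.5, since for this toy family the ELBO is maximized exactly when q matches p.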
For the entire collection see [Zbl 07115003].
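The generative flows mentioned in the summary rest on the change-of-variables formula: pushing base samples through an invertible map while subtracting the log-absolute-Jacobian yields exact log-densities. A minimal sketch with a single affine map (a deep flow composes many such layers; the constants `a`, `b` are illustrative):

```python
import numpy as np

# Change-of-variables sketch behind normalizing flows: push z0 ~ N(0, 1)
# through the invertible affine map f(z) = a*z + b and track log-densities
# via log q(z1) = log p0(z0) - log|det df/dz| = log p0(z0) - log|a|.
rng = np.random.default_rng(1)
a, b = 2.0, -1.0

def base_logpdf(z):
    return -0.5 * z**2 - 0.5 * np.log(2 * np.pi)

z0 = rng.standard_normal(10_000)
z1 = a * z0 + b                           # forward pass of the flow
logq = base_logpdf(z0) - np.log(abs(a))   # change of variables

# The transformed density must match N(b, a^2) evaluated at z1.
target = -0.5 * ((z1 - b) / a) ** 2 - 0.5 * np.log(2 * np.pi * a**2)
print(np.max(np.abs(logq - target)))      # ~0 up to floating-point error
```

The discrepancy is zero up to floating-point error because for an affine map the formula is exact; flows with nonlinear layers apply the same identity with a sample-dependent Jacobian.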
MSC:
65Dxx Numerical approximation and computational geometry (primarily algorithms)
65Mxx Numerical methods for partial differential equations, initial value and time-dependent initial-boundary value problems
65Nxx Numerical methods for partial differential equations, boundary value problems
Full Text: DOI
References:
[1] Arvanitidis, G., Hansen, L.K., Hauberg, S.: Latent space oddity: on the curvature of deep generative models. In: International Conference on Learning Representations (2018)
[2] Arvanitidis, G., Hauberg, S., Hennig, P., Schober, M.: Fast and robust shortest paths on manifolds learned from data. In: International Conference on Artificial Intelligence and Statistics (2019)
[3] Behrmann, J., Duvenaud, D., Jacobsen, J.H.: Invertible residual networks. In: International Conference on Machine Learning (2019)
[4] Bellman, R.E., Kagiwada, H., Kalaba, R.E.: Wengert’s numerical method for partial derivatives, orbit determination and quasilinearization. Commun. ACM 8(4), 231-232 (1965) · Zbl 0171.38401
[5] Bingham, E., Chen, J.P., Jankowiak, M., Obermeyer, F., Pradhan, N., Karaletsos, T., Singh, R., Szerlip, P., Horsfall, P., Goodman, N.D.: Pyro: deep universal probabilistic programming. J. Mach. Learn. Res. 20(1), 973-978 (2019) · Zbl 07049747
[6] Bishop, C.M.: Pattern Recognition and Machine Learning. Springer, Berlin (2006) · Zbl 1107.68072
[7] Blei, D.M., Kucukelbir, A., McAuliffe, J.D.: Variational inference: a review for statisticians. J. Am. Stat. Assoc. 112(518), 859-877 (2017)
[8] Bottou, L.: Large-scale machine learning with stochastic gradient descent. In: International Conference on Computational Statistics, pp. 177-186. Springer, Berlin (2010) · Zbl 1436.68293
[9] Bowman, S.R., Vilnis, L., Vinyals, O., Dai, A.M., Jozefowicz, R., Bengio, S.: Generating sentences from a continuous space. In: Conference on Computational Natural Language Learning (2016)
[10] Brooks, S., Gelman, A., Jones, G., Meng, X.-L.: Handbook of Markov Chain Monte Carlo. CRC Press, Boca Raton (2011) · Zbl 1218.65001
[11] Chen, T.Q., Rubanova, Y., Bettencourt, J., Duvenaud, D.K.: Neural ordinary differential equations. In: Advances in Neural Information Processing Systems, pp. 6571-6583 (2018)
[12] Chen, N., Ferroni, F., Klushyn, A., Paraschos, A., Bayer, J., van der Smagt, P.: Fast approximate geodesics for deep generative models (2019). arXiv preprint arXiv:1812.08284
[13] Choi, K., Wu, M., Goodman, N., Ermon, S.: Meta-amortized variational inference and learning. In: International Conference on Learning Representations (2019)
[14] Davidson, T.R., Falorsi, L., De Cao, N., Kipf, T., Tomczak, J.M.: Hyperspherical variational auto-encoders. In: Conference on Uncertainty in Artificial Intelligence (2018)
[15] De Fauw, J., Dieleman, S., Simonyan, K.: Hierarchical autoregressive image models with auxiliary decoders (2019). arXiv preprint arXiv:1903.04933
[16] Dillon, J.V., Langmore, I., Tran, D., Brevdo, E., Vasudevan, S., Moore, D., Patton, B., Alemi, A., Hoffman, M., Saurous, R.A.: Tensorflow distributions (2017). arXiv preprint arXiv:1711.10604
[17] Dinh, L., Sohl-Dickstein, J., Bengio, S.: Density estimation using real NVP. In: International Conference on Learning Representations (2017)
[18] Do Carmo, M.P.: Riemannian Geometry. Birkhäuser, Basel (1992)
[19] Do Carmo, M.P.: Differential Geometry of Curves and Surfaces, 2nd edn. Courier Dover Publications, New York (2016)
[20] Efron, B., Hastie, T.: Computer Age Statistical Inference. Cambridge University Press, Cambridge (2016) · Zbl 1377.62004
[21] Falorsi, L., de Haan, P., Davidson, T.R., Forré, P.: Reparameterizing distributions on Lie groups. In: Proceedings of Machine Learning Research, pp. 3244-3253 (2019)
[22] Figurnov, M., Mohamed, S., Mnih, A.: Implicit reparameterization gradients. In: Advances in Neural Information Processing Systems, pp. 441-452 (2018)
[23] Giordano, R., Broderick, T., Jordan, M.I.: Covariances, robustness and variational Bayes. J. Mach. Learn. Res. 19(1), 1981-2029 (2018) · Zbl 1467.62043
[24] Grathwohl, W., Chen, R.T., Bettencourt, J., Sutskever, I., Duvenaud, D.: FFJORD: free-form continuous dynamics for scalable reversible generative models. In: International Conference on Learning Representations (2019)
[25] Gregor, K., Danihelka, I., Graves, A., Rezende, D.J., Wierstra, D.: Draw: a recurrent neural network for image generation. In: International Conference on Machine Learning (2015)
[26] Hinton, G.E., Salakhutdinov, R.R.: Reducing the dimensionality of data with neural networks. Science 313(5786), 504-507 (2006) · Zbl 1226.68083
[27] Hoffman, M.D., Blei, D.M., Wang, C., Paisley, J.: Stochastic variational inference. J. Mach. Learn. Res. 14(1), 1303-1347 (2013) · Zbl 1317.68163
[28] Holbrook, A.: Geometric Bayes. Ph.D. thesis, UC Irvine (2018)
[29] Jordan, M.I., Ghahramani, Z., Jaakkola, T.S., Saul, L.K.: An introduction to variational methods for graphical models. Mach. Learn. 37(2), 183-233 (1999) · Zbl 0945.68164
[30] Kim, Y., Wiseman, S., Rush, A.M.: A tutorial on deep latent variable models of natural language (2018). arXiv preprint arXiv:1812.06834
[31] Kingma, D.P., Dhariwal, P.: Glow: generative flow with invertible 1 × 1 convolutions. In: Advances in Neural Information Processing Systems, pp. 10215-10224 (2018)
[32] Kingma, D.P., Welling, M.: Auto-encoding variational Bayes. In: International Conference on Learning Representations (2014)
[33] Krainski, E.T., Gómez-Rubio, V., Bakka, H., Lenzi, A., Castro-Camilo, D., Simpson, D., Lindgren, F., Rue, H.: Advanced spatial modeling with stochastic partial differential equations using R and INLA. Chapman and Hall/CRC (2018) · Zbl 1418.62011
[34] Kucukelbir, A., Tran, D., Ranganath, R., Gelman, A., Blei, D.M.: Automatic differentiation variational inference. J. Mach. Learn. Res. 18(1), 430-474 (2017) · Zbl 1437.62109
[35] Li, Y., Turner, R.E.: Rényi divergence variational inference. In: Advances in Neural Information Processing Systems, pp. 1073-1081 (2016)
[36] Mallasto, A., Hauberg, S., Feragen, A.: Probabilistic Riemannian submanifold learning with wrapped Gaussian process latent variable models. In: International Conference on Artificial Intelligence and Statistics (2019)
[37] Gemici, M.C., Rezende, D.J., Mohamed, S.: Normalizing flows on Riemannian manifolds. NeurIPS Bayesian Deep Learning Workshop (2016)
[38] O’Neill, B.: Elementary Differential Geometry. Elsevier, Amsterdam (2006)
[39] Opper, M., Saad, D.: Advanced Mean Field Methods: Theory and Practice. MIT Press, Cambridge (2001) · Zbl 0994.68172
[40] Parisi, G.: Statistical Field Theory. Addison-Wesley, Reading (1988) · Zbl 0984.81515
[41] Paszke, A., Gross, S., Chintala, S., Chanan, G., Yang, E., DeVito, Z., Lin, Z., Desmaison, A., Antiga, L., Lerer, A.: Automatic differentiation in PyTorch (2017)
[42] Ranganath, R., Gerrish, S., Blei, D.: Black box variational inference. In: Artificial Intelligence and Statistics, pp. 814-822 (2014)
[43] Ravuri, S., Vinyals, O.: Classification accuracy score for conditional generative models (2019). arXiv preprint arXiv:1905.10887
[44] Razavi, A., van den Oord, A., Vinyals, O.: Generating diverse high-resolution images with VQ-VAE. In: International Conference on Learning Representations Workshop (2019)
[45] Rezende, D.J., Mohamed, S.: Variational inference with normalizing flows. In: International Conference on Machine Learning (2015)
[46] Rezende, D.J., Mohamed, S., Wierstra, D.: Stochastic backpropagation and approximate inference in deep generative models. In: International Conference on Machine Learning (2014)
[47] Robbins, H., Monro, S.: A stochastic approximation method. Ann. Math. Stat. 22, 400-407 (1951) · Zbl 0054.05901
[48] Roeder, G., Wu, Y., Duvenaud, D.K.: Sticking the landing: simple, lower-variance gradient estimators for variational inference. In: Advances in Neural Information Processing Systems, pp. 6925-6934 (2017)
[49] Rue, H., Martino, S., Chopin, N.: Approximate Bayesian inference for latent Gaussian models by using integrated nested Laplace approximations. J. R. Stat. Soc. Ser. B Stat Methodol. 71(2), 319-392 (2009) · Zbl 1248.62156
[50] Rumelhart, D.E., Hinton, G.E., Williams, R.J., et al.: Learning representations by back-propagating errors. Cogn. Model. 5(3), 1 (1988) · Zbl 1369.68284
[51] Saha, A., Bharath, K., Kurtek, S.: A geometric variational approach to Bayesian inference. J. Am. Stat. Assoc., 1-26 (2019)
[52] Shukla, A., Uppal, S., Bhagat, S., Anand, S., Turaga, P.: Geometry of deep generative models for disentangled representations (2019). arXiv preprint arXiv:1902.06964
[53] Spivak, M.D.: A Comprehensive Introduction to Differential Geometry, 3rd edn. Publish or Perish (1999) · Zbl 1213.53001
[54] Tang, D., Ranganath, R.: The variational predictive natural gradient. In: International Conference on Machine Learning (2019)
[55] Tang, D., Liang, D., Jebara, T., Ruozzi, N.: Correlated variational autoencoders. In: International Conference on Machine Learning (2019)
[56] Ur Rahman, I., Drori, I., Stodden, V.C., Donoho, D.L., Schröder, P.: Multiscale representations for manifold-valued data. Multiscale Model. Simul. 4(4), 1201-1232 (2005) · Zbl 1236.65166
[57] van den Oord, A., Vinyals, O., et al.: Neural discrete representation learning. In: Advances in Neural Information Processing Systems, pp. 6306-6315 (2017)
[58] Wainwright, M.J., Jordan, M.I., et al.: Graphical models, exponential families, and variational inference. Found. Trends Mach. Learn. 1(1-2), 1-305 (2008) · Zbl 1193.62107
[59] Wang, P.Z., Wang, W.Y.: Riemannian normalizing flow on variational Wasserstein autoencoder for text modeling (2019). arXiv preprint arXiv:1904.02399
[60] Wengert, R.E.: A simple automatic derivative evaluation program. Commun. ACM 7(8), 463-464 (1964) · Zbl 0131.34602
[61] Zhang, C.
This reference list is based on information provided by the publisher or from digital mathematics libraries. Its items are heuristically matched to zbMATH identifiers and may contain data conversion errors. It attempts to reflect the references listed in the original paper as accurately as possible without claiming the completeness or perfect precision of the matching.