## Variational message passing for elaborate response regression models.
*(English)*
Zbl 1416.62221

Summary: We build on recent work concerning message passing approaches to approximate fitting and inference for arbitrarily large regression models. The focus is on regression models where the response variable is modeled to have an elaborate distribution, which is loosely defined to mean a distribution that is more complicated than common distributions such as those in the Bernoulli, Poisson and Normal families. Examples of elaborate response families considered here are the Negative Binomial and \(t\) families. Variational message passing is more challenging due to some of the conjugate exponential families being non-standard and numerical integration being needed. Nevertheless, a factor graph fragment approach means the requisite calculations only need to be done once for a particular elaborate response distribution family. Computer code can be compartmentalized, including that involving numerical integration. A major finding of this work is that the modularity of variational message passing extends to elaborate response regression models.

### MSC:

62G08 | Nonparametric regression and quantile regression |

62F15 | Bayesian inference |

62J05 | Linear regression; mixed models |

### Keywords:

Bayesian computing; factor graph; generalized additive models; generalized linear mixed models; mean field variational Bayes; support vector machine classification

*M. W. McLean* and *M. P. Wand*, Bayesian Anal. 14, No. 2, 371–398 (2019; Zbl 1416.62221)

### References:

[1] | Azzalini, A. (2017). The R package sn: The skew-normal and related distributions, such as the skew-t (version 1.5). URL http://azzalini.stat.unipd.it/SN |

[2] | Azzalini, A. and Dalla Valle, A. (1996). “The multivariate skew-normal distribution.” Biometrika, 83: 715-726. · Zbl 0885.62062 · doi:10.1093/biomet/83.4.715 |

[3] | Frühwirth-Schnatter, S., Frühwirth, R., Held, L., and Rue, H. (2009). “Improved auxiliary mixture sampling for hierarchical models of non-Gaussian data.” Statistics and Computing, 19: 479-492. |

[4] | Frühwirth-Schnatter, S. and Pyne, S. (2010). “Bayesian inference for finite mixtures of univariate and multivariate skew-normal and skew-t distributions.” Biostatistics, 11: 317-336. · Zbl 1437.62465 |

[5] | Frühwirth-Schnatter, S. and Wagner, H. (2006). “Auxiliary mixture sampling for parameter-driven models of time series of counts with applications to state space modelling.” Biometrika, 93: 827-841. · Zbl 1436.62421 · doi:10.1093/biomet/93.4.827 |

[6] | Hoffman, M. D., Blei, D. M., Wang, C., and Paisley, J. W. (2013). “Stochastic variational inference.” Journal of Machine Learning Research, 14: 1303-1347. · Zbl 1317.68163 |

[7] | Knowles, D. A. and Minka, T. (2011). “Non-conjugate variational message passing for multinomial and binary regression.” In Advances in Neural Information Processing Systems, 1701-1709. |

[8] | Kotz, S., Kozubowski, T. J., and Podgórski, K. (2001). The Laplace Distribution and Generalizations. Boston: Birkhäuser. · Zbl 0977.62003 |

[9] | Kucukelbir, A., Tran, D., Ranganath, R., Gelman, A., and Blei, D. M. (2017). “Automatic differentiation variational inference.” Journal of Machine Learning Research, 18: 1-45. · Zbl 1437.62109 |

[10] | Lachos, V. H., Ghosh, P., and Arellano-Valle, R. B. (2010). “Likelihood based inference for skew-normal independent linear mixed models.” Statistica Sinica, 303-322. · Zbl 1186.62071 |

[11] | Lange, K. L., Little, R. J. A., and Taylor, J. M. G. (1989). “Robust statistical modeling using the t distribution.” Journal of the American Statistical Association, 84: 881-896. |

[12] | Luts, J. and Ormerod, J. T. (2014). “Mean field variational Bayesian inference for support vector machine classification.” Computational Statistics & Data Analysis, 73: 163-176. · Zbl 1506.62120 · doi:10.1016/j.csda.2013.10.030 |

[13] | Luts, J. and Wand, M. P. (2015). “Variational inference for count response semiparametric regression.” Bayesian Analysis, 10: 991-1023. · Zbl 1335.62054 · doi:10.1214/14-BA932 |

[14] | McLean, M. W. and Wand, M. P. (2018). “Supplement for: Variational Message Passing for Elaborate Response Regression Models.” Bayesian Analysis. · Zbl 1416.62221 |

[15] | Minka, T. (2005). “Divergence measures and message passing.” Microsoft Research Technical Report Series, MSR-TR-2005-173: 1-17. |

[16] | Minka, T. and Winn, J. (2008). “Gates: A graphical notation for mixture models.” Microsoft Research Technical Report Series, MSR-TR-2008-185: 1-16. |

[17] | Nadarajah, S. (2008). “A new model for symmetric and skewed data.” Probability in the Engineering and Informational Sciences, 22: 261-271. · Zbl 1139.60301 · doi:10.1017/S0269964808000156 |

[18] | Ormerod, J. T. and Wand, M. P. (2010). “Explaining variational approximations.” The American Statistician, 64: 140-153. · Zbl 1200.65007 · doi:10.1198/tast.2010.09058 |

[19] | Polson, N. G. and Scott, S. L. (2011). “Data augmentation for support vector machines.” Bayesian Analysis, 6: 1-23. · Zbl 1330.62258 · doi:10.1214/11-BA601 |

[20] | R Core Team (2017). R: A Language and Environment for Statistical Computing. R Foundation for Statistical Computing, Vienna, Austria. URL https://www.R-project.org/ |

[21] | Rue, H., Martino, S., Lindgren, F., Simpson, D., Riebler, A., and Krainski, E. (2016). The R package ‘INLA’: Functions which allow to perform full Bayesian analysis of latent Gaussian models using integrated nested Laplace approximation (version 0.0). URL http://www.r-inla.org |

[22] | Ruppert, D., Wand, M. P., and Carroll, R. J. (2003). Semiparametric Regression. New York: Cambridge University Press. · Zbl 1038.62042 |

[23] | Tipping, M. E. and Lawrence, N. D. (2003). “A variational approach to robust Bayesian interpolation.” In Institute of Electrical and Electronics Engineers Workshop of Neural Networks for Signal Processing, 229-238. |

[24] | Titsias, M. K. and Lázaro-Gredilla, M. (2014). “Doubly stochastic variational Bayes for non-conjugate inference.” Proceedings of Machine Learning Research, 32: 1971-1979. |

[25] | Verdinelli, I. and Wasserman, L. (1991). “Bayesian analysis of outlier problems using the Gibbs sampler.” Statistics and Computing, 1: 105-117. |

[26] | Wand, M. P. (2017). “Fast approximate inference for arbitrarily large semiparametric regression models via message passing (with discussion).” Journal of the American Statistical Association, 112: 137-168. · doi:10.1080/01621459.2016.1197833 |

[27] | Wand, M. P. and Ormerod, J. T. (2008). “On semiparametric regression with O’Sullivan penalized splines.” Australian & New Zealand Journal of Statistics, 50: 179-198. · Zbl 1146.62030 · doi:10.1111/j.1467-842X.2008.00507.x |

[28] | Wand, M. P., Ormerod, J. T., Padoan, S. A., and Frühwirth, R. F. (2011). “Mean field variational Bayes for elaborate distributions.” Bayesian Analysis, 6: 847-900. · Zbl 1330.62158 · doi:10.1214/11-BA631 |

[29] | Winn, J. and Bishop, C. M. (2005). “Variational message passing.” Journal of Machine Learning Research, 6: 661-694. · Zbl 1222.68332 |

[30] | Yang, Y., Wang, H. J., and He, X. (2016). “Posterior inference in Bayesian quantile regression with Asymmetric Laplace likelihood.” International Statistical Review, 84: 327-344. · Zbl 07763523 |

[31] | Yu, K. and Moyeed, R. A. (2001). “Bayesian quantile regression.” Statistics and Probability Letters, 54: 437-447. · Zbl 0983.62017 · doi:10.1016/S0167-7152(01)00124-9 |
