zbMATH — the first resource for mathematics

Bayesian semiparametric latent variable model with DP prior for joint analysis: implementation with nimble. (English) Zbl 07290027
Summary: Multiple responses of mixed types arise naturally in many data analysis problems and should be analysed jointly to gain efficiency. The latent variable model is an efficient approach to joint modelling: it induces dependence among the mixed outcomes through a shared latent variable. The latent variable is usually assumed to be normal, an assumption that is often too inflexible to be realistic in practice. This tutorial article demonstrates how to jointly analyse mixed continuous and ordinal responses using a semiparametric latent variable model that places a Dirichlet process (DP) prior on the distribution of the latent variable, and illustrates how to implement Bayesian inference with the R package nimble. Two model comparison criteria, the deviance information criterion (DIC) and the logarithm of the pseudo-marginal likelihood (LPML), are used for model selection. Simulated data and data from a social survey study illustrate the proposed method with nimble. An extension from the DP prior to a DP mixture prior is also introduced.
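As a rough illustration of the model structure described in the summary (not the authors' nimble code), the following Python sketch simulates mixed continuous and ordinal responses that share a latent variable drawn from a truncated stick-breaking approximation to a DP. The standard normal base measure, the truncation level, the loadings, the cutpoints, and all function names are assumptions chosen for illustration only.

```python
import random


def stick_breaking_weights(alpha, truncation, rng):
    """Weights of a truncated stick-breaking construction of a DP."""
    weights = []
    remaining = 1.0  # length of the stick not yet assigned
    for _ in range(truncation - 1):
        v = rng.betavariate(1.0, alpha)  # break off a Beta(1, alpha) fraction
        weights.append(remaining * v)
        remaining *= 1.0 - v
    weights.append(remaining)  # last atom takes the leftover so weights sum to 1
    return weights


def draw_latent(n, alpha, truncation, rng):
    """Draw n latent values from the truncated DP (assumed N(0, 1) base measure)."""
    weights = stick_breaking_weights(alpha, truncation, rng)
    atoms = [rng.gauss(0.0, 1.0) for _ in range(truncation)]
    labels = rng.choices(range(truncation), weights=weights, k=n)
    return [atoms[k] for k in labels]


def simulate_mixed(n, alpha=1.0, truncation=20, cutpoints=(-0.5, 0.5), seed=1):
    """Simulate (continuous, ordinal) pairs linked by a shared latent variable."""
    rng = random.Random(seed)
    latent = draw_latent(n, alpha, truncation, rng)
    data = []
    for z in latent:
        # Continuous response: hypothetical intercept 1.0 and loading 0.8.
        y_cont = 1.0 + 0.8 * z + rng.gauss(0.0, 0.5)
        # Ordinal response: threshold an underlying propensity (loading 0.6).
        y_star = 0.6 * z + rng.gauss(0.0, 1.0)
        y_ord = 1 + sum(y_star > c for c in cutpoints)  # categories 1..3
        data.append((y_cont, y_ord))
    return data


if __name__ == "__main__":
    for y_cont, y_ord in simulate_mixed(5):
        print(round(y_cont, 3), y_ord)
```

Because draws from a DP are discrete, several subjects share the same latent atom. This clustering is what lets the latent distribution depart from normality, and a smaller `alpha` concentrates the mass on fewer atoms.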
MSC:
62 Statistics
Software:
CODA; JAGS; nimble; R; Stan; WinBUGS
References:
[1] Antoniak, CE (1974) Mixtures of Dirichlet processes with applications to Bayesian nonparametric problems. The Annals of Statistics, 2, 1152-74. · Zbl 0335.60034
[2] Azzalini, A (1985) A class of distributions which includes the normal ones. Scandinavian Journal of Statistics, 12, 171-78. · Zbl 0581.62014
[3] Azzalini, A, Capitanio, A (1999) Statistical applications of the multivariate skew normal distribution. Journal of the Royal Statistical Society: Statistical Methodology, Series B, 61, 579-602. · Zbl 0924.62050
[4] Baghfalaki, T, Ganjali, M (2011) An EM estimation approach for analyzing bivariate skew normal data with non monotone missing values. Communications in Statistics: Theory and Methods, 40, 1671-86. · Zbl 1220.62062
[5] Celeux, G, Forbes, F, Robert, CP, Titterington, DM (2006) Deviance information criteria for missing data models. Bayesian Analysis, 1, 651-73. · Zbl 1331.62329
[6] de Valpine, P, Turek, D, Paciorek, CJ, Anderson-Bergman, C, Lang, DT, Bodik, R (2017) Programming with models: Writing statistical algorithms for general model structures with nimble. Journal of Computational and Graphical Statistics, 26, 403-13.
[7] Escobar, MD (1994) Estimating normal means with a Dirichlet process prior. Journal of the American Statistical Association, 89, 268-77. · Zbl 0791.62039
[8] Ferguson, TS (1973) A Bayesian analysis of some nonparametric problems. The Annals of Statistics, 1, 209-30. · Zbl 0255.62037
[9] Gelman, A, Rubin, DB (1992) Inference from iterative simulation using multiple sequences. Statistical Science, 7, 457-72. · Zbl 1386.65060
[10] Gill, J, Casella, G (2009) Nonparametric priors for ordinal Bayesian social science models: Specification and estimation. Journal of the American Statistical Association, 104, 453-54. · Zbl 1388.62377
[11] Hwang, BS, Pennell, ML (2014) Semiparametric Bayesian joint modeling of a binary and continuous outcome with applications in toxicological risk assessment. Statistics in Medicine, 33, 1162-75.
[12] Ibrahim, JG, Chen, M-H, Sinha, D (2001) Criterion-based methods for Bayesian model assessment. Statistica Sinica, 11, 419-43. · Zbl 1037.62017
[13] Ishwaran, H, James, LF (2001) Gibbs sampling methods for stick-breaking priors. Journal of the American Statistical Association, 96, 161-73. · Zbl 1014.62006
[14] Ishwaran, H, Zarepour, M (2000) Markov Chain Monte Carlo in approximate Dirichlet and beta two-parameter process hierarchical models. Biometrika, 87, 371-90. · Zbl 0949.62037
[15] Kano, Y, Berkane, M, Bentler, PM (1993) Statistical inference based on pseudo-maximum likelihood estimators in elliptical populations. Journal of the American Statistical Association, 88, 135-43. · Zbl 0771.62044
[16] Lee, S-Y, Lu, B, Song, X-Y (2008) Semiparametric Bayesian analysis for structural equation model with fixed covariates. Statistics in Medicine, 27, 2341-60.
[17] Lee, S-Y, Xia, Y-M (2006) Maximum likelihood methods in treating outliers and symmetrically heavy-tailed distributions for nonlinear structural equation models with missing data. Psychometrika, 71, 565-85. · Zbl 1306.62462
[18] Lin, TI, Ho, HJ, Chen, CL (2009) Analysis of multivariate skew normal models with incomplete data. Journal of Multivariate Analysis, 100, 2337-51. · Zbl 1175.62054
[19] Lu, X, Huang, Y (2014) Bayesian analysis of nonlinear mixed-effects mixture models for longitudinal data with heterogeneity and skewness. Statistics in Medicine, 33, 2830-49.
[20] Ma, Z, Chen, G (2018) Bayesian methods for dealing with missing data problems. Journal of the Korean Statistical Society, 47, 297-313. · Zbl 1395.62055
[21] MacEachern, SN, Müller, P (1998) Estimating mixture of Dirichlet process models. Journal of Computational and Graphical Statistics, 7, 223-38.
[22] McCulloch, C (2008) Joint modelling of mixed outcome types using latent variables. Statistical Methods in Medical Research, 17, 53-73. · Zbl 1154.62339
[23] Moustaki, I, Knott, M (2000) Generalized latent trait models. Psychometrika, 65, 391-411. · Zbl 1291.62236
[24] Müller, P, Quintana, FA, Jara, A, Hanson, T (2015) Bayesian Nonparametric Data Analysis. Berlin: Springer. · Zbl 1333.62003
[25] Plummer, M (2003) JAGS: A program for analysis of Bayesian graphical models using Gibbs sampling. In Proceedings of the 3rd international workshop on ‘Distributed Statistical Computing’, edited by K Hornik, F Leisch and A Zeileis, 20-22 March 2003, Vienna, Austria. URL https://www.r-project.org/conferences/DSC-2003/Proceedings/Plummer.pdf (last accessed 11 December 2018).
[26] Plummer, M, Best, N, Cowles, K, Vines, K, Plummer, MM (2006) The coda package. International Agency for Research on Cancer, France (cited 30 December 2004) URL: http://www-fis.iarc.fr/coda (last accessed 11 December 2018).
[27] Rodriguez, A, Müller, P (2013) Nonparametric Bayesian inference. NSF-CBMS Regional Conference Series in Probability and Statistics, 9, i-110. URL http://www.jstor.org/stable/nsfcbmsregconf.9.01 (last accessed 7 December 2018). · Zbl 1317.62045
[28] Sammel, MD, Legler, JM (1997) Latent variable models for mixed discrete and continuous outcomes. Journal of the Royal Statistical Society: Series B (Statistical Methodology), 59, 667-78. · Zbl 0889.62043
[29] Sethuraman, J (1994) A constructive definition of Dirichlet priors. Statistica Sinica, 4, 639-50. · Zbl 0823.62007
[30] Shapiro, A, Browne, MW (1987) Analysis of covariance structures under elliptical distributions. Journal of the American Statistical Association, 82, 1092-97. · Zbl 0645.62056
[31] Song, PX-K (2007) Correlated Data Analysis: Modeling, Analytics, and Applications. Berlin: Springer Science + Business Media. · Zbl 1132.62002
[32] Spiegelhalter, DJ, Thomas, A, Best, NG, Lunn, D (2003) WinBUGS Version 1.4.1 User Manual. MRC Biostatistics Unit, University of Cambridge. URL https://www.mrcbsu.cam.ac.uk/software/bugs/the-bugs-project-winbugs/ (last accessed 7 December 2018).
[33] Stan Development Team (2017) Stan Modeling Language User's Guide and Reference Manual Version 2.17.0. URL http://mc-stan.org (last accessed 7 December 2018).
[34] Teimourian, M, Baghfalaki, T, Ganjali, M, Berridge, D (2015) Joint modeling of mixed skewed continuous and ordinal longitudinal responses: A Bayesian approach. Journal of Applied Statistics, 42, 2233-56. · Zbl 07269689
[35] Wu, B (2013) Contributions to Copula Modeling of Mixed Discrete-Continuous Outcomes. PhD thesis, University of Calgary, Calgary, Canada.
[36] Xia, Y, Gou, J (2016) Bayesian semiparametric analysis for latent variable models with mixed continuous and ordinal outcomes. Journal of the Korean Statistical Society, 45, 451-65. · Zbl 1342.62103
This reference list is based on information provided by the publisher or from digital mathematics libraries. Its items are heuristically matched to zbMATH identifiers and may contain data conversion errors. It attempts to reflect the references listed in the original paper as accurately as possible without claiming the completeness or perfect precision of the matching.