Modelling multivariate, overdispersed binomial data with additive and multiplicative random effects. (English) Zbl 07257898

Summary: When modelling multivariate binomial data, it is often necessary to account for both clustering and overdispersion: the former arises from dependence between observations, the latter from additional variability in the data not prescribed by the distribution. To accommodate both phenomena at the same time, separate sets of random effects can be used to capture the within-cluster association and the extra variability. In particular, the random effects for overdispersion can be included in the model either additively or multiplicatively. For this purpose, we propose a series of Bayesian hierarchical models that deal simultaneously with both phenomena. The proposed models are applied to bivariate repeated prevalence data for hepatitis C virus (HCV) and human immunodeficiency virus (HIV) infection in injecting drug users in Italy from 1998 to 2007.
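The distinction between additive and multiplicative overdispersion effects can be illustrated with a small simulation. The sketch below is not the authors' model: it assumes a logit link, a normal cluster-level random intercept for the within-cluster association, a normal observation-level effect on the logit scale for the additive case, and a beta-distributed multiplier on the probability scale for the multiplicative case. All function names, parameter names and default values are hypothetical, and it handles a single outcome, whereas the bivariate HCV/HIV setting would need a further set of correlated random effects.

```python
import math
import random


def expit(x):
    """Inverse logit: maps a linear predictor to a probability."""
    return 1.0 / (1.0 + math.exp(-x))


def success_prob(eta, b, rng, kind, sigma_eps=0.5, alpha=8.0, beta=2.0):
    """Success probability for one observation (illustrative only).

    eta  : fixed-effect linear predictor (assumed logit scale)
    b    : cluster-level random intercept (captures clustering)
    kind : "additive" or "multiplicative" overdispersion effect
    """
    if kind == "additive":
        # Extra normal effect added on the logit scale.
        eps = rng.gauss(0.0, sigma_eps)
        return expit(eta + b + eps)
    if kind == "multiplicative":
        # Beta-distributed effect multiplying the probability itself.
        theta = rng.betavariate(alpha, beta)
        return theta * expit(eta + b)
    raise ValueError("kind must be 'additive' or 'multiplicative'")


def simulate(n_clusters, n_times, n_trials, eta, sigma_b, kind, seed=0):
    """Simulate repeated binomial counts within clusters.

    Returns a list of tuples (cluster, time, successes, trials, prob).
    """
    rng = random.Random(seed)
    rows = []
    for i in range(n_clusters):
        b = rng.gauss(0.0, sigma_b)  # shared by all observations in cluster i
        for t in range(n_times):
            p = success_prob(eta, b, rng, kind)
            # Binomial draw as a sum of independent Bernoulli trials.
            y = sum(rng.random() < p for _ in range(n_trials))
            rows.append((i, t, y, n_trials, p))
    return rows


if __name__ == "__main__":
    for kind in ("additive", "multiplicative"):
        data = simulate(n_clusters=20, n_times=10, n_trials=50,
                        eta=0.0, sigma_b=0.7, kind=kind)
        props = [y / n for (_, _, y, n, _) in data]
        mean = sum(props) / len(props)
        var = sum((q - mean) ** 2 for q in props) / (len(props) - 1)
        print(kind, "mean proportion:", round(mean, 3),
              "variance:", round(var, 4))
```

Fitting such a structure in a Bayesian framework would additionally require priors on the fixed effects and variance components and an MCMC sampler; the simulation only shows where each set of random effects enters the success probability.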

MSC:

62-XX Statistics

Software:

CODA; JAGS; boa

References:

[1] Agresti, A (2002) Categorical data analysis. 2nd edition. New York: John Wiley & Sons. · Zbl 1018.62002 · doi:10.1002/0471249688
[2] Aitkin, M, Liu, CC, Chadwick, T (2009) Bayesian model comparison and model averaging for small-area estimation. Annals of Applied Statistics, 3, 199-221. · Zbl 1160.62021 · doi:10.1214/08-AOAS205
[3] Aitkin, M (2010) Statistical inference. An integrated Bayesian/likelihood approach. Boca Raton, FL: Chapman & Hall. · Zbl 1267.62040 · doi:10.1201/EBK1420093438
[4] Booth, JG, Casella, G, Friedl, H, Hobert, JP (2003) Negative binomial loglinear mixed models. Statistical Modelling, 3, 179-91. · Zbl 1070.62058
[5] Breslow, NE (1984) Extra-Poisson variation in log-linear models. Applied Statistics, 33, 38-44. · doi:10.2307/2347661
[6] Breslow, NE, Clayton, DG (1993) Approximate inference in generalized linear mixed models. Journal of the American Statistical Association, 88, 9-25. · Zbl 0775.62195
[7] Carey, V, Zeger, SL, Diggle, P (1993) Modelling multivariate binary data with alternating logistic regression. Biometrika, 80, 517-26. · Zbl 0800.62446 · doi:10.1093/biomet/80.3.517
[8] Celeux, G, Forbes, F, Robert, CP, Titterington, DM (2006) Deviance information criteria for missing data models. Bayesian Analysis, 1, 651-74. · Zbl 1331.62329 · doi:10.1214/06-BA122
[9] Chen, JJ, Ahn, H (1997) Marginal models with multiplicative variance components for overdispersed binomial data. Journal of Agricultural, Biological, and Environmental Statistics, 2, 440-50. · doi:10.2307/1400513
[10] Clayton, D (1996) Generalized linear mixed models. In Gilks, WR, Richardson, S, Spiegelhalter, DJ (eds), Markov Chain Monte Carlo in practice. London: Chapman & Hall. · Zbl 0841.62059
[11] Coutinho, RA (1998) HIV and hepatitis C among injecting drug users. BMJ (Clinical research ed.), 317(7156), 424-25. · doi:10.1136/bmj.317.7156.424
[12] Del Fava, E, Kasim, A, Usman, M, Shkedy, Z, Hens, N, Aerts, M, Bollaerts, K, Scalia Tomba, G, Vickerman, P, Sutton, AJ, Wiessing, L, Kretzschmar, M (2011) Joint modeling of HCV and HIV infections among injecting drug users in Italy using repeated cross-sectional prevalence data. Statistical Communications in Infectious Diseases, 3, 1-24.
[13] Gelman, A, Rubin, DB (1992) Inference from iterative simulation using multiple sequences. Statistical Science, 7, 457-511. · Zbl 1386.65060 · doi:10.1214/ss/1177011136
[14] Gelman, A, Meng, XL, Stern, H (1996) Posterior predictive assessment of model fitness via realized discrepancies. Statistica Sinica, 6, 733-807. · Zbl 0859.62028
[15] Gelman, A (2006) Prior distributions for variance parameters in hierarchical models. Bayesian Analysis, 1, 515-33. · Zbl 1331.62139 · doi:10.1214/06-BA117A
[16] Geweke, J (1992) Evaluating the accuracy of sampling-based approaches to calculating posterior moments. In Bernardo, JM, Berger, JO, Dawid, AP, Smith, AFM (eds), Bayesian Statistics 4. Oxford, UK: Clarendon Press.
[17] Hinde, J, Demétrio, CGB (1998) Overdispersion: models and estimation. Computational Statistics and Data Analysis, 27, 151-70. · Zbl 1042.62578 · doi:10.1016/S0167-9473(98)00007-3
[18] Lesaffre, E, Lawson, AB (2012) Bayesian biostatistics. Hoboken, NJ: John Wiley & Sons. · Zbl 1282.62057 · doi:10.1002/9781119942412
[19] McLachlan, GJ (1997) On the EM algorithm for overdispersed count data. Statistical Methods in Medical Research, 6, 76-98.
[20] Molenberghs, G, Verbeke, G (2005) Models for discrete longitudinal data. Berlin: Springer-Verlag. · Zbl 1093.62002
[21] Molenberghs, G, Verbeke, G, Demétrio, CGB (2007) An extended random-effects approach to modeling repeated, overdispersed count data. Lifetime Data Analysis, 13, 513-31. · Zbl 1331.62363 · doi:10.1007/s10985-007-9064-y
[22] Molenberghs, G, Verbeke, G, Demétrio, CGB, Vieira, AMC (2010) A family of generalized linear models for repeated measures with normal and conjugate random effects. Statistical Science, 25, 325-47. · Zbl 1329.62342 · doi:10.1214/10-STS328
[23] Neelon, B, O’Malley, A, Normand, S (2010) A Bayesian model for repeated measures zero-inflated count data with application to outpatient psychiatric service use. Statistical Modelling, 10, 421-39. · Zbl 07256832
[24] O’Malley, AJ, Zaslavsky, AM (2008) Domain-level covariance analysis for multilevel survey data with structured nonresponse. Journal of the American Statistical Association, 103, 1405-18. · Zbl 1286.62096 · doi:10.1198/016214508000000724
[25] Plummer, M (2003) JAGS: a program for analysis of Bayesian graphical models using Gibbs sampling. Proceedings of the 3rd International Workshop on Distributed Statistical Computing (DSC 2003), March 20-22, Vienna, Austria. ISSN 1609-395X.
[26] Plummer, M (2011) JAGS Version 3.3.0 user manual. International Agency for Research on Cancer.
[27] Plummer, M, Best, N, Cowles, K, Vines, K (2006) CODA: convergence diagnosis and output analysis for MCMC. R News, 6, 7-11.
[28] Plummer, M (2008) Penalized loss functions for Bayesian model comparison. Biostatistics, 9, 523-39. · Zbl 1143.62003 · doi:10.1093/biostatistics/kxm049
[29] Smith, BJ (2007) Boa: An R package for MCMC output convergence assessment and posterior inference. Journal of Statistical Software, 21, 1-37. · doi:10.18637/jss.v021.i11
[30] Qaqish, BF, Liang, K-Y (1992) Marginal models for correlated binary responses with multiple classes and multiple levels of nesting. Biometrics, 48, 939-50. · doi:10.2307/2532359
[31] Skellam, JG (1948) A probability distribution derived from the binomial distribution by regarding the probability of success as variable between the sets of trials. Journal of the Royal Statistical Society, Series B, 10, 257-61. · Zbl 0032.41903
[32] Spiegelhalter, DJ, Best, NJ, Carlin, BP, Van der Linde, A (2002) Bayesian measures of model complexity and fit. Journal of the Royal Statistical Society, Series B, 64, 583-640. · Zbl 1067.62010 · doi:10.1111/1467-9868.00353
This reference list is based on information provided by the publisher or from digital mathematics libraries. Its items are heuristically matched to zbMATH identifiers and may contain data conversion errors. In some cases that data have been complemented/enhanced by data from zbMATH Open. This attempts to reflect the references listed in the original paper as accurately as possible without claiming completeness or a perfect matching.