General design Bayesian generalized linear mixed models. (English) Zbl 1129.62063

Summary: Linear mixed models are able to handle an extraordinary range of complications in regression-type analyses. Their most common use is to account for within-subject correlation in longitudinal data analysis. They are also the standard vehicle for smoothing spatial count data. However, when treated in full generality, mixed models can also handle spline-type smoothing and closely approximate kriging. This allows for nonparametric regression models (e.g., additive models and varying coefficient models) to be handled within the mixed model framework. The key is to allow the random effects design matrix to have a general structure; hence our label general design. For continuous response data, particularly when Gaussianity of the response is reasonably assumed, computation is now quite mature and supported by the R, SAS and S-PLUS packages. Such is not the case for binary and count responses, where generalized linear mixed models (GLMMs) are required but are hindered by the presence of intractable multivariate integrals. Software known to us supports special cases of the GLMM (e.g., PROC NLMIXED in SAS or glmmML in R) or relies on the sometimes crude Laplace-type approximation of integrals (e.g., the SAS macro glimmix or glmmPQL in R).
This paper describes the fitting of general design generalized linear mixed models. A Bayesian approach is taken and Markov chain Monte Carlo (MCMC) is used for estimation and inference. In this generalized setting, MCMC requires sampling from nonstandard distributions. We demonstrate that the MCMC package WinBUGS facilitates sound fitting of general design Bayesian generalized linear mixed models in practice.
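
As a rough illustration of what "general design" means in the summary above, the following Python sketch builds a random effects design matrix Z from truncated-line spline basis functions and evaluates an unnormalised log posterior for a Bernoulli GLMM with logit link. It is not the authors' code, and the paper itself uses WinBUGS. The function names, the quantile-based knot placement, and all prior choices are assumptions made here for illustration only.

```python
import numpy as np


def spline_design(x, num_knots=5):
    """Build X = [1, x] and a general-design Z of truncated lines (x - knot)_+.

    Knots sit at equally spaced interior sample quantiles of x
    (an illustrative choice, not taken from the paper).
    """
    x = np.asarray(x, dtype=float)
    probs = np.linspace(0.0, 1.0, num_knots + 2)[1:-1]
    knots = np.quantile(x, probs)
    X = np.column_stack([np.ones_like(x), x])
    Z = np.maximum(x[:, None] - knots[None, :], 0.0)
    return X, Z, knots


def log_posterior(beta, u, log_sigma_u, y, X, Z, prior_sd_beta=100.0):
    """Unnormalised log posterior of a Bernoulli GLMM with logit link.

    Model: y_i ~ Bernoulli(expit(eta_i)), eta = X beta + Z u,
           u ~ N(0, sigma_u^2 I).
    Hypothetical priors: beta ~ N(0, prior_sd_beta^2 I) and
    log(sigma_u) ~ N(0, 1).
    """
    eta = X @ beta + Z @ u
    # Bernoulli log-likelihood, using logaddexp for numerical stability.
    loglik = np.sum(y * eta - np.logaddexp(0.0, eta))
    sigma_u = np.exp(log_sigma_u)
    # Gaussian random effects density, dropping additive constants.
    log_prior_u = -0.5 * np.sum(u ** 2) / sigma_u ** 2 - u.size * log_sigma_u
    log_prior_beta = -0.5 * np.sum(beta ** 2) / prior_sd_beta ** 2
    log_prior_sigma = -0.5 * log_sigma_u ** 2
    return loglik + log_prior_u + log_prior_beta + log_prior_sigma
```

An MCMC sampler would draw (beta, u, log_sigma_u) from the distribution defined by `log_posterior`. Because the Bernoulli likelihood is not conjugate to the Gaussian random effects, the full conditionals are nonstandard, which is the difficulty the summary refers to.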


62J12 Generalized linear models (logistic models)
62F15 Bayesian inference
62G08 Nonparametric regression and quantile regression
65C40 Numerical analysis or methods applied to Markov chains
Full Text: DOI arXiv Euclid

