# zbMATH — the first resource for mathematics

A family of generalized linear models for repeated measures with normal and conjugate random effects. (English) Zbl 1329.62342
Summary: Non-Gaussian outcomes are often modeled using members of the so-called exponential family. Well-known members are the Bernoulli model for binary data, leading to logistic regression, and the Poisson model for count data, leading to Poisson regression. Two of the main reasons for extending this family are (1) the occurrence of overdispersion, meaning that the variability in the data is not adequately described by the models, which often exhibit a prescribed mean-variance link, and (2) the accommodation of hierarchical structure in the data, stemming from clustering, which in turn may result from measuring the outcome repeatedly, on several members of the same family, and so on. The first issue is dealt with through a variety of overdispersion models, such as the beta-binomial model for grouped binary data and the negative-binomial model for counts. Clustering is often accommodated through the inclusion of random subject-specific effects. Conventionally, though not always, such random effects are assumed to be normally distributed. While both of these phenomena may occur simultaneously, models combining them are uncommon. This paper proposes a broad class of generalized linear models accommodating overdispersion and clustering through two separate sets of random effects. We place particular emphasis on so-called conjugate random effects at the level of the mean for the first aspect and normal random effects embedded within the linear predictor for the second aspect, even though our family is more general. The binary, count and time-to-event cases are given particular emphasis. Apart from model formulation, we present an overview of estimation methods, and then settle for maximum likelihood estimation with analytic-numerical integration. Implications for the derivation of marginal correlation functions are discussed.
The methodology is applied to data from a study of epileptic seizures, a clinical trial in onychomycosis (a toenail infection), and survival data on children with asthma.
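
As a rough illustration of the count case described in the summary, the sketch below simulates repeated counts that are both clustered and overdispersed. It is not the authors' code, and the summary itself gives no formulas. The following are assumptions made only for this example: a Poisson outcome with mean `theta * exp(intercept + b)`, a gamma-distributed multiplicative effect `theta` with mean 1 (the conjugate effect at the level of the mean), a subject-level normal effect `b` inside the linear predictor, and all parameter values and function names (`poisson_draw`, `simulate_counts`).

```python
import math
import random
from statistics import mean, pvariance


def poisson_draw(lam, rng):
    """Draw one Poisson(lam) variate with Knuth's multiplication method.

    Adequate for the moderate means used here; not meant for very large lam.
    """
    threshold = math.exp(-lam)
    k, prod = 0, rng.random()
    while prod > threshold:
        k += 1
        prod *= rng.random()
    return k


def simulate_counts(n_subjects=500, n_visits=4, intercept=math.log(3.0),
                    sd_normal=0.5, gamma_shape=2.0, seed=1):
    """Simulate counts with two separate sets of random effects (assumed setup).

    b     ~ Normal(0, sd_normal^2), one per subject: induces clustering.
    theta ~ Gamma(shape, scale=1/shape), one per observation, mean 1:
            induces overdispersion beyond the Poisson variance.
    """
    rng = random.Random(seed)
    data = []
    for _ in range(n_subjects):
        b = rng.gauss(0.0, sd_normal)  # shared by all visits of this subject
        row = []
        for _ in range(n_visits):
            theta = rng.gammavariate(gamma_shape, 1.0 / gamma_shape)
            lam = theta * math.exp(intercept + b)
            row.append(poisson_draw(lam, rng))
        data.append(row)
    return data


counts = simulate_counts()
flat = [y for row in counts for y in row]
# A plain Poisson model would give a variance-to-mean ratio close to 1.
dispersion = pvariance(flat) / mean(flat)
print(f"mean = {mean(flat):.2f}, variance/mean = {dispersion:.2f}")
```

Setting `sd_normal=0` in this sketch leaves only the gamma effect, which corresponds to a negative-binomial-type overdispersion model, while a very large `gamma_shape` makes `theta` nearly constant at 1 and leaves only the normal random effect.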

##### MSC:
- 62J12 Generalized linear models (logistic models)
- 62P10 Applications of statistics to biology and medical sciences; meta analysis