zbMATH — the first resource for mathematics

A family of generalized linear models for repeated measures with normal and conjugate random effects. (English) Zbl 1329.62342
Summary: Non-Gaussian outcomes are often modeled using members of the so-called exponential family. Well-known members are the Bernoulli model for binary data, leading to logistic regression, and the Poisson model for count data, leading to Poisson regression. Two of the main reasons for extending this family are (1) the occurrence of overdispersion, meaning that the variability in the data is not adequately described by the models, which often exhibit a prescribed mean-variance link, and (2) the need to accommodate hierarchical structure in the data, stemming from clustering which, in turn, may result from repeatedly measuring the outcome, from observing various members of the same family, etc. The first issue is dealt with through a variety of overdispersion models, such as the beta-binomial model for grouped binary data and the negative-binomial model for counts. Clustering is often accommodated through the inclusion of random subject-specific effects. Conventionally, though not always, such random effects are assumed to be normally distributed. While both of these phenomena may occur simultaneously, models combining them are uncommon. This paper proposes a broad class of generalized linear models accommodating overdispersion and clustering through two separate sets of random effects. We place particular emphasis on so-called conjugate random effects at the level of the mean for the first aspect and normal random effects embedded within the linear predictor for the second aspect, even though our family is more general. The binary, count and time-to-event cases are given particular emphasis. Apart from model formulation, we present an overview of estimation methods, and then settle for maximum likelihood estimation with analytic-numerical integration. Implications for the derivation of marginal correlation functions are discussed.
The methodology is applied to data from a study of epileptic seizures, a clinical trial on the toenail infection onychomycosis, and survival data from children with asthma.
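
As an illustration of the count case described in the summary, the following is a minimal simulation sketch, not the authors' implementation. It assumes a Poisson outcome whose mean is the product of a gamma-distributed (conjugate) random effect with mean 1 and the exponential of a linear predictor containing a normal random intercept. All function names, parameter names and parameter values (`simulate_counts`, `beta0`, `sd_b`, `alpha`) are hypothetical choices made for this example only.

```python
import math
import random


def poisson_draw(rng, lam):
    """Draw one Poisson(lam) value with Knuth's multiplication method.

    Adequate for the small means used in this sketch.
    """
    threshold = math.exp(-lam)
    k, p = 0, rng.random()
    while p > threshold:
        k += 1
        p *= rng.random()
    return k


def simulate_counts(n_subjects=2000, n_visits=4, beta0=1.0,
                    sd_b=0.5, alpha=2.0, seed=1):
    """Simulate repeated counts with two sets of random effects (assumed setup).

    b_i   ~ Normal(0, sd_b^2): subject-level effect in the linear predictor
            (induces clustering).
    theta ~ Gamma(shape=alpha, scale=1/alpha): mean-1 effect at the level of
            the mean, drawn per observation (induces overdispersion).
    """
    rng = random.Random(seed)
    data = []
    for _ in range(n_subjects):
        b_i = rng.gauss(0.0, sd_b)
        row = []
        for _ in range(n_visits):
            theta = rng.gammavariate(alpha, 1.0 / alpha)
            lam = theta * math.exp(beta0 + b_i)
            row.append(poisson_draw(rng, lam))
        data.append(row)
    return data


def mean_var(xs):
    """Return the sample mean and the unbiased sample variance."""
    m = sum(xs) / len(xs)
    v = sum((x - m) ** 2 for x in xs) / (len(xs) - 1)
    return m, v


def pearson(xs, ys):
    """Return the sample Pearson correlation of two equal-length lists."""
    mx, vx = mean_var(xs)
    my, vy = mean_var(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / (len(xs) - 1)
    return cov / math.sqrt(vx * vy)


if __name__ == "__main__":
    counts = simulate_counts()
    visit1 = [row[0] for row in counts]
    visit2 = [row[1] for row in counts]
    m, v = mean_var(visit1)
    print("mean:", m, "variance:", v)
    print("correlation between visits 1 and 2:", pearson(visit1, visit2))
```

Under these assumptions the marginal variance of a count should exceed its mean (overdispersion), and counts from the same subject should be positively correlated (clustering), which are the two features the proposed family is designed to capture jointly.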

62J12 Generalized linear models (logistic models)
62P10 Applications of statistics to biology and medical sciences; meta analysis