## Estimating negative variance components from Gaussian and non-Gaussian data: a mixed models approach. (English) Zbl 1284.62060

Summary: The occurrence of negative variance components is a reasonably well understood phenomenon in the case of linear models for hierarchical data, such as variance-component models in designed experiments or linear mixed models for longitudinal data. In many cases, such negative variance components can be interpreted as negative within-unit correlations. It is shown that negative variance components, with corresponding negative associations, can occur in hierarchical models for non-Gaussian outcomes as well, such as repeated binary data or counts. While this feature poses no problem for marginal models, in which the mean and correlation functions are modeled directly and separately, the issue is more complicated in, for example, generalized linear mixed models. This is due in part to the non-linearity of the link function, the non-constant residual variance implied by the mean-variance relationship, and the resulting lack of closed-form expressions for the marginal correlations. It is established that such negative variance components in generalized linear mixed models can occur in practice and that they can be estimated using standard statistical software. Marginal-correlation functions are derived. Important implications for interpretation and model choice are discussed. Simulations and the analysis of data from a developmental toxicity experiment underscore these results.
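
As an illustration of the linear case mentioned in the summary, the sketch below uses the textbook balanced random-intercept model, in which the marginal covariance within a cluster of size n is compound-symmetric with variance σ² + d and covariance d. The within-unit correlation is then ρ = d / (σ² + d), and a negative d still gives a valid marginal model as long as σ² + n·d > 0. This is not the paper's estimation procedure for generalized linear mixed models: it is a method-of-moments (one-way ANOVA) estimator, and the function names and the toy data are assumptions introduced here for illustration only.

```python
from statistics import mean


def anova_variance_components(clusters):
    """One-way ANOVA (method-of-moments) estimates for balanced clusters.

    Returns (sigma2_hat, d_hat, rho_hat). The between-cluster component
    d_hat is deliberately NOT truncated at zero, so it may be negative.
    """
    k = len(clusters)
    n = len(clusters[0])
    if k < 2 or n < 2 or any(len(c) != n for c in clusters):
        raise ValueError("need at least 2 balanced clusters of size >= 2")

    cluster_means = [mean(c) for c in clusters]
    grand_mean = mean(cluster_means)  # equals the overall mean when balanced

    # Between- and within-cluster mean squares.
    msb = n * sum((m - grand_mean) ** 2 for m in cluster_means) / (k - 1)
    msw = sum(
        (y - m) ** 2 for c, m in zip(clusters, cluster_means) for y in c
    ) / (k * (n - 1))

    sigma2_hat = msw
    d_hat = (msb - msw) / n  # negative whenever MSB < MSW
    total = sigma2_hat + d_hat
    if total <= 0:
        raise ValueError("estimated marginal variance is not positive")
    rho_hat = d_hat / total  # implied within-unit correlation
    return sigma2_hat, d_hat, rho_hat


def is_valid_marginal_model(sigma2, d, n):
    """Check positive definiteness of sigma2 * I + d * J for cluster size n.

    The eigenvalues are sigma2 (multiplicity n - 1) and sigma2 + n * d.
    """
    return sigma2 > 0 and sigma2 + n * d > 0


# Hypothetical toy data: members of a cluster tend to move in opposite
# directions, so clusters differ less than independence would predict.
toy_clusters = [[1.0, -1.0], [-0.5, 1.5], [0.5, -1.5]]
sigma2_hat, d_hat, rho_hat = anova_variance_components(toy_clusters)
print(sigma2_hat, d_hat, rho_hat)
print(is_valid_marginal_model(sigma2_hat, d_hat, n=2))
```

For the toy data the between-cluster mean square is smaller than the within-cluster mean square, so the estimated variance component is negative while the implied marginal covariance matrix remains positive definite. Software that constrains variance components to be non-negative would report zero here instead.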

### MSC:

- 62-07 Data analysis (statistics) (MSC2010)
- 62H20 Measures of association (correlation, canonical correlation, etc.)
- 62J12 Generalized linear models (logistic models)
