Selecting mixed-effects models based on a generalized information criterion. (English) Zbl 1085.62083

Summary: The generalized information criterion (GIC) proposed by C. R. Rao and Y. Wu [A strongly consistent procedure for model selection in a regression problem. Biometrika 76, 369-374 (1989; Zbl 0669.62051)] is a generalization of Akaike’s information criterion (AIC) and the Bayesian information criterion (BIC). We extend the GIC to select linear mixed-effects models that are widely applied in analyzing longitudinal data. A procedure for selecting fixed effects and random effects based on the extended GIC is provided. The asymptotic behavior of the extended GIC method for selecting fixed effects is studied. We prove that, under mild conditions, the selection procedure is asymptotically loss efficient regardless of the existence of a true model and consistent if a true model exists. A simulation study is carried out to empirically evaluate the performance of the extended GIC procedure. The results from the simulation show that if the signal-to-noise ratio is moderate or high, the percentages of choosing the correct fixed effects by the GIC procedure are close to one for finite samples, while the procedure performs relatively poorly when it is used to select random effects.
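As a rough illustration of how one penalized criterion can contain both AIC and BIC, the sketch below works in the simplest special case: an ordinary linear model with no random effects. The criterion form `n*log(RSS/n) + lam*k`, the function names `gic` and `select_columns`, and the simulated data are all assumptions made for illustration. They are not the extended GIC or the selection procedure of the paper. Setting `lam = 2` gives an AIC-type penalty and `lam = log(n)` gives a BIC-type penalty.

```python
import itertools
import numpy as np


def gic(rss, n, k, lam):
    """Generic penalized criterion n*log(rss/n) + lam*k (assumed form, not the paper's)."""
    return n * np.log(rss / n) + lam * k


def select_columns(X, y, lam):
    """Search all non-empty column subsets and return the one minimizing the criterion."""
    n, p = X.shape
    best_subset, best_score = None, np.inf
    for size in range(1, p + 1):
        for subset in itertools.combinations(range(p), size):
            Xs = X[:, list(subset)]
            # Ordinary least squares fit on the candidate columns.
            beta, *_ = np.linalg.lstsq(Xs, y, rcond=None)
            rss = float(np.sum((y - Xs @ beta) ** 2))
            score = gic(rss, n, size, lam)
            if score < best_score:
                best_subset, best_score = subset, score
    return best_subset, best_score


# Simulated data (hypothetical): only columns 0 and 1 carry signal.
rng = np.random.default_rng(0)
n = 200
X = rng.normal(size=(n, 4))
y = 2.0 * X[:, 0] - 1.5 * X[:, 1] + rng.normal(size=n)

aic_choice, _ = select_columns(X, y, lam=2.0)        # AIC-type penalty
bic_choice, _ = select_columns(X, y, lam=np.log(n))  # BIC-type penalty
print("AIC-type choice:", aic_choice)
print("BIC-type choice:", bic_choice)
```

Because the two runs differ only in `lam`, the heavier BIC-type penalty can never select more columns than the AIC-type penalty on the same data. An exhaustive search like this is only practical for a handful of candidate columns.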

MSC:

62J05 Linear regression; mixed models
62F12 Asymptotic properties of parametric estimators
62B10 Statistical aspects of information-theoretic topics
62F07 Statistical ranking and selection procedures
62F05 Asymptotic properties of parametric tests
62H12 Estimation in multivariate analysis

Citations:

Zbl 0669.62051

References:

[1] Akaike, H., Statistical predictor identification, Ann. Inst. Statist. Math., 22, 203-217 (1970) · Zbl 0259.62076
[2] Akaike, H., Information theory and an extension of the maximum likelihood principle, (Petrov, B. N.; Csáki, F., Second International Symposium on Information Theory (1973), Akadémiai Kiadó: Akadémiai Kiadó Budapest), 267-281 · Zbl 0283.62006
[3] Allen, D. M., The relationship between variable selection and data augmentation and a method for prediction, Technometrics, 16, 125-127 (1974) · Zbl 0286.62044
[4] Burman, P., A comparative study of ordinary cross-validation, v-fold cross-validation and the repeated learning-testing methods, Biometrika, 76, 503-514 (1989) · Zbl 0677.62065
[5] Craven, P.; Wahba, G., Smoothing noisy data with spline functions: Estimating the correct degree of smoothing by the method of generalized cross-validation, Numer. Math., 31, 377-403 (1979) · Zbl 0377.65007
[6] Davidian, M.; Giltinan, D. M., Nonlinear Models for Repeated Measurement Data (1995), Chapman & Hall/CRC: Chapman & Hall/CRC New York
[7] Dempster, A. P.; Laird, N. M.; Rubin, D. B., Maximum likelihood from incomplete data via the EM algorithm (with discussion), J. Roy. Statist. Soc. Ser. B, 39, 1-38 (1977) · Zbl 0364.62022
[8] Fuller, W. A.; Battese, G. E., Transformations for estimation of linear models with nested-error structure, J. Amer. Statist. Assoc., 68, 626-632 (1973) · Zbl 0271.62087
[9] Geisser, S., The predictive sample reuse method with applications, J. Amer. Statist. Assoc., 70, 320-328 (1975) · Zbl 0321.62077
[10] Goldstein, H., The Design and Analysis of Longitudinal Studies (1979), Academic Press: Academic Press London
[11] Grizzle, J. E.; Allen, D. M., Analysis of growth and dose response curves, Biometrics, 25, 357-382 (1969)
[12] Hannan, E. J.; Quinn, B. G., The determination of the order of an autoregression, J. Roy. Statist. Soc. Ser. B, 41, 190-195 (1979) · Zbl 0408.62076
[13] Harville, D. A., Bayesian inference for variance components using only error contrasts, Biometrika, 61, 383-385 (1974) · Zbl 0281.62072
[14] Harville, D. A., Extensions of the Gauss-Markov theorem to include the estimation of random effects, Ann. Statist., 4, 384-395 (1976) · Zbl 0323.62043
[15] Harville, D. A., Maximum likelihood approaches to variance component estimation and to related problems, J. Amer. Statist. Assoc., 72, 320-340 (1977) · Zbl 0373.62040
[16] Laird, N. M.; Ware, J. H., Random effects models for longitudinal data, Biometrics, 38, 963-974 (1982) · Zbl 0512.62107
[17] Mallows, C. L., Some comments on \(C_p\), Technometrics, 15, 661-675 (1973) · Zbl 0269.62061
[18] Matsuba, I., Generalized information criterion for linear and nonlinear processes, Internat. J. Bifurcation Chaos, 12, 389-395 (2002)
[19] Nishii, R., Asymptotic properties of criteria for selection of variables in multiple regression, Ann. Statist., 12, 758-765 (1984) · Zbl 0544.62063
[20] Pötscher, B. M., Model selection under nonstationarity: autoregressive models and stochastic linear regression models, Ann. Statist., 17, 1257-1274 (1989) · Zbl 0683.62049
[21] Pötscher, B. M., Effects of model selection on inference, Econometr. Theory, 7, 163-185 (1991)
[22] Potthoff, R. F.; Roy, S. N., A generalized multivariate analysis of variance model useful especially for growth curve problems, Biometrika, 51, 313-326 (1964) · Zbl 0138.14306
[23] Rao, C. R., Some problems involving linear hypotheses in multivariate analysis, Biometrika, 46, 49-58 (1959) · Zbl 0108.15405
[24] Rao, C. R., The theory of least squares when the parameters are stochastic and its application to the analysis of growth curves, Biometrika, 52, 447-458 (1965) · Zbl 0203.21501
[25] Rao, C. R.; Wu, Y., A strongly consistent procedure for model selection in a regression problem, Biometrika, 76, 369-374 (1989) · Zbl 0669.62051
[26] Rissanen, J., Stochastic complexity and modelling, Ann. Statist., 14, 1080-1100 (1986) · Zbl 0602.62008
[27] Schmidt, P., Econometrics (1976), Marcel Dekker: Marcel Dekker New York · Zbl 0353.62069
[28] Schwarz, G., Estimating the dimension of a model, Ann. Statist., 6, 461-464 (1978)
[29] Self, S. G.; Liang, K-Y., Asymptotic properties of maximum likelihood estimators and likelihood ratio tests under nonstandard conditions, J. Amer. Statist. Assoc., 82, 605-610 (1987) · Zbl 0639.62020
[30] Shao, J., Linear model selection by cross-validation, J. Amer. Statist. Assoc., 88, 486-494 (1993) · Zbl 0773.62051
[31] Shao, J., An asymptotic theory for linear model selection, Statist. Sinica, 7, 221-242 (1997)
[32] Shibata, R., Approximate efficiency of a selection procedure for the number of regression variables, Biometrika, 71, 43-49 (1984) · Zbl 0543.62053
[33] Stone, M., Cross-validatory choice and assessment of statistical predictions, J. Roy. Statist. Soc. Ser. B, 36, 111-147 (1974) · Zbl 0308.62063
[34] Stram, D. O.; Lee, J. W., Variance components testing in the longitudinal mixed effects model, Biometrics, 50, 1171-1177 (1994) · Zbl 0826.62054
[35] Wei, C. Z., On predictive least squares principles, Ann. Statist., 20, 1-42 (1992) · Zbl 0801.62083
[36] Whittle, P., Bounds for the moments of linear and quadratic forms in independent variables, Theory Probab. Appl., 5, 302-305 (1960)
[37] Zhang, P., Model selection via multifold cross validation, Ann. Statist., 21, 299-313 (1993) · Zbl 0770.62053
[38] Zhang, S.; Niu, X-F.; Ang, J., Building tracking portfolios based on a generalized information criterion, Statist. Sinica, 13, 1075-1096 (2003) · Zbl 1034.62108