
Penalized regression, mixed effects models and appropriate modelling. (English) Zbl 1327.62256

Summary: Linear mixed effects methods for the analysis of longitudinal data provide a convenient framework for modelling within-individual correlation across time. Using spline functions allows for flexible modelling of the response as a smooth function of time. A computational connection between linear mixed effects modelling and spline smoothing has resulted in a cross-fertilization of these two fields. The connection has popularized the use of spline functions in longitudinal data analysis and the use of mixed effects software in smoothing analyses. However, care must be taken in exploiting this connection, as resulting estimates of the underlying population mean might not track the data well and associated standard errors might not reflect the true variability in the data. We discuss these shortcomings and suggest some easy-to-compute methods to eliminate them.
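The computational connection mentioned in the summary can be illustrated with a minimal sketch. It is not the authors' code. It assumes a truncated-line spline basis, a single simulated curve, and fixed, made-up variance components (`sigma2_u`, `sigma2_e`); all function names are hypothetical. Under those assumptions, the penalized least-squares fit with smoothing parameter `lam = sigma2_e / sigma2_u` should agree with the mixed model fit (generalized least squares for the fixed effects plus best linear unbiased prediction of the spline coefficients).

```python
import numpy as np

def truncated_line_basis(t, knots):
    """Return fixed-effects design X = [1, t] and spline design Z = (t - knot)_+."""
    X = np.column_stack([np.ones_like(t), t])
    Z = np.maximum(t[:, None] - knots[None, :], 0.0)
    return X, Z

def penalized_spline_fit(y, X, Z, lam):
    """Minimize ||y - X b - Z u||^2 + lam * ||u||^2 (ridge penalty on u only)."""
    C = np.hstack([X, Z])
    D = np.diag(np.r_[np.zeros(X.shape[1]), np.ones(Z.shape[1])])
    theta = np.linalg.solve(C.T @ C + lam * D, C.T @ y)
    return C @ theta

def mixed_model_fit(y, X, Z, sigma2_u, sigma2_e):
    """Fit y = X b + Z u + e with u ~ N(0, sigma2_u I), e ~ N(0, sigma2_e I).

    The variance components are treated as known (an assumption of this
    sketch); in practice they would be estimated, for example by REML.
    """
    V = sigma2_u * Z @ Z.T + sigma2_e * np.eye(len(y))
    Vinv_X = np.linalg.solve(V, X)
    Vinv_y = np.linalg.solve(V, y)
    beta = np.linalg.solve(X.T @ Vinv_X, X.T @ Vinv_y)        # GLS estimate
    u = sigma2_u * Z.T @ np.linalg.solve(V, y - X @ beta)     # BLUP of u
    return X @ beta + Z @ u

# Simulated data; every value here is illustrative only.
rng = np.random.default_rng(0)
t = np.linspace(0.0, 1.0, 60)
y = np.sin(2 * np.pi * t) + rng.normal(scale=0.3, size=t.size)
knots = np.linspace(0.05, 0.95, 10)
X, Z = truncated_line_basis(t, knots)

sigma2_u, sigma2_e = 4.0, 0.09
fit_pen = penalized_spline_fit(y, X, Z, lam=sigma2_e / sigma2_u)
fit_mm = mixed_model_fit(y, X, Z, sigma2_u, sigma2_e)
print(np.max(np.abs(fit_pen - fit_mm)))  # difference should be at rounding level
```

The agreement covers the fitted curve only. The standard errors that mixed effects software reports rest on the assumption that the spline coefficients really are random, which is the kind of mismatch the summary cautions about.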

MSC:

62G08 Nonparametric regression and quantile regression
62J99 Linear inference, regression
