Joint penalized spline modeling of multivariate longitudinal data, with application to HIV-1 RNA load levels and CD4 cell counts. (English) Zbl 1520.62408

Summary: Motivated by the need to jointly model the longitudinal trajectories of HIV viral load levels and CD4 counts during the primary infection stage, we propose a joint penalized spline modeling approach that can be used to model the repeated measurements from multiple biomarkers of various types (e.g., continuous, binary) simultaneously. This approach allows for flexible trajectories for each marker, accounts for potentially time-varying correlation between markers, and is robust to misspecification of knots. Despite these advantages, the application of multivariate penalized spline models, especially when biomarkers are of different data types, has been limited, in part because of their seeming complexity of implementation. To overcome this, we describe a procedure that transforms the multivariate setting to the univariate one, and then makes use of the generalized linear mixed-effects model representation of a penalized spline model to facilitate its implementation with standard statistical software. We performed simulation studies to evaluate the validity and efficiency of jointly modeling correlated, longitudinally measured biomarkers relative to the univariate modeling approach. We applied this modeling approach to longitudinal HIV-1 RNA load and CD4 count data from Southern African cohorts to estimate features of the joint distributions, such as the correlation and the proportion of subjects with high viral load levels and high CD4 cell counts over time.
{© 2020 The International Biometric Society}
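
The stacking idea mentioned in the summary can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes two continuous markers observed at common times, a truncated-line spline basis, and a fixed smoothing parameter `lam`; in the mixed-model representation the penalty would instead correspond to a variance ratio estimated by the software. Subject-level random effects, cross-marker correlation, and non-Gaussian outcomes are omitted, and all function names and the simulated data are hypothetical.

```python
import numpy as np

def truncated_line_basis(t, knots):
    """Unpenalized columns X = [1, t] and penalized columns Z = (t - knot)_+."""
    t = np.asarray(t, dtype=float)
    X = np.column_stack([np.ones_like(t), t])
    Z = np.maximum(t[:, None] - np.asarray(knots, dtype=float)[None, :], 0.0)
    return X, Z

def stack_markers(t, y1, y2, knots):
    """Stack two markers into one response with block-diagonal design matrices."""
    X, Z = truncated_line_basis(t, knots)
    n, p, q = len(t), X.shape[1], Z.shape[1]
    y = np.concatenate([y1, y2])
    zeros_X, zeros_Z = np.zeros((n, p)), np.zeros((n, q))
    X_stacked = np.block([[X, zeros_X], [zeros_X, X]])
    Z_stacked = np.block([[Z, zeros_Z], [zeros_Z, Z]])
    return y, X_stacked, Z_stacked

def fit_penalized(y, X, Z, lam):
    """Penalized least squares with a ridge penalty on the spline coefficients only."""
    C = np.hstack([X, Z])
    D = np.diag(np.r_[np.zeros(X.shape[1]), np.ones(Z.shape[1])])
    coef = np.linalg.solve(C.T @ C + lam * D, C.T @ y)
    return C @ coef, coef

# Simulated data for illustration only (not the HIV cohort data).
rng = np.random.default_rng(0)
t = np.linspace(0.0, 1.0, 50)
knots = np.linspace(0.1, 0.9, 9)
y1 = np.sin(2 * np.pi * t) + rng.normal(scale=0.1, size=t.size)
y2 = np.cos(2 * np.pi * t) + rng.normal(scale=0.1, size=t.size)

y, X_stacked, Z_stacked = stack_markers(t, y1, y2, knots)
fitted_rough, _ = fit_penalized(y, X_stacked, Z_stacked, lam=1e-6)
fitted_smooth, _ = fit_penalized(y, X_stacked, Z_stacked, lam=10.0)
rss_rough = float(np.sum((y - fitted_rough) ** 2))
rss_smooth = float(np.sum((y - fitted_smooth) ** 2))
```

Once the markers are stacked this way, a single univariate fitting routine handles both trajectories. A mixed-model fit would additionally place correlated random effects across the two blocks to capture the between-marker dependence.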


62P10 Applications of statistics to biology and medical sciences; meta analysis


mcglm; brms; MIXED; R; MCMCglmm; SemiPar

