Fast covariance estimation for sparse functional data. (English) Zbl 1384.62142

Summary: Smoothing of noisy sample covariances is an important component in functional data analysis. We propose a novel covariance smoothing method based on penalized splines and associated software. The proposed method is a bivariate spline smoother that is designed for covariance smoothing and can be used for sparse functional or longitudinal data. We propose a fast algorithm for covariance smoothing using leave-one-subject-out cross-validation. Our simulations show that the proposed method compares favorably against several commonly used methods. The method is applied to a study of child growth led by one of the coauthors and to a public dataset of longitudinal CD4 counts.
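
Illustration (not part of the paper): the two ingredients named in the summary, noisy sample covariances from sparse data and leave-one-subject-out cross-validation, can be sketched as follows. This is a minimal sketch under stated assumptions. It swaps in a simple Gaussian kernel smoother for the paper's bivariate penalized-spline smoother, assumes a known zero mean function, and uses simulated data. All function names (`raw_covariances`, `kernel_smooth`, `loso_cv_score`) are hypothetical and do not come from the face package.

```python
import numpy as np

def raw_covariances(times, values, mean_fn):
    """Within-subject off-diagonal products r_ij * r_ik located at (t_ij, t_ik)."""
    subj, s, t, prod = [], [], [], []
    for i, (ti, yi) in enumerate(zip(times, values)):
        ti = np.asarray(ti, dtype=float)
        ri = np.asarray(yi, dtype=float) - mean_fn(ti)   # centered observations
        j, k = np.where(~np.eye(len(ti), dtype=bool))    # all pairs with j != k
        subj.append(np.full(j.size, i))
        s.append(ti[j])
        t.append(ti[k])
        prod.append(ri[j] * ri[k])
    return tuple(np.concatenate(a) for a in (subj, s, t, prod))

def kernel_smooth(s_tr, t_tr, c_tr, s_new, t_new, h):
    """Gaussian-kernel weighted average of raw covariances (stand-in smoother)."""
    d2 = (s_new[:, None] - s_tr[None, :]) ** 2 + (t_new[:, None] - t_tr[None, :]) ** 2
    w = np.exp(-0.5 * d2 / h ** 2)
    return (w @ c_tr) / np.maximum(w.sum(axis=1), 1e-300)

def loso_cv_score(subj, s, t, c, h):
    """Leave-one-subject-out CV: predict each subject's pairs from all other subjects."""
    err = 0.0
    for i in np.unique(subj):
        held = subj == i
        pred = kernel_smooth(s[~held], t[~held], c[~held], s[held], t[held], h)
        err += np.sum((c[held] - pred) ** 2)
    return err / c.size

# Simulated sparse functional data (assumption: zero mean, two components, noise sd 0.2).
rng = np.random.default_rng(0)
times, values = [], []
for _ in range(40):
    m = rng.integers(3, 7)                       # 3 to 6 observations per subject
    ti = rng.uniform(0.0, 1.0, size=m)
    xi = (rng.normal() * np.sqrt(2) * np.sin(2 * np.pi * ti)
          + 0.5 * rng.normal() * np.sqrt(2) * np.cos(2 * np.pi * ti))
    times.append(ti)
    values.append(xi + 0.2 * rng.normal(size=m))

subj, s_obs, t_obs, c_obs = raw_covariances(times, values, lambda u: np.zeros_like(u))

# Choose the bandwidth that minimizes the leave-one-subject-out score.
grid = [0.05, 0.1, 0.2, 0.4]
best_h = min(grid, key=lambda h: loso_cv_score(subj, s_obs, t_obs, c_obs, h))
print("selected bandwidth:", best_h)
```

Whole subjects are held out, rather than single observations, because products from the same subject are correlated; leaving out one pair at a time would let the remaining pairs from that subject leak information into the prediction. The naive loop above refits once per subject, which is the cost the paper's fast algorithm is meant to avoid.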


MSC:
62G08 Nonparametric regression and quantile regression
62H25 Factor analysis and principal components; correspondence analysis


Software: face; mgcv; SemiPar; refund

