Principal components analysis for sparsely observed correlated functional data using a kernel smoothing approach. (English) Zbl 1274.62412

Summary: We consider the problem of functional principal component analysis for correlated functional data. In particular, we focus on a separable covariance structure and consider irregularly and possibly sparsely observed sample trajectories. By observing that under the sparse measurements setting, the empirical covariance of pre-smoothed sample trajectories is a highly biased estimator along the diagonal, we propose to modify the empirical covariance by estimating the diagonal and off-diagonal parts of the covariance kernel separately. We prove that under a separable covariance structure, this method can consistently estimate the eigenfunctions of the covariance kernel. We also quantify the role of the correlation in the \(L^{2}\) risk of the estimator, and show that under a weak correlation regime, the risk achieves the optimal nonparametric rate when the number of measurements per curve is bounded.
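
The idea of treating the diagonal and off-diagonal parts of the covariance kernel separately can be illustrated with a small simulation. The following is a minimal sketch under stated assumptions, not the estimator analysed in the paper: it uses a pooled Nadaraya-Watson smoother with a Gaussian kernel, assumes mean-zero curves, and simulates independent curves (the simplest special case of a separable structure). All function names, the bandwidth `h`, and the simulation settings are hypothetical choices made for illustration.

```python
import numpy as np

def gauss(u):
    # Unnormalised Gaussian kernel; constants cancel in the ratio estimators below.
    return np.exp(-0.5 * u**2)

def offdiag_cov(times, values, grid, h):
    """Smooth cross-products Y_ij * Y_ik (j != k) to estimate C(s, t)."""
    num = np.zeros((grid.size, grid.size))
    den = np.zeros_like(num)
    for t, y in zip(times, values):
        # All ordered pairs of distinct measurements within one curve.
        j, k = np.where(~np.eye(t.size, dtype=bool))
        ws = gauss((grid[:, None] - t[j][None, :]) / h)
        wt = gauss((grid[:, None] - t[k][None, :]) / h)
        num += (ws * (y[j] * y[k])) @ wt.T
        den += ws @ wt.T
    return num / den

def diag_cov(times, values, grid, h):
    """Smooth squared measurements; the target is C(t, t) + sigma^2."""
    t_all = np.concatenate(times)
    y2 = np.concatenate(values) ** 2
    w = gauss((grid[:, None] - t_all[None, :]) / h)
    return (w @ y2) / w.sum(axis=1)

# Simulated sparse data: two eigenfunctions, 3 to 6 noisy measurements per curve.
rng = np.random.default_rng(0)
lam = np.array([1.0, 0.25])   # assumed eigenvalues
sigma = 0.3                   # assumed noise standard deviation
times, values = [], []
for _ in range(400):
    m = rng.integers(3, 7)
    t = rng.uniform(0.0, 1.0, size=m)
    xi = rng.normal(size=2) * np.sqrt(lam)
    x = xi[0] * np.sqrt(2) * np.sin(2 * np.pi * t) \
        + xi[1] * np.sqrt(2) * np.cos(2 * np.pi * t)
    times.append(t)
    values.append(x + sigma * rng.normal(size=m))

grid = np.linspace(0.0, 1.0, 41)
C_off = offdiag_cov(times, values, grid, h=0.1)
C_off = 0.5 * (C_off + C_off.T)          # enforce exact symmetry
d_hat = diag_cov(times, values, grid, h=0.1)

# Leading eigenvector of the off-diagonal-based estimate versus the true eigenfunction.
evals, evecs = np.linalg.eigh(C_off)
phi1_hat = evecs[:, -1]
phi1_true = np.sqrt(2) * np.sin(2 * np.pi * grid)
alignment = abs(phi1_hat @ phi1_true) / (
    np.linalg.norm(phi1_hat) * np.linalg.norm(phi1_true)
)
print(f"cosine similarity with true first eigenfunction: {alignment:.3f}")
```

Squared measurements are kept out of the off-diagonal smoother because, for mean-zero data with independent measurement noise, their expectation is \(C(t,t) + \sigma^{2}\) rather than \(C(t,t)\); mixing them with the cross-products would contaminate the estimate near the diagonal.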

MSC:

62H25 Factor analysis and principal components; correspondence analysis
62G20 Asymptotic properties of nonparametric inference

Software:

fda (R)
