Functional data analysis by matrix completion.

*(English)* Zbl 1416.62324

Summary: Functional data analyses typically proceed by smoothing, followed by functional PCA. This paradigm implicitly assumes that rough variation is due to nuisance noise. Nevertheless, relevant functional features such as time-localised or short scale fluctuations may indeed be rough relative to the global scale, but still smooth at shorter scales. These may be confounded with the global smooth components of variation by the smoothing and PCA, potentially distorting the parsimony and interpretability of the analysis. The goal of this paper is to investigate how both smooth and rough variations can be recovered on the basis of discretely observed functional data. Assuming that a functional datum arises as the sum of two uncorrelated components, one smooth and one rough, we develop identifiability conditions for the recovery of the two corresponding covariance operators. The key insight is that they should possess complementary forms of parsimony: one smooth and finite rank (large scale), and the other banded and potentially infinite rank (small scale). Our conditions elucidate the precise interplay between rank, bandwidth and grid resolution. Under these conditions, we show that the recovery problem is equivalent to rank-constrained matrix completion, and exploit this to construct estimators of the two covariances, without assuming knowledge of the true bandwidth or rank; we study their asymptotic behaviour, and then use them to recover the smooth and rough components of each functional datum by best linear prediction. As a result, we effectively produce separate functional PCAs for smooth and rough variation.
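The matrix-completion view described in the summary can be illustrated with a small toy sketch. It is not the authors' estimator: it assumes a rank-one smooth covariance, a known bandwidth, and an exactly known (noise-free) covariance matrix on the grid, and it fits the off-band entries by plain coordinate descent. All names (`offband`, `masked_loss`, `complete_rank_one`, the toy grid and covariances) are hypothetical and chosen only for illustration.

```python
def offband(i, j, band):
    """True where the banded (rough) covariance vanishes, so R[i][j] = L[i][j]."""
    return abs(i - j) >= band


def masked_loss(R, c, band):
    """Squared misfit of the rank-one candidate c c^T on off-band entries only."""
    K = len(R)
    return sum(
        (R[i][j] - c[i] * c[j]) ** 2
        for i in range(K)
        for j in range(K)
        if offband(i, j, band)
    )


def complete_rank_one(R, band, sweeps=200):
    """Coordinate descent for min_c sum_{|i-j| >= band} (R_ij - c_i c_j)^2.

    Assumption: rank one and known bandwidth; a sketch, not the paper's method.
    Each coordinate update is the exact minimiser in c[i] with the rest fixed,
    so the masked loss cannot increase from one sweep to the next.
    """
    K = len(R)
    c = [1.0] * K  # arbitrary positive starting point
    losses = [masked_loss(R, c, band)]
    for _ in range(sweeps):
        for i in range(K):
            idx = [j for j in range(K) if offband(i, j, band)]
            den = sum(c[j] ** 2 for j in idx)
            if den > 0.0:
                c[i] = sum(R[i][j] * c[j] for j in idx) / den
        losses.append(masked_loss(R, c, band))
    return c, losses


# Toy data (all values are illustrative assumptions).
K, band = 12, 2
t = [i / (K - 1) for i in range(K)]
a = [1.0 + ti for ti in t]  # smooth rank-one factor, L = a a^T

# Smooth, rank-one covariance evaluated on the grid.
L = [[a[i] * a[j] for j in range(K)] for i in range(K)]

# Rough, banded covariance: nonzero only for |i - j| < band (tridiagonal here).
B = [
    [0.5 if i == j else (0.2 if abs(i - j) == 1 else 0.0) for j in range(K)]
    for i in range(K)
]

# Observed covariance of the sum of the two uncorrelated components.
R = [[L[i][j] + B[i][j] for j in range(K)] for i in range(K)]

c_hat, losses = complete_rank_one(R, band)

# Completed smooth covariance and the implied banded remainder.
L_hat = [[c_hat[i] * c_hat[j] for j in range(K)] for i in range(K)]
B_hat = [[R[i][j] - L_hat[i][j] for j in range(K)] for i in range(K)]
```

In this sketch the band entries of `R` are never used when fitting `c_hat`; they are filled in from the rank-one fit, and `B_hat` is what remains. Comparing `L_hat` with `L` entry by entry shows how well the smooth part was recovered in a given run. The paper itself treats general finite rank and does not assume the bandwidth or rank is known.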

##### MSC:

62H25 | Factor analysis and principal components; correspondence analysis |

15A83 | Matrix completion problems |

##### Keywords:

analyticity; banding; covariance operator; functional PCA; low rank; resolution; scale; smoothing

##### Software:

fda (R)

M.-H. Descary and V. M. Panaretos, Ann. Stat. 47, No. 1, 1–38 (2019; Zbl 1416.62324)

