A notion of depth for sparse functional data. (English) Zbl 1474.62462

Summary: Data depth is a well-known and useful nonparametric tool for analyzing functional data. It provides a novel way of ranking a sample of curves from the center outwards and of defining robust statistics, such as the median or trimmed means. It has also been used as a building block for functional outlier detection and classification methods. Several notions of depth for functional data have been introduced in the literature over the last few decades. These functional depths can only be applied directly to samples of curves measured on a fine, common grid. In practice, this is not always the case, and curves are often observed on sparse, subject-dependent grids. In these scenarios, the usual approach is to estimate the trajectories on a common dense grid and to use the estimates in the depth analysis. This approach ignores the uncertainty associated with the curve estimation step. Our goal is to extend the notion of depth so that it takes this uncertainty into account. Using both functional estimates and their associated confidence intervals, we propose a new method that allows the curve estimation uncertainty to be incorporated into the depth analysis. We describe the new approach using the modified band depth, although any other functional depth could be used. The performance of the proposed methodology is illustrated using simulated curves in different settings where we control the degree of sparsity. A real data set consisting of egg-laying trajectories of female medflies is also considered. The results show the benefits of using uncertainty when computing depth for sparse functional data.
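
As a point of reference for the summary above, the following is a minimal sketch of the standard modified band depth (bands formed by pairs of curves) for curves already observed on a common grid. It is not the sparse-data extension proposed in the paper and uses no confidence intervals. The function name `modified_band_depth` and the toy sample are hypothetical and chosen only for illustration.

```python
from itertools import combinations


def modified_band_depth(curves):
    """Modified band depth with bands of two curves, on a common grid.

    curves: list of equal-length lists; curves[i][t] is curve i at grid point t.
    Returns one depth value per curve: the average, over all pairs of sample
    curves, of the proportion of grid points at which the curve lies inside
    the band delimited by the pair.
    """
    n = len(curves)
    n_points = len(curves[0])
    n_pairs = n * (n - 1) // 2
    depths = []
    for x in curves:
        total = 0.0
        for a, b in combinations(curves, 2):
            # Count grid points where x lies between the two band curves.
            inside = sum(
                1
                for t in range(n_points)
                if min(a[t], b[t]) <= x[t] <= max(a[t], b[t])
            )
            total += inside / n_points
        depths.append(total / n_pairs)
    return depths


# Hypothetical toy sample: three parallel lines and one curve crossing them.
sample = [
    [0.0, 0.0, 0.0, 0.0],
    [1.0, 1.0, 1.0, 1.0],
    [2.0, 2.0, 2.0, 2.0],
    [3.0, 3.0, 0.5, 0.5],
]
depths = modified_band_depth(sample)
# Index of the deepest curve, i.e. the sample median under this depth.
deepest = max(range(len(sample)), key=depths.__getitem__)
```

Ordering the curves by decreasing depth gives the center-outward ranking mentioned in the summary. The double loop costs on the order of n² band checks per curve, which is adequate for a small illustration but not tuned for large samples.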

MSC:

62R10 Functional data analysis
62F07 Statistical ranking and selection procedures
62G35 Nonparametric robustness

References:

[1] Arribas-Gil, A.; Romo, J., Shape outlier detection and visualization for functional data: the outliergram, Biostatistics, 15, 4, 603-619 (2014)
[2] Azcorra, A.; Chiroque, LF; Cuevas, R.; Anta, AF; Laniado, H.; Lillo, RE; Romo, J.; Sguera, C., Unsupervised scalable statistical method for identifying influential users in online social networks, Sci Rep, 8, 1, 6955 (2018)
[3] Carey, JR; Liedo, P.; Müller, H-G; Wang, J-L; Chiou, J-M, Relationship of age patterns of fecundity to mortality, longevity, and lifetime reproduction in a large cohort of Mediterranean fruit fly females, J Gerontol Ser A Biol Sci Med Sci, 53, 4, B245-B251 (1998)
[4] Chakraborty, A.; Chaudhuri, P., On data depth in infinite dimensional spaces, Ann Inst Stat Math, 66, 2, 303-324 (2014) · Zbl 1336.62123
[5] Chaudhuri, P., On a geometric notion of quantiles for multivariate data, J Am Stat Assoc, 91, 434, 862-872 (1996) · Zbl 0869.62040
[6] Cuesta-Albertos, JA; Febrero-Bande, M.; de la Fuente, MO, The DD^G-classifier in the functional setting, Test, 26, 1, 119-142 (2017) · Zbl 1422.62216
[7] Cuesta-Albertos, JA; Nieto-Reyes, A., The random Tukey depth, Comput Stat Data Anal, 52, 11, 4979-4988 (2008) · Zbl 1452.62344
[8] Cuevas, A.; Febrero, M.; Fraiman, R., Robust estimation and classification for functional data via projection-based depth notions, Comput Stat, 22, 3, 481-496 (2007) · Zbl 1195.62032
[9] Dai W, Genton MG (2017) An outlyingness matrix for multivariate functional data classification. arXiv preprint arXiv:1704.02568 · Zbl 1406.62068
[10] Flores, R.; Lillo, R.; Romo, J., Homogeneity test for functional data, J Appl Stat, 45, 5, 868-883 (2018)
[11] Fraiman, R.; Muniz, G., Trimmed means for functional data, Test, 10, 2, 419-440 (2001) · Zbl 1016.62026
[12] Gervini, D., Outlier detection and trimmed estimation for general functional data, Statistica Sinica, 22, 1639-1660 (2012) · Zbl 1253.62019
[13] Gijbels, I.; Nagy, S., On a general definition of depth for functional data, Stat Sci, 32, 4, 630-639 (2017) · Zbl 1381.62098
[14] Goldsmith, J.; Greven, S.; Crainiceanu, CM, Corrected confidence bands for functional data using principal components, Biometrics, 69, 1, 41-51 (2013) · Zbl 1274.62776
[15] Hubert, M.; Rousseeuw, PJ; Segaert, P., Multivariate functional outlier detection, Stat Methods Appl, 24, 2, 177-202 (2015) · Zbl 1441.62124
[16] Jörnsten, R., Clustering and classification based on the L1 data depth, J Multivar Anal, 90, 1, 67-89 (2004) · Zbl 1047.62064
[17] Koshevoy, G.; Mosler, K., Zonoid trimming for multivariate distributions, Ann Stat, 25, 5, 1998-2017 (1997) · Zbl 0881.62059
[18] Li, J.; Cuesta-Albertos, JA; Liu, RY, DD-classifier: nonparametric classification procedure based on DD-plot, J Am Stat Assoc, 107, 498, 737-753 (2012) · Zbl 1261.62058
[19] Liu, RY, On a notion of data depth based on random simplices, Ann Stat, 18, 1, 405-414 (1990) · Zbl 0701.62063
[20] Liu, RY; Parelius, JM; Singh, K., Multivariate analysis by data depth: descriptive statistics, graphics and inference (with discussion and a rejoinder by Liu and Singh), Ann Stat, 27, 3, 783-858 (1999) · Zbl 0984.62037
[21] Liu, RY; Singh, K., A quality index based on data depth and multivariate rank tests, J Am Stat Assoc, 88, 421, 252-260 (1993) · Zbl 0772.62031
[22] López-Pintado S, Jörnsten R (2007) Functional analysis via extensions of the band depth. Lecture Notes-Monograph Series, pp 103-120
[23] López-Pintado, S.; Romo, J., Depth-based inference for functional data, Comput Stat Data Anal, 51, 10, 4957-4968 (2007) · Zbl 1162.62359
[24] López-Pintado, S.; Romo, J., On the concept of depth for functional data, J Am Stat Assoc, 104, 486, 718-734 (2009) · Zbl 1388.62139
[25] López-Pintado, S.; Romo, J., A half-region depth for functional data, Comput Stat Data Anal, 55, 4, 1679-1695 (2011) · Zbl 1328.62029
[26] López-Pintado S, Wei Y (2011) Depth for sparse functional data. In: Recent advances in functional data analysis and related topics, pp. 209-212. Springer, Berlin
[27] López-Pintado, S.; Wrobel, J., Robust non-parametric tests for imaging data based on data depth, Stat, 6, 1, 405-419 (2017)
[28] Mahalanobis, PC, On the generalized distance in statistics (1936), Bangalore: National Institute of Science of India · Zbl 0015.03302
[29] Mosler K, Polyakova Y (2012) General notions of depth for functional data. arXiv preprint arXiv:1208.1981
[30] Narisetty, NN; Nair, VN, Extremal depth for functional data and applications, J Am Stat Assoc, 111, 516, 1705-1714 (2016)
[31] Nieto-Reyes, A.; Battey, H., A topologically valid definition of depth for functional data, Stat Sci, 31, 61-79 (2016) · Zbl 1436.62720
[32] Oja, H., Descriptive statistics for multivariate distributions, Stat Probab Lett, 1, 6, 327-332 (1983) · Zbl 0517.62051
[33] Rousseeuw, PJ; Hubert, M., Regression depth, J Am Stat Assoc, 94, 446, 388-402 (1999) · Zbl 1007.62060
[34] Sguera, C.; Galeano, P.; Lillo, R., Spatial depth-based classification for functional data, Test, 23, 4, 725-750 (2014) · Zbl 1312.62083
[35] Sguera, C.; Galeano, P.; Lillo, RE, Functional outlier detection by a local depth with application to NOx levels, Stoch Env Res Risk Assess, 30, 4, 1115-1130 (2016)
[36] Sun, Y.; Genton, MG, Functional boxplots, J Comput Gr Stat, 20, 2, 316-334 (2011)
[37] Sun, Y.; Genton, MG, Functional median polish, J Agric Biol Environ Stat, 17, 3, 354-376 (2012) · Zbl 1302.62281
[38] Tukey JW (1975) Mathematics and the picturing of data. In: Proceedings of the International Congress of Mathematicians, Vancouver, 1975, Volume 2, pp. 523-531 · Zbl 0347.62002
[39] Vardi, Y.; Zhang, C-H, The multivariate L1-median and associated data depth, Proc Nat Acad Sci, 97, 4, 1423-1426 (2000) · Zbl 1054.62067
[40] Yao, F.; Müller, H-G; Wang, J-L, Functional data analysis for sparse longitudinal data, J Am Stat Assoc, 100, 470, 577-590 (2005) · Zbl 1117.62451
[41] Zhang, X.; Wang, J-L, From sparse to dense functional data and beyond, Ann Stat, 44, 5, 2281-2321 (2016) · Zbl 1349.62161
[42] Zuo, Y., Projection-based depth functions and associated medians, Ann Stat, 31, 5, 1460-1490 (2003) · Zbl 1046.62056
[43] Zuo, Y.; Serfling, R., General notions of statistical depth function, Ann Stat, 28, 461-482 (2000) · Zbl 1106.62334
This reference list is based on information provided by the publisher or from digital mathematics libraries. Its items are heuristically matched to zbMATH identifiers and may contain data conversion errors. It attempts to reflect the references listed in the original paper as accurately as possible without claiming the completeness or perfect precision of the matching.