Functional data clustering by projection into latent generalized hyperbolic subspaces. (English) Zbl 07433036

Summary: We introduce a latent subspace model which facilitates model-based clustering of functional data. Flexible clustering is attained by imposing jointly generalized hyperbolic distributions on projections of basis expansion coefficients into group-specific subspaces. The model acquires parsimony by assuming these subspaces are of relatively low dimension. Parameter estimation is done through a multicycle ECM algorithm. Applications to simulated and real datasets illustrate competitive clustering capabilities and demonstrate the model's general applicability.

MSC:

62R10 Functional data analysis

Software:

funHDDC; QRM

References:

[1] Baek, J.; McLachlan, GJ; Flack, LK, Mixtures of factor analyzers with common factor loadings: applications to the clustering and visualization of high-dimensional data, IEEE Trans Pattern Anal Mach Intell, 32, 7, 1298-1309 (2010)
[2] Banfield, JD; Raftery, AE, Model-based Gaussian and non-Gaussian clustering, Biometrics, 49, 3, 803-821 (1993) · Zbl 0794.62034
[3] Bellman, R., The theory of dynamic programming, Bull Am Math Soc, 60, 6, 503-515 (1954) · Zbl 0057.12503
[4] Bickel, PJ; Levina, E., Regularized estimation of large covariance matrices, Ann Stat, 36, 1, 199-227 (2008) · Zbl 1132.62040
[5] Bouveyron, C.; Brunet, C., Model-based clustering of high-dimensional data: a review, Comput Stat Data Anal, 71, 52-78 (2013) · Zbl 1471.62032
[6] Bouveyron, C.; Jacques, J., Model-based clustering of time series in group-specific functional subspaces, Adv Data Anal Classif, 5, 4, 281-300 (2011) · Zbl 1274.62416
[7] Bouveyron, C.; Girard, S.; Schmid, C., High-dimensional data clustering, Comput Stat Data Anal, 52, 1, 502-519 (2007) · Zbl 1452.62433
[8] Bouveyron C, Côme E, Jacques J (2015) The discriminative functional mixture model for a comparative analysis of bike sharing systems. Ann Appl Stat (in press). https://hal.archives-ouvertes.fr/hal-01024186 · Zbl 1397.62511
[9] Browne, RP; McNicholas, PD, A mixture of generalized hyperbolic distributions, Can J Stat, 43, 2, 176-198 (2015) · Zbl 1320.62144
[10] Celeux, G.; Govaert, G., Gaussian parsimonious clustering models, Pattern Recogn, 28, 5, 781-793 (1995)
[11] Dau HA, Keogh E, Kamgar K, Yeh CCM, Zhu Y, Gharghabi S, Ratanamahatana CA, Yanping, Hu B, Begum N, Bagnall A, Mueen A, Batista G, Hexagon-ML (2018) The UCR time series classification archive. https://www.cs.ucr.edu/~eamonn/time_series_data_2018/
[12] Ghahramani Z, Hinton GE (1997) The EM algorithm for mixtures of factor analyzers. Technical report
[13] Hastie, T.; Buja, A.; Tibshirani, R., Penalized discriminant analysis, Ann Stat, 23, 1, 73-102 (1995) · Zbl 0821.62031
[14] Jacques, J.; Preda, C., Functional data clustering: a survey, Adv Data Anal Classif, 8, 3, 231-255 (2014) · Zbl 1414.62018
[15] Jacques, J.; Preda, C., Model-based clustering for multivariate functional data, Comput Stat Data Anal, 71, 92-106 (2014) · Zbl 1471.62096
[16] James, GM; Sugar, CA, Clustering for sparsely sampled functional data, J Am Stat Assoc, 98, 462, 397-408 (2003) · Zbl 1041.62052
[17] Kim, NH; Browne, R., Subspace clustering for the finite mixture of generalized hyperbolic distributions, Adv Data Anal Classif (2018) · Zbl 1474.62187
[18] Lin, Z.; Müller, HG; Yao, F., Mixture inner product spaces and their application to functional data analysis, Ann Stat, 46, 1, 370-400 (2018) · Zbl 1393.62029
[19] McLachlan, G.; Peel, D.; Bean, R., Modelling high-dimensional data by mixtures of factor analyzers, Comput Stat Data Anal, 41, 3, 379-388 (2003) · Zbl 1256.62036
[20] McLachlan, G.; Bean, R.; Ben-Tovim Jones, L., Extension of the mixture of factor analyzers model to incorporate the multivariate t-distribution, Comput Stat Data Anal, 51, 5327-5338 (2007) · Zbl 1445.62053
[21] McNeil, AJ; Frey, R.; Embrechts, P., Quantitative risk management: concepts, techniques and tools (2015), Princeton, NJ: Princeton University Press · Zbl 1337.91003
[22] Parsons, L.; Haque, E.; Liu, H., Subspace clustering for high dimensional data: A review, SIGKDD Explor Newsl, 6, 1, 90-105 (2004)
[23] Pesevski, A.; Franczak, B.; McNicholas, P., Subspace clustering with the multivariate-t distribution, Pattern Recogn Lett, 112, 1 (2017)
[24] Schmutz A, Jacques J, Bouveyron C, Cheze L, Martin P (2018) Clustering multivariate functional data in group-specific functional subspaces (working paper or preprint). https://hal.inria.fr/hal-01652467 · Zbl 07255804