Clustering via finite nonparametric ICA mixture models. (English) Zbl 1474.62264

Summary: We propose a novel extension of nonparametric multivariate finite mixture models by dropping the standard conditional independence assumption and incorporating the independent component analysis (ICA) structure instead. This innovation extends nonparametric mixture model estimation methods to situations in which conditional independence, a necessary assumption for the unique identifiability of the parameters in such models, is clearly violated. We formulate an objective function in terms of penalized smoothed Kullback-Leibler distance and introduce the nonlinear smoothed majorization-minimization independent component analysis algorithm for optimizing this function and estimating the model parameters. Our algorithm does not require any labeled observations a priori; it may be used for fully unsupervised clustering problems in a multivariate setting. We have implemented a practical version of this algorithm, which utilizes the FastICA algorithm, in the R package icamix. We illustrate this new methodology using several applications in unsupervised learning and image processing.
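The ICA structure named in the summary can be illustrated with a small worked example. The sketch below is not the authors' algorithm and does not use the icamix API. It assumes two-dimensional data, known mixing matrices, and fixed parametric marginals (Laplace and Gaussian), whereas the paper estimates the marginals nonparametrically. All function and variable names are hypothetical. It only shows how a component density follows from the change of variables x = A s with independent coordinates of s, and how posterior cluster probabilities follow from the resulting mixture.

```python
import math


def laplace_pdf(s, scale=1.0):
    """Density of a zero-centered Laplace distribution (illustrative marginal)."""
    return math.exp(-abs(s) / scale) / (2.0 * scale)


def normal_pdf(s, sd=1.0):
    """Density of a zero-centered Gaussian distribution (illustrative marginal)."""
    return math.exp(-0.5 * (s / sd) ** 2) / (sd * math.sqrt(2.0 * math.pi))


def inverse_2x2(a):
    """Return the inverse and determinant of a 2x2 matrix given as nested lists."""
    det = a[0][0] * a[1][1] - a[0][1] * a[1][0]
    inv = [[a[1][1] / det, -a[0][1] / det],
           [-a[1][0] / det, a[0][0] / det]]
    return inv, det


def component_density(x, mixing, marginals):
    """Density of x = A s when the coordinates of s are independent.

    Change of variables: f(x) = |det A|^{-1} * prod_k f_k((A^{-1} x)_k).
    """
    w, det = inverse_2x2(mixing)
    s = [w[0][0] * x[0] + w[0][1] * x[1],
         w[1][0] * x[0] + w[1][1] * x[1]]
    density = 1.0 / abs(det)
    for pdf, s_k in zip(marginals, s):
        density *= pdf(s_k)
    return density


def posterior(x, weights, mixings, marginals_list):
    """Posterior probability of each mixture component given the point x."""
    joint = [lam * component_density(x, a, m)
             for lam, a, m in zip(weights, mixings, marginals_list)]
    total = sum(joint)
    return [j / total for j in joint]


# Hypothetical two-component model: the coordinates of x are dependent
# within each component, so conditional independence fails, but they
# become independent after applying the inverse mixing matrix.
weights = [0.5, 0.5]
mixings = [[[1.0, 0.5], [0.0, 1.0]],
           [[1.0, -0.5], [0.0, 1.0]]]
marginals_list = [
    [laplace_pdf, laplace_pdf],                                  # centered at 0
    [lambda s: normal_pdf(s - 3.0), lambda s: normal_pdf(s - 3.0)],  # centered at 3
]

post_origin = posterior([0.0, 0.0], weights, mixings, marginals_list)
post_shifted = posterior([1.5, 3.0], weights, mixings, marginals_list)
```

In this toy setup the point at the origin is assigned mostly to the first component and the point (1.5, 3.0) mostly to the second. An unsupervised procedure of the kind the summary describes would additionally have to estimate the weights, the mixing matrices, and the marginal densities from unlabeled data.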

MSC:

62H30 Classification and discrimination; cluster analysis (statistical aspects)
62G07 Density estimation
