Quantile-based clustering. (English) Zbl 1434.62121
Summary: A new cluster analysis method, \(K\)-quantiles clustering, is introduced. \(K\)-quantiles clustering can be computed by a simple greedy algorithm in the style of the classical Lloyd's algorithm for \(K\)-means. It can be applied to large and high-dimensional datasets. It allows for within-cluster skewness and internal variable scaling based on within-cluster variation. Different versions allow for different levels of parsimony and computational efficiency. Although \(K\)-quantiles clustering is conceived as nonparametric, it can be connected to a fixed partition model of generalized asymmetric Laplace distributions. The consistency of \(K\)-quantiles clustering is proved, and it is shown that \(K\)-quantiles clusters correspond to well-separated mixture components in a nonparametric mixture. In a simulation study, \(K\)-quantiles clustering compares favourably with a number of popular clustering methods. The method is also applied to a high-dimensional microarray dataset.
62H30 Classification and discrimination; cluster analysis (statistical aspects)
62G08 Nonparametric regression and quantile regression
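
The Lloyd-style iteration mentioned in the summary can be illustrated with a minimal sketch. This is not the authors' implementation. It assumes a single quantile level `theta` that is fixed in advance and shared by all variables, uses the check loss from quantile regression as the point-to-centre discrepancy, and omits the internal variable scaling described in the summary. The names `check_loss`, `inverse_cdf_quantile`, `k_quantiles_sketch` and the `init` parameter are hypothetical and chosen only for this illustration.

```python
import numpy as np


def check_loss(X, centre, theta):
    """Check-loss discrepancy of each row of X to `centre`, summed over variables.

    Assumption: residuals below the centre are weighted by (1 - theta),
    all others by theta, as in quantile regression.
    """
    diff = X - centre
    weight = np.where(diff < 0, 1.0 - theta, theta)
    return (weight * np.abs(diff)).sum(axis=1)


def inverse_cdf_quantile(members, theta):
    """Column-wise empirical theta-quantile taken as an order statistic.

    An order statistic at rank ceil(m * theta) minimises the summed check loss
    in each column, so the update step does not increase the objective.
    """
    m = members.shape[0]
    idx = min(max(int(np.ceil(m * theta)) - 1, 0), m - 1)
    return np.sort(members, axis=0)[idx]


def k_quantiles_sketch(X, k, theta=0.5, init=None, n_iter=100, seed=0):
    """Alternate assignment and quantile updates until labels stop changing."""
    X = np.asarray(X, dtype=float)
    n = X.shape[0]
    if init is None:
        # Assumption: start from k distinct data points drawn at random.
        rng = np.random.default_rng(seed)
        centres = X[rng.choice(n, size=k, replace=False)].copy()
    else:
        centres = np.array(init, dtype=float)

    labels = np.full(n, -1)
    for _ in range(n_iter):
        # Assignment step: each point joins the centre with the smallest discrepancy.
        dist = np.stack([check_loss(X, c, theta) for c in centres], axis=1)
        new_labels = dist.argmin(axis=1)
        if np.array_equal(new_labels, labels):
            break
        labels = new_labels
        # Update step: each centre becomes the per-variable theta-quantile of its cluster.
        for j in range(k):
            members = X[labels == j]
            if members.shape[0] > 0:  # an empty cluster keeps its previous centre
                centres[j] = inverse_cdf_quantile(members, theta)
    return labels, centres


# Usage on a tiny two-group dataset with one starting centre per group.
X_demo = np.array([[0, 0], [0, 1], [1, 0], [10, 10], [10, 11], [11, 10]], dtype=float)
labels_demo, centres_demo = k_quantiles_sketch(X_demo, k=2, theta=0.5, init=X_demo[[0, 3]])
```

With `theta = 0.5` the discrepancy is half the \(L_1\) distance and the update is the coordinate-wise median, so this sketch reduces to a \(K\)-medians-type procedure. Other values of `theta` place the centre off the middle of the cluster, which is one way to accommodate within-cluster skewness.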
