A central limit theorem for multivariate generalized trimmed \(k\)-means. (English) Zbl 0984.62042

Summary: A central limit theorem for generalized trimmed \(k\)-means is obtained in a very general framework that covers the multivariate setting, general penalty functions and general \(k\geq 1\). Several applications, including the location estimator case \((k=1)\) for elliptical distributions and the construction of multivariate (not necessarily connected) tolerance zones, are also given.


62H30 Classification and discrimination; cluster analysis (statistical aspects)
60F05 Central limit and other weak theorems
62G35 Nonparametric robustness
62G15 Nonparametric tolerance and confidence regions

