Exploratory tools for clustering multivariate data. (English) Zbl 1452.62028

Summary: The forward search provides a series of robust parameter estimates based on increasing numbers of observations. The resulting series of robust Mahalanobis distances is used to cluster multivariate normal data. The method depends on envelopes of the distribution of the test statistics in forward plots. These envelopes can be found by simulation; flexible polynomial approximations to the envelopes are given. New graphical tools provide methods not only of detecting clusters but also of determining their membership. Comparisons are made with mclust and \(k\)-means clustering.
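The procedure outlined in the summary can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes a plain NumPy setting, a randomly chosen starting subset, and pointwise simulated quantiles as envelopes; the function names `forward_search_min_distances` and `simulated_envelopes`, and all parameter names, are hypothetical. At each step the mean and covariance are estimated from the current subset, Mahalanobis distances are computed for all units, the smallest distance among units outside the subset is recorded, and the subset grows by one unit.

```python
import numpy as np


def forward_search_min_distances(X, start_idx):
    """Minimum Mahalanobis distance among units outside the subset, per step.

    X is an (n, p) data matrix; start_idx holds the indices of the initial
    subset (assumed to contain at least p + 1 units).
    """
    n = X.shape[0]
    subset = np.asarray(start_idx)
    min_dist = []
    while len(subset) < n:
        # Fit mean and covariance on the current subset only.
        mu = X[subset].mean(axis=0)
        cov = np.cov(X[subset], rowvar=False)
        diff = X - mu
        # Squared Mahalanobis distances of all n units from the current fit.
        d2 = np.einsum("ij,ij->i", diff @ np.linalg.pinv(cov), diff)
        d2 = np.maximum(d2, 0.0)  # guard against tiny negative rounding error
        outside = np.setdiff1d(np.arange(n), subset)
        min_dist.append(np.sqrt(d2[outside].min()))
        # Next subset: the m + 1 units closest to the current fit.
        subset = np.argsort(d2)[: len(subset) + 1]
    return np.array(min_dist)


def simulated_envelopes(n, p, m0, n_sim=200, quantiles=(0.01, 0.5, 0.99), seed=0):
    """Pointwise quantiles of the forward plot under multivariate normality.

    Assumption: envelopes are taken as empirical quantiles over n_sim
    searches on standard normal samples, each from a random start of size m0.
    """
    rng = np.random.default_rng(seed)
    curves = np.empty((n_sim, n - m0))
    for s in range(n_sim):
        Z = rng.standard_normal((n, p))
        start = rng.choice(n, size=m0, replace=False)
        curves[s] = forward_search_min_distances(Z, start)
    return np.quantile(curves, quantiles, axis=0)
```

A forward plot would then overlay the curve returned by `forward_search_min_distances` for the data on the rows returned by `simulated_envelopes`. Excursions above the upper envelope are the kind of signal the summary describes for detecting clusters. The polynomial approximations to the envelopes mentioned in the summary are not reproduced here.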

MSC:

62-08 Computational methods for problems pertaining to statistics
62H30 Classification and discrimination; cluster analysis (statistical aspects)

Software:

MASS (R); Flury; R; mclust

References:

[1] Anderberg, M. R., Cluster Analysis for Applications (1973), Academic Press, New York · Zbl 0299.62029
[2] Atkinson, A. C., Fast very robust methods for the detection of multiple outliers, J. Amer. Statist. Assoc., 89, 1329-1339 (1994) · Zbl 0825.62429
[3] Atkinson, A. C.; Riani, M., Distribution theory and simulations for tests of outliers in regression, J. Comput. Graph. Statist., 15, 460-476 (2006)
[4] Atkinson, A. C.; Riani, M., Discussion of paper by Handcock, Raftery and Tantrum, J. Roy. Statist. Soc. Ser. B, 69 (2007), in press
[5] Atkinson, A. C.; Riani, M.; Cerioli, A., Exploring Multivariate Data with the Forward Search (2004), Springer, New York · Zbl 1049.62057
[6] Atkinson, A. C.; Riani, M.; Cerioli, A., Random start forward searches with envelopes for detecting clusters in multivariate data, (Zani, S.; Cerioli, A.; Riani, M.; Vichi, M., Data Analysis, Classification and the Forward Search (2006), Springer, Berlin), 163-171
[7] Azzalini, A.; Bowman, A., A look at some data on the Old Faithful geyser, Appl. Statist., 39, 357-365 (1990) · Zbl 0707.62186
[8] Calinski, T.; Harabasz, J., A dendrite method for cluster analysis, Commun. Statist. — Theory Methods, 3, 1-27 (1974) · Zbl 0273.62010
[9] Flury, B., A First Course in Multivariate Statistics (1997), Springer, New York · Zbl 0879.62052
[10] Flury, B.; Riedwyl, H., Multivariate Statistics: A Practical Approach (1988), Chapman & Hall, London
[11] Fraley, C.; Raftery, A. E., Enhanced model-based clustering, density estimation and discriminant analysis: MCLUST, J. Classification, 20, 263-286 (2003) · Zbl 1055.62071
[12] Fraley, C.; Raftery, A. E. (2006)
[13] Hadi, A. S., Identifying multiple outliers in multivariate data, J. Roy. Statist. Soc. Ser. B, 54, 761-771 (1992)
[14] McLachlan, G.; Peel, D., Finite Mixture Models (2000), Wiley, New York · Zbl 0963.62061
[15] Rousseeuw, P. J.; Leroy, A. M., Robust Regression and Outlier Detection (1987), Wiley, New York · Zbl 0711.62030
[16] Rousseeuw, P. J.; Van Driessen, K., A fast algorithm for the minimum covariance determinant estimator, Technometrics, 41, 212-223 (1999)
[17] Rousseeuw, P. J.; van Zomeren, B. C., Unmasking multivariate outliers and leverage points, J. Amer. Statist. Assoc., 85, 633-639 (1990)
[18] Venables, W. N.; Ripley, B. D., Modern Applied Statistics with S (2002), Springer, New York · Zbl 1006.62003
[19] Zani, S.; Riani, M.; Corbellini, A., Robust bivariate boxplots and multiple outlier detection, Comput. Statist. Data Anal., 28, 257-270 (1998) · Zbl 1042.62545