
Finding an unknown number of multivariate outliers. (English) Zbl 1248.62091

Summary: We use the forward search to provide robust Mahalanobis distances to detect the presence of outliers in a sample of multivariate normal data. Theoretical results on order statistics and on estimation in truncated samples provide the distribution of our test statistic. We also introduce several new robust distances with associated distributional results. Comparisons of our procedure with tests using other robust Mahalanobis distances show the good size and high power of our procedure. We also provide a unification of results on correction factors for estimation from truncated samples.
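Illustration (not from the paper): the forward search named in the summary can be sketched in a few lines of Python. This is a minimal sketch under stated assumptions, not the authors' procedure. It is restricted to bivariate data so that the covariance matrix can be inverted in closed form. The initial subset is chosen by a crude rule (the m0 points closest to the coordinate-wise median). All function names are hypothetical. The sketch only records the minimum Mahalanobis distance among units outside the fitting subset as the subset grows; it does not compute the distributional results or truncation correction factors that the paper uses to turn this trace into a test.

```python
import math
import random
from statistics import mean, median


def mean_cov_2d(points):
    """Sample mean and unbiased covariance (sxx, sxy, syy) of (x, y) pairs."""
    n = len(points)
    mx = mean(p[0] for p in points)
    my = mean(p[1] for p in points)
    sxx = sum((p[0] - mx) ** 2 for p in points) / (n - 1)
    syy = sum((p[1] - my) ** 2 for p in points) / (n - 1)
    sxy = sum((p[0] - mx) * (p[1] - my) for p in points) / (n - 1)
    return (mx, my), (sxx, sxy, syy)


def sq_mahalanobis_2d(p, centre, cov):
    """Squared Mahalanobis distance via the closed-form 2x2 inverse."""
    sxx, sxy, syy = cov
    det = sxx * syy - sxy ** 2
    dx, dy = p[0] - centre[0], p[1] - centre[1]
    return (syy * dx * dx - 2.0 * sxy * dx * dy + sxx * dy * dy) / det


def forward_search_2d(data, m0=5):
    """Return [(m, d_min)] where d_min is the smallest Mahalanobis distance
    among units NOT in the current fitting subset of size m."""
    n = len(data)
    # Assumption: crude start, the m0 points nearest the coordinate-wise median.
    cx = median(p[0] for p in data)
    cy = median(p[1] for p in data)
    start = sorted(range(n),
                   key=lambda i: (data[i][0] - cx) ** 2 + (data[i][1] - cy) ** 2)
    subset = start[:m0]
    trace = []
    for m in range(m0, n):
        # Fit mean and covariance on the current subset only.
        centre, cov = mean_cov_2d([data[i] for i in subset])
        d2 = [sq_mahalanobis_2d(p, centre, cov) for p in data]
        inside = set(subset)
        d2_out = min(d2[i] for i in range(n) if i not in inside)
        trace.append((m, math.sqrt(max(0.0, d2_out))))
        # Next subset: the m + 1 units with the smallest distances.
        subset = sorted(range(n), key=d2.__getitem__)[: m + 1]
    return trace


# Demo on simulated data: 40 standard normal points plus 3 shifted points.
rng = random.Random(0)
demo_data = [(rng.gauss(0, 1), rng.gauss(0, 1)) for _ in range(40)]
demo_data += [(8 + rng.gauss(0, 0.3), 8 + rng.gauss(0, 0.3)) for _ in range(3)]
demo_trace = forward_search_2d(demo_data, m0=5)
for m, d_min in demo_trace[-5:]:
    print(m, round(d_min, 2))
```

In a trace of this kind, a sharp rise in the minimum distance late in the search is the informal signal that the remaining units are remote from the bulk of the data. Judging whether the rise is significant requires the reference distribution derived in the paper.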

MSC:

62H15 Hypothesis testing in multivariate analysis
62H10 Multivariate distribution of statistics
62F35 Robustness and adaptive procedures (parametric inference)
62G30 Order statistics; empirical distribution functions

Software:

robustbase
