Robustness properties of \(S\)-estimators of multivariate location and shape in high dimension. (English) Zbl 0862.62049

Summary: For the problem of robust estimation of multivariate location and shape, defining \(S\)-estimators using scale transformations of a fixed \(\rho\) function regardless of the dimension, as is usually done, leads to a perverse outcome: estimators in high dimensions can have a breakdown point approaching 50%, but still fail to reject as outliers points that are large distances from the main mass of points. This leads to a form of nonrobustness that has important practical consequences.
In this paper, estimators are defined that improve on known \(S\)-estimators in having all of the following properties: (1) maximal breakdown for the given sample size and dimension; (2) the ability to completely reject as outliers points that are far from the main mass of points; (3) convergence to good solutions with a modest amount of computation from a nonrobust starting point for large (though not near 50%) contamination.
However, to attain maximal breakdown, these estimates, like other known maximal breakdown estimators, require large amounts of computational effort. This greater ability of the new estimators to reject outliers comes at a modest cost in efficiency and gross error sensitivity and at a greater, but finite, cost in local shift sensitivity.
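
The high-dimensional effect described in the summary can be illustrated numerically. The sketch below is not taken from the paper: it assumes Tukey's biweight as the fixed \(\rho\) function (the summary does not name one), Gaussian data so that squared distances follow a \(\chi^2_p\) distribution, and an asymptotic breakdown point of 50%. The function names (`chi2_pdf`, `expected_rho`, `tuning_constant`, `rejection_z`) are hypothetical. For each dimension \(p\) it solves for the scale constant \(c\) and reports how far out, in standard deviations of the squared distance, a point must lie before it receives zero weight.

```python
import math

def chi2_pdf(x, p):
    """Density of a chi-square variable with p degrees of freedom."""
    if x <= 0.0:
        return 0.0
    log_pdf = ((p / 2 - 1) * math.log(x) - x / 2
               - (p / 2) * math.log(2) - math.lgamma(p / 2))
    return math.exp(log_pdf)

def biweight_rho_scaled(u):
    """Tukey biweight rho divided by its maximum, as a function of u = (d/c)^2."""
    return 1.0 - (1.0 - u) ** 3 if u < 1.0 else 1.0

def expected_rho(c, p, n_grid=4000):
    """E[rho(d)] / max(rho) when d^2 ~ chi2_p, by a simple Riemann/trapezoid sum."""
    upper = p + 12.0 * math.sqrt(2.0 * p) + 40.0  # covers essentially all the mass
    h = upper / n_grid
    total = 0.0
    for i in range(1, n_grid):  # the integrand vanishes at 0 and is negligible at upper
        x = i * h
        total += biweight_rho_scaled(x / (c * c)) * chi2_pdf(x, p)
    return total * h

def tuning_constant(p, breakdown=0.5):
    """Bisection for c such that E[rho(d)] / max(rho) equals the assumed breakdown."""
    lo, hi = 1e-3, 50.0 * math.sqrt(p)
    for _ in range(40):
        mid = 0.5 * (lo + hi)
        if expected_rho(mid, p) > breakdown:
            lo = mid  # rho saturates too early, so c must be larger
        else:
            hi = mid
    return 0.5 * (lo + hi)

def rejection_z(p):
    """Distance of the zero-weight cutoff c^2 from E[d^2] = p, in units of sd(d^2)."""
    c = tuning_constant(p)
    return (c * c - p) / math.sqrt(2.0 * p)

if __name__ == "__main__":
    for p in (2, 10, 50):
        c = tuning_constant(p)
        print(f"p={p:3d}  c={c:.3f}  c/sqrt(p)={c / math.sqrt(p):.3f}  "
              f"cutoff z-score={rejection_z(p):.2f}")
```

Under these assumptions the cutoff moves further from the bulk of the data as \(p\) grows, which is one way to see why an estimator built from a single rescaled \(\rho\) function can retain high breakdown and still give positive weight to distant points.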

MSC:

62H12 Estimation in multivariate analysis
62F35 Robustness and adaptive procedures (parametric inference)
