
Robust estimation of multivariate location and scatter in the presence of cellwise and casewise contamination. (English) Zbl 1326.62111

Summary: Multivariate location and scatter matrix estimation is a cornerstone in multivariate data analysis. We consider this problem when the data may contain independent cellwise and casewise outliers. Flat data sets with a large number of variables and a relatively small number of cases are commonplace in modern statistical applications. In these cases, global down-weighting of an entire case, as performed by traditional robust procedures, may lead to poor results. We highlight the need for a new generation of robust estimators that can efficiently deal with cellwise outliers and at the same time show good performance under casewise outliers.
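
The claim about flat data sets can be illustrated with a minimal sketch that is not taken from the paper. It assumes every cell is contaminated independently with the same probability; the names `eps` (cell contamination probability), `p` (number of variables), and `contaminated_case_fraction` are hypothetical and introduced only for illustration. Under that assumption, the probability that a case contains at least one contaminated cell is 1 - (1 - eps)^p, which grows quickly with p.

```python
def contaminated_case_fraction(eps: float, p: int) -> float:
    """Probability that a case has at least one contaminated cell.

    Assumes (for illustration only) that each of the p cells in a case
    is contaminated independently with probability eps.
    """
    if not 0.0 <= eps <= 1.0:
        raise ValueError("eps must lie in [0, 1]")
    if p < 1:
        raise ValueError("p must be a positive integer")
    # A case is clean only if all p of its cells are clean.
    return 1.0 - (1.0 - eps) ** p


if __name__ == "__main__":
    # Same cell contamination rate, increasing number of variables.
    for p in (1, 5, 20, 100):
        frac = contaminated_case_fraction(0.05, p)
        print(f"p = {p:3d}: fraction of affected cases = {frac:.3f}")
```

With `eps = 0.05` and `p = 20`, more than half of the cases contain at least one contaminated cell under this assumption, so a procedure that can only down-weight whole cases would have to discount most of the data even though 95% of the individual cells are clean.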

MSC:

62G35 Nonparametric robustness
62G05 Nonparametric estimation

Software:

robustbase
