Outlier detection by means of robust regression estimators for use in engineering science. (English) Zbl 1178.62015

Summary: This study compares the ability of different robust regression estimators to detect and classify outliers. Well-known estimators with high breakdown points were compared using simulated data. Mean success rates (MSR) were computed and used as comparison criteria. The results showed that least median of squares (LMS) and least trimmed squares (LTS) were the most successful methods for data that included leverage points, masking and swamping effects, or critical and concentrated outliers. We recommend using LMS and LTS as diagnostic tools to classify outliers, because they remain robust even when applied to models that are heavily contaminated or that have a complicated outlier structure.
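
To make the idea of LMS-based outlier flagging concrete, the following is a minimal sketch and not the procedure used in the paper, whose simulation design is not given in this record. It assumes an approximate LMS fit obtained by random elemental subsets, the preliminary scale estimate 1.4826 (1 + 5/(n - p)) times the square root of the minimal median squared residual, and a cutoff of 2.5 on the standardized residuals, as commonly described for LMS diagnostics. The function names (`lms_fit`, `flag_outliers`, `demo`) and the simulated data are hypothetical.

```python
import numpy as np

def lms_fit(X, y, n_trials=2000, seed=0):
    """Approximate LMS fit: try random p-point subsets, keep the exact fit
    whose median squared residual over all points is smallest."""
    rng = np.random.default_rng(seed)
    n, p = X.shape
    best_beta, best_crit = None, np.inf
    for _ in range(n_trials):
        idx = rng.choice(n, size=p, replace=False)
        try:
            beta = np.linalg.solve(X[idx], y[idx])
        except np.linalg.LinAlgError:
            continue  # skip singular subsets
        crit = np.median((y - X @ beta) ** 2)
        if crit < best_crit:
            best_beta, best_crit = beta, crit
    return best_beta, best_crit

def flag_outliers(X, y, beta, crit, cutoff=2.5):
    """Flag points whose residual exceeds `cutoff` robust scale units
    (assumed rule; scale uses a finite-sample correction factor)."""
    n, p = X.shape
    scale = 1.4826 * (1 + 5 / (n - p)) * np.sqrt(crit)
    return np.abs(y - X @ beta) / scale > cutoff

def demo():
    """Simulated line y = 1 + 2x with 20% concentrated bad leverage points."""
    rng = np.random.default_rng(42)
    n = 50
    x = rng.uniform(0, 10, n)
    y = 1.0 + 2.0 * x + rng.normal(0, 0.1, n)
    bad = np.arange(10)
    x[bad] = rng.uniform(20, 25, bad.size)       # far out in x (leverage)
    y[bad] = 5.0 + rng.normal(0, 0.1, bad.size)  # far off the true line
    X = np.column_stack([np.ones(n), x])
    beta, crit = lms_fit(X, y)
    flags = flag_outliers(X, y, beta, crit)
    ols = np.linalg.lstsq(X, y, rcond=None)[0]   # non-robust comparison
    return beta, ols, flags, bad

if __name__ == "__main__":
    beta, ols, flags, bad = demo()
    print("LMS coefficients:", beta)
    print("OLS coefficients:", ols)
    print("flagged indices:", np.flatnonzero(flags))
```

In this setup the ordinary least squares line is pulled toward the contaminated cluster, while the LMS line stays with the majority of the data, so the planted points show up as large standardized residuals. Repeating such a run over many simulated samples and averaging the detection outcome is one way a success rate could be estimated.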

MSC:

62F10 Point estimation
62F35 Robustness and adaptive procedures (parametric inference)
62J20 Diagnostics, and linear inference and regression
62J05 Linear regression; mixed models
65C05 Monte Carlo methods

References:

[1] Barnett, V., Lewis, T., 1994. Outliers in Statistical Data (3rd Ed.). John Wiley and Sons, New York. · Zbl 0801.62001
[2] Chen, C., Robust regression and outlier detection with the ROBUSTREG procedure, (2002), Cary, NC
[3] Daniel, C., Wood, F.S., 1971. Fitting Equations to Data. Wiley, New York. · Zbl 0264.65011
[4] Davies, P. L., Aspects of robust linear regression, Ann. Stat., 21, 1843-1899, (1993) · Zbl 0797.62026
[5] Davies, P. L.; Gather, U., Breakdown and groups with discussion and rejoinder, Ann. Stat., 33, 977-1035, (2005) · Zbl 1077.62041
[6] Donoho, D. L., Breakdown properties of multivariate location estimators, (1982), Boston
[7] Donoho, D. L.; Huber, P. J.; Bickel, P. J. (ed.); Doksum, K. (ed.); Hodges, J. L.J. (ed.), The notion of breakdown point, 157-184, (1983), Belmont
[8] Gather, U.; Hilker, T., A note on Tyler’s modification of the MAD for the Stahel-Donoho estimator, Ann. Stat., 25, 2024-2026, (1997) · Zbl 0881.62033
[9] Hadi, A. S.; Simonoff, J. S., Procedures for the identification of multiple outliers in linear models, J. Am. Stat. Assoc., 88, 1264-1272, (1993)
[10] Hampel, F. R., Contributions to the theory of robust estimation, (1968), Berkeley
[11] Hampel, F. R., A general qualitative definition of robustness, Ann. Math. Stat., 42, 1887-1896, (1971) · Zbl 0229.62041
[12] Hampel, F. R., Beyond location parameters: robust concepts and methods (with discussion), Bull. Inst. Int. Stat., 46, 375-391, (1975) · Zbl 0349.62029
[13] Hampel, F.R., Ronchetti, E.M., Rousseeuw, P.J., Stahel, W.A., 1986. Robust Statistics: The Approach Based on Influence Functions. Wiley, New York.
[14] Hekimoglu, S., Finite sample breakdown points of outlier detection procedures, ASCE J. Surv. Eng., 123, 15-31, (1997)
[15] Hekimoglu, S., Do robust methods identify outliers more reliably than conventional test for outlier?, Zeitschrift für Vermessungswesen, 3, 174-180, (2005)
[16] Hekimoglu, S., Koch, K.R., 1999. How Can Reliability of the Robust Methods Be Measured? In: Altan, M.O., Gründig, L. (Eds.), Third Turkish-German Joint Geodetic Days, 1:179-196.
[17] Hekimoglu, S.; Erenoglu, R. C., Estimation of parameters for linear regression using median estimator, 26, (2005), Finland · Zbl 1148.86305
[18] Hekimoglu, S.; Erenoglu, R. C., Effect of heteroscedasticity and heterogeneousness on outlier detection for geodetic networks, J. Geod., 81, 137-148, (2007) · Zbl 1148.86305
[19] Huber, P.J., 1981. Robust Statistics. John Wiley and Sons, New York. · Zbl 0536.62025
[20] Kamgar-Parsi, B.; Netanyahu, N. S., A nonparametric method for fitting a straight line to a noisy image, IEEE Trans. Pattern Anal. Mach. Intell., 11, 998-1001, (1989)
[21] Lopuhaa, H. P.; Rousseeuw, P. J., Breakdown points of affine equivariant estimators of multivariate location and covariance matrices, Ann. Stat., 19, 229-248, (1991) · Zbl 0733.62058
[22] Rousseeuw, P. J., Least median of squares regression, J. Am. Stat. Assoc., 79, 871-880, (1984) · Zbl 0547.62046
[23] Rousseeuw, P. J.; Grossman, W. (ed.); Pflug, G. (ed.); Vincze, I. (ed.); Werz, W. (ed.), Multivariate estimation with high breakdown point, 283-297, (1985), Dordrecht
[24] Rousseeuw, P.J., Leroy, A.M., 1987. Robust Regression and Outlier Detection. John Wiley and Sons, New York. · Zbl 0711.62030
[25] Sen, P. K., Estimates of the regression coefficient based on Kendall’s tau, J. Am. Stat. Assoc., 63, 1379-1389, (1968) · Zbl 0167.47202
[26] Shevlyakov, G.L., Vilchevski, N.O., 2001. Robustness in Data Analysis: Criteria and Methods. VSP International Science Publishers, Utrecht.
[27] Siegel, A. F., Robust regression using repeated medians, Biometrika, 69, 242-244, (1982) · Zbl 0483.62026
[28] Stahel, W. A., Breakdown of covariance estimators, (1981), Zurich
[29] Staudte, R.G., Sheather, S.J., 1990. Robust Estimation and Testing. Wiley, New York. · Zbl 0706.62037
[30] Stromberg, A. J., Computing the exact least median of squares estimate and stability diagnostics in multiple linear regression, SIAM J. Sci. Comput., 14, 1289-1299, (1993) · Zbl 0788.65144
[31] Theil, H., A rank-invariant method of linear and polynomial regression analysis, Nederlandse Akademie Wetenschappen Series A, 53, 386-392, (1950) · Zbl 0036.21601
This reference list is based on information provided by the publisher or from digital mathematics libraries. Its items are heuristically matched to zbMATH identifiers and may contain data conversion errors. It attempts to reflect the references listed in the original paper as accurately as possible without claiming the completeness or perfect precision of the matching.