
Data transformation technique to improve the outlier detection power of Grubbs’ test for data expected to follow linear relation. (English) Zbl 1435.62078

Summary: Grubbs’ test (extreme studentized deviate test, maximum normed residual test) is used in various fields to identify outliers in a data set whose values are ranked in the order \(x_1 \leq x_2 \leq x_3 \leq \cdots \leq x_n\) \((i = 1, 2, 3, \dots, n)\). However, ranking the data discards the actual sequence of a data series, which is an important factor for determining outliers in some cases (e.g., time series). Thus, in such a data set, Grubbs’ test will not identify outliers correctly. This paper introduces a technique for transforming data from a sequence-bound linear form to a sequence-unbound form \((y = c)\). Applying Grubbs’ test to the transformed data set detects outliers more accurately. In addition, the new technique improves the outlier detection capability of Grubbs’ test. Results show that Grubbs’ test was capable of identifying outliers at significance level 0.01 after transformation, whereas it was unable to identify them before transformation at significance level 0.05.
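
The idea in the summary can be illustrated with a minimal sketch. The summary does not state the paper's actual transformation, so the code below substitutes an assumed stand-in: it subtracts an ordinary least-squares line, which turns a linear series into values scattered around a constant. The helper names `grubbs_statistic` and `detrend` and the sample data are hypothetical. Only the statistic \(G = \max_i |x_i - \bar{x}| / s\) is computed; comparing it with a tabulated critical value for the chosen significance level is left out.

```python
import statistics


def grubbs_statistic(values):
    """Return (G, index), where G = max |v - mean| / s and s is the sample SD."""
    mean = statistics.fmean(values)
    sd = statistics.stdev(values)
    deviations = [abs(v - mean) for v in values]
    idx = max(range(len(values)), key=deviations.__getitem__)
    return deviations[idx] / sd, idx


def detrend(xs, ys):
    """Residuals of ys about the least-squares line through (xs, ys).

    Assumption: this is a stand-in for the paper's transformation to the
    form y = c, not the authors' own procedure.
    """
    mx, my = statistics.fmean(xs), statistics.fmean(ys)
    slope = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sum(
        (x - mx) ** 2 for x in xs
    )
    intercept = my - slope * mx
    return [y - (intercept + slope * x) for x, y in zip(xs, ys)]


# Hypothetical data: y = 2x for x = 1..10, with the fifth point shifted by +3.
xs = list(range(1, 11))
ys = [2.0 * x for x in xs]
ys[4] += 3.0

g_raw, i_raw = grubbs_statistic(ys)                 # test on the raw series
g_flat, i_flat = grubbs_statistic(detrend(xs, ys))  # test after flattening

print(f"raw:       G = {g_raw:.3f} at index {i_raw}")
print(f"detrended: G = {g_flat:.3f} at index {i_flat}")
```

On the raw series the most extreme value is simply an end point of the trend, so the shifted point is not the candidate outlier. After flattening, the shifted point has the largest deviation and a larger statistic.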

MSC:

62F03 Parametric hypothesis testing
62F35 Robustness and adaptive procedures (parametric inference)
