zbMATH — the first resource for mathematics

A novel spatial outlier detection technique. (English) Zbl 06865484
Summary: Spatial outliers are spatially referenced objects whose nonspatial attribute values differ significantly from the corresponding values in their spatial neighborhoods. In other words, a spatial outlier is a local instability, or an extreme observation, that deviates significantly within its spatial neighborhood but not necessarily within the entire dataset. In this article, we propose a novel spatial outlier detection algorithm, the location quotient (LQ) algorithm, for spatial datasets with multiple attributes, and compare its performance with that of the well-known mean and median algorithms for such datasets from the literature. In particular, we apply the mean, median, and LQ algorithms to a real dataset and to simulated spatial datasets of 13 different sizes. In addition, we calculate the area under the curve (AUC) in all cases, which shows that the proposed algorithm is more powerful than the mean and median algorithms in almost all the cases considered, and we plot receiver operating characteristic (ROC) curves for some cases.
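The summary does not define the LQ algorithm, so it is not reproduced here. As a rough illustration of the general idea only, the following Python sketch scores each location by how far its attribute vector lies from the median (or mean) of its spatial neighbors, and then computes the area under the ROC curve from those scores. Everything in it is an assumption made for illustration: the function names `neighborhood_scores` and `auc`, the use of k nearest neighbors as the spatial neighborhood, and the Mahalanobis-type scoring. It should not be read as the authors' implementation, which the record indicates was done in R with ROCR.

```python
import numpy as np


def neighborhood_scores(coords, attrs, k=4, center=np.median):
    """Score each site by the deviation of its attributes from its neighbors.

    coords: (n, d) spatial coordinates; attrs: (n, q) nonspatial attributes.
    The k-nearest-neighbor neighborhood is an illustrative assumption.
    """
    coords = np.asarray(coords, dtype=float)
    attrs = np.asarray(attrs, dtype=float)

    # Pairwise Euclidean distances; a site is never its own neighbor.
    diff = coords[:, None, :] - coords[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=2))
    np.fill_diagonal(dist, np.inf)
    nbrs = np.argsort(dist, axis=1)[:, :k]          # (n, k) neighbor indices

    # Neighborhood summary (median by default) and the local difference.
    g = center(attrs[nbrs], axis=1)                 # (n, q)
    h = attrs - g

    # Squared Mahalanobis distance of each difference vector.
    d = h - h.mean(axis=0)
    cov = np.atleast_2d(np.cov(h, rowvar=False))
    inv = np.linalg.pinv(cov)                       # pinv tolerates singular cov
    return np.einsum("ij,jk,ik->i", d, inv, d)


def auc(scores, labels):
    """Area under the ROC curve as the fraction of (outlier, non-outlier)
    pairs ranked correctly, counting ties as one half."""
    scores = np.asarray(scores, dtype=float)
    labels = np.asarray(labels, dtype=bool)
    pos, neg = scores[labels], scores[~labels]
    greater = (pos[:, None] > neg[None, :]).sum()
    ties = (pos[:, None] == neg[None, :]).sum()
    return (greater + 0.5 * ties) / (len(pos) * len(neg))


if __name__ == "__main__":
    # Hypothetical toy data: a smooth 5x5 grid with one injected local outlier.
    xs, ys = np.meshgrid(np.arange(5), np.arange(5))
    coords = np.column_stack([xs.ravel(), ys.ravel()])
    attrs = np.column_stack([coords[:, 0] + coords[:, 1],
                             coords[:, 0] - coords[:, 1]]).astype(float)
    attrs[12] += [10.0, -10.0]
    truth = np.zeros(len(coords), dtype=bool)
    truth[12] = True

    scores = neighborhood_scores(coords, attrs, k=4)
    print("highest-scoring site:", int(np.argmax(scores)))
    print("AUC:", auc(scores, truth))
```

Passing `center=np.mean` gives a mean-based variant of the same sketch. The median is less affected when an outlier sits inside a neighbor's neighborhood, which is one reason both variants are commonly compared.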
MSC:
62H11 Directional data; spatial statistics
62H15 Hypothesis testing in multivariate analysis
Software:
R; ROCR
References:
[1] Anselin, L. 1988. Spatial econometrics: Methods and models. Studies in Operational Regional Science. Netherlands: Springer.
[2] Barnett, V., and T. Lewis. 1994. Outliers in statistical data. New York: John Wiley. · Zbl 0801.62001
[3] Chandola, V., A. Banerjee, and V. Kumar. 2009. Anomaly detection: A survey. ACM Computing Surveys 41 (3):1-58.
[4] Chen, D., C.-T. Lu, Y. Kou, and F. Chen. 2008. On detecting spatial outliers. Geoinformatica 12:455-75.
[5] Fawcett, T. 2006. An introduction to ROC analysis. Pattern Recognition Letters 27:861-74.
[6] Hanley, J. A., and B. J. McNeil. 1982. The meaning and use of the area under a receiver operating characteristic (ROC) curve. Radiology 143 (1):29-36.
[7] Hawkins, D. 1980. Identification of outliers. London: Chapman and Hall. · Zbl 0438.62022
[8] Miller, M. M., L. J. Gibson, and N. G. Wright. 1991. Location quotient: A basic tool for economic development analysis. Economic Development Review 9 (2):65-8.
[9] Klosterman, R. E., R. K. Brail, and E. G. Bossard. 1993. Spreadsheet models for urban and regional analysis. New Brunswick, NJ: Centre for Urban Policy Research.
[10] Lu, C. T., D. Chen, and Y. Kou. 2003. Detecting spatial outliers with multiple attributes. In Proceedings of the 15th international conference on tools with artificial intelligence, ed. Bob Werner, 122-8. Sacramento, California, United States.
[11] McClish, D. K. 1989. Analyzing a portion of the ROC curve. Medical Decision Making 9:190-5.
[12] Ministry of Home Affairs. http://mha.nic.in
[13] R Core Team. 2016. R: A language and environment for statistical computing. R Foundation for Statistical Computing, Vienna, Austria. URL https://www.R-project.org.
[14] Shekhar, S., C. T. Lu, and P. Zhang. 2003. A unified approach to detecting spatial outliers. Geoinformatica 7 (2):139-66.
[15] Sing, T., O. Sander, N. Beerenwinkel, and T. Lengauer. 2005. ROCR: Visualizing classifier performance in R. Bioinformatics 21 (20):3940-1.
[16] Song, X., J. Wang, W. Huang, L. Liu, G. Yan, and R. Pu. 2009. The delineation of agricultural management zones with high resolution remotely sensed data. Precision Agriculture 10:471-87.
[17] Su, Peter Chu. 2011. Statistical geocomputing: Spatial outlier detection in precision agriculture. Canada: University of Waterloo.
[18] Worboys, M. F. 1995. GIS: A computing perspective. London: Taylor and Francis.
This reference list is based on information provided by the publisher or from digital mathematics libraries. Its items are heuristically matched to zbMATH identifiers and may contain data conversion errors. It attempts to reflect the references listed in the original paper as accurately as possible without claiming the completeness or perfect precision of the matching.