
Some results on the spatial breakdown point of robust point estimates of the variogram. (English) Zbl 1172.86003

Summary: The effect of outliers on estimates of the variogram depends on how they are distributed in space. The ‘spatial breakdown point’ is the largest proportion of observations which can be drawn from some arbitrary contaminating process without destroying a robust variogram estimator, when they are arranged in the most damaging spatial pattern. A numerical method is presented to find the spatial breakdown point for any sample array in two dimensions or more. It is shown by means of some examples that such a numerical approach is needed to determine the spatial breakdown point for two or more dimensions, even on a regular square sample grid, since previous conjectures about the spatial breakdown point in two dimensions do not hold. The ‘average spatial breakdown point’ has been used as a basis for practical guidelines on the intensity of contaminating processes that can be tolerated by robust variogram estimators. It is the largest proportion of contaminating observations in a data set such that the breakdown point of the variance estimator used to obtain point estimates of the variogram is not exceeded by the expected proportion of contaminated pairs of observations over any lag. In this paper the behaviour of the average spatial breakdown point is investigated for cases where the contaminating process is spatially dependent. It is shown that in two dimensions the average spatial breakdown point is 0.25. Finally, the ‘empirical spatial breakdown point’, a tool for the exploratory analysis of spatial data thought to contain outliers, is introduced and demonstrated using data on metal content in the soils of Sheffield, England. The empirical spatial breakdown point of a particular data set can be used to indicate whether the distribution of possible contaminants is likely to undermine a robust variogram estimator.
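The dependence on spatial arrangement described in the summary can be illustrated with a short sketch. It is not the numerical method of the paper. It only counts, for a regular grid and one lag, the proportion of pairs that contain at least one contaminated observation, and compares this with an assumed breakdown point of 0.5 for the underlying scale estimator (the largest value such an estimator can have). The function names, the 8 by 8 grid and the two contamination patterns are hypothetical choices made for illustration.

```python
import numpy as np


def contaminated_pair_fraction(mask, lag):
    """Proportion of pairs at `lag` with at least one contaminated endpoint.

    mask : 2-D boolean array, True where an observation is contaminated.
    lag  : (dr, dc) non-negative integer offsets in rows and columns.
    """
    dr, dc = lag
    n_rows, n_cols = mask.shape
    if dr < 0 or dc < 0 or dr >= n_rows or dc >= n_cols:
        raise ValueError("lag must be non-negative and smaller than the grid")
    head = mask[: n_rows - dr, : n_cols - dc]  # first point of each pair
    tail = mask[dr:, dc:]                      # second point of each pair
    return float(np.mean(head | tail))


def independent_pair_fraction(p):
    """Expected proportion of contaminated pairs when each observation is
    contaminated independently with probability p (illustrative assumption)."""
    return 1.0 - (1.0 - p) ** 2


def max_independent_proportion(breakdown=0.5):
    """Largest p with independent_pair_fraction(p) <= breakdown."""
    return 1.0 - (1.0 - breakdown) ** 0.5


# Two hypothetical patterns, each contaminating 16 of 64 sites (25%).
clustered = np.zeros((8, 8), dtype=bool)
clustered[:4, :4] = True        # one compact 4 x 4 block

dispersed = np.zeros((8, 8), dtype=bool)
dispersed[::2, ::2] = True      # every second site in every second row

for name, mask in [("clustered", clustered), ("dispersed", dispersed)]:
    frac = contaminated_pair_fraction(mask, (0, 1))
    print(name, "proportion of contaminated pairs at lag (0, 1):", round(frac, 3))

print("independent contamination, p = 0.25:", independent_pair_fraction(0.25))
print("largest independent p for breakdown 0.5:",
      round(max_independent_proportion(0.5), 3))
```

In this sketch the same 25% of contaminated sites gives a much smaller proportion of contaminated pairs at lag (0, 1) when the sites are clustered than when they are dispersed, where the proportion reaches the assumed 0.5 threshold. Under independent contamination the expected proportion of contaminated pairs is 1 - (1 - p)^2, which is simple probability arithmetic rather than a result taken from the paper.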

MSC:

86A32 Geostatistics
62F10 Point estimation

Software:

GenStat

References:

[1] Aarts E, Korst J (1989) Simulated annealing and Boltzmann machines–a stochastic approach to combinatorial optimization and neural computing. Wiley, New York, 200 p · Zbl 0674.90059
[2] Baird DB (2007) MPOLISH. In: Payne RW (ed) Genstat procedure library PL18. VSN International, Hemel Hempstead
[3] Chilès J-P, Delfiner P (1999) Geostatistics, modeling spatial uncertainty. Wiley, New York, 695 p · Zbl 0922.62098
[4] Cressie N (1986) Kriging non-stationary data. J Am Stat Assoc 81(2):625–634 · Zbl 0625.62086 · doi:10.2307/2288990
[5] Cressie NAC (1993) Statistics for Spatial Data, revised edn. Wiley, New York, 900 p
[6] Cressie N, Hawkins D (1980) Robust estimation of the variogram. J Int Assoc Math Geol 12(2):115–125 · doi:10.1007/BF01035243
[7] Genton MG (1998a) Highly robust variogram estimation. Math Geol 30(2):213–221 · Zbl 0970.86002 · doi:10.1023/A:1021728614555
[8] Genton MG (1998b) Spatial breakdown point of variogram estimators. Math Geol 30(7):853–871 · Zbl 0970.86009 · doi:10.1023/A:1021778626251
[9] Hampel FR, Ronchetti EM, Rousseeuw PJ, Stahel WA (1986) Robust statistics: the approach based on influence functions. Wiley, New York, 502 p
[10] Journel AG, Huijbregts CJ (1978) Mining geostatistics. Academic Press, London, 600 p
[11] Journel AG, Posa D (1990) Characteristic behavior and order relations for indicator variograms. Math Geol 22(8):1011–1025 · Zbl 0964.86503 · doi:10.1007/BF00890121
[12] Kirkpatrick S, Gelatt CD, Vecchi MP (1983) Optimization by simulated annealing. Science 220:671–680 · Zbl 1225.90162 · doi:10.1126/science.220.4598.671
[13] Lark RM (2000) A comparison of some robust estimators of the variogram for use in soil survey. Eur J Soil Sci 51(1):137–157 · doi:10.1046/j.1365-2389.2000.00280.x
[14] Lark RM, Papritz A (2003) Fitting a linear model of coregionalization for soil properties using simulated annealing. Geoderma 115(3–4):245–260 · doi:10.1016/S0016-7061(03)00065-X
[15] Marchant BP, Lark RM (2007) Robust estimation of the variogram by residual maximum likelihood. Geoderma 140(1–2):62–72 · doi:10.1016/j.geoderma.2007.03.005
[16] Matheron G (1962) Traité de géostatistique appliquée, Tome 1. Mémoires du Bureau de Recherches Géologiques et Minières, Paris, 334 p
[17] Omre H (1984) The variogram and its estimation. In: Verly G, David M, Journel AG, Marechal A (eds) Geostatistics for natural resources characterization, Part 1. Reidel, Dordrecht, pp 107–125
[18] Rawlins BG, Lark RM, O’Donnell KE, Tye A, Lister TR (2005) The assessment of point and diffuse soil pollution from an urban geochemical survey of Sheffield, England. Soil Use Manag 21(4):353–362 · doi:10.1079/SUM2005335
[19] Rawlins BG, Lark RM, Webster R, O’Donnell KE (2006) Historic metal deposition from atmospheric smelter emissions on Humberside, UK: 1. Magnitude and extent of contamination based on soil survey data. Environ Pollut 143(3):416–426 · doi:10.1016/j.envpol.2005.12.010
[20] Rousseeuw PJ (1985) Multivariate estimation with high breakdown point. In: Grossmann W, Pflug G, Vincze I, Wertz W (eds) Proceedings of the 4th Pannonian symposium on mathematical statistics and probability. Kluwer Academic, Dordrecht, pp 283–297
[21] Rousseeuw PJ, Croux C (1992) Explicit scale estimators with high breakdown point. In: Dodge Y (ed) L1 statistical analysis and related methods. North-Holland, Amsterdam, pp 77–92
[22] Rousseeuw PJ, Croux C (1993) Alternatives to the median absolute deviation. J Am Stat Assoc 88(424):1273–1283 · Zbl 0792.62025 · doi:10.2307/2291267
[23] Saby N, Arrouays D, Boulonne L, Jolivet C, Pochot A (2006) Geostatistical assessment of Pb in soil around Paris, France. Sci Total Environ 367(1):212–221 · doi:10.1016/j.scitotenv.2005.11.028
[24] Stein ML (1999) Interpolation of Spatial Data: Some Theory for Kriging. Springer, New York, 247 p · Zbl 0924.62100
[25] Warrick AW, Myers DE (1987) Optimization of sampling locations for variogram calculations. Water Resour Res 23(3):496–500 · doi:10.1029/WR023i003p00496