Bayesian spectral modeling for multivariate spatial distributions of elemental concentrations in soil. (English) Zbl 1462.62709

Summary: Recent technological advances have enabled researchers in a variety of fields to collect accurately geocoded data for several variables simultaneously. In many cases it may be most appropriate to jointly model these multivariate spatial processes without constraints on their conditional relationships. When data have been collected on a regular lattice, multivariate conditionally autoregressive (MCAR) models are a common choice. However, inference from these MCAR models relies heavily on the pre-specified neighborhood structure and often assumes a separable covariance structure. Here, we present a multivariate spatial model based on a spectral analysis approach that enables inference on the conditional relationships between the variables. The model does not rely on a pre-specified neighborhood structure, is non-separable, and is computationally efficient. Covariance and cross-covariance functions are defined in the spectral domain to obtain computational efficiency. The resulting pseudo-posterior inference on the correlation matrix allows for quantification of the conditional dependencies. A comparison is made with an MCAR model that is shown to be highly sensitive to the choice of neighborhood. The approaches are illustrated for the toxic element arsenic and four other soil elements whose relative concentrations were measured on a microscale spatial lattice. Understanding conditional relationships between arsenic and other soil elements provides insights for mitigating pervasive arsenic poisoning in drinking water in southern Asia and elsewhere.


62P12 Applications of statistics to environmental and related topics
62H11 Directional data; spatial statistics
62F15 Bayesian inference


BayesDA; spBayes


[1] Apanasovich, T. V., Genton, M. G., and Sun, Y. (2012). “A valid Matérn class of cross-covariance functions for multivariate random fields with any number of components.” Journal of the American Statistical Association, 107(497): 180-193. · Zbl 1261.62087 · doi:10.1080/01621459.2011.643197
[2] Banerjee, S., Carlin, B. P., and Gelfand, A. E. (2014). Hierarchical Modeling and Analysis for Spatial Data. CRC Press. · Zbl 1358.62009
[3] Banerjee, S., Gelfand, A. E., Finley, A. O., and Sang, H. (2008). “Gaussian predictive process models for large spatial data sets.” Journal of the Royal Statistical Society: Series B (Statistical Methodology), 70(4): 825-848. · Zbl 05563371
[4] Barnard, J., McCulloch, R., and Meng, X.-L. (2000). “Modeling covariance matrices in terms of standard deviations and correlations, with application to shrinkage.” Statistica Sinica, 10(4): 1281-1312. · Zbl 0980.62045
[5] Besag, J. (1974). “Spatial interaction and the statistical analysis of lattice systems.” Journal of the Royal Statistical Society: Series B (Statistical Methodology), 36(2): 192-236. · Zbl 0327.60067
[6] Bloomfield, P. (2004). Fourier Analysis of Time Series: An Introduction. John Wiley & Sons. · Zbl 0994.62093
[7] Borch, T., Kretzschmar, R., Kappler, A., Van Cappellen, P., Ginder-Vogel, M., Voegelin, A., and Campbell, K. (2009). “Biogeochemical redox processes and their impact on contaminant dynamics.” Environmental Science & Technology, 44(1): 15-23.
[8] Cressie, N. and Wikle, C. K. (2011). Statistics for Spatio-Temporal Data. John Wiley & Sons. · Zbl 1273.62017
[9] Dahlhaus, R. (1983). “Spectral analysis with tapered data.” Journal of Time Series Analysis, 4(3): 163-175. · Zbl 0552.62068
[10] Dahlhaus, R. and Künsch, H. (1987). “Edge effects and efficient parameter estimation for stationary random fields.” Biometrika, 74(4): 877-882. · Zbl 0633.62094
[11] Datta, A., Banerjee, S., Finley, A. O., and Gelfand, A. E. (2014). “Hierarchical nearest-neighbor Gaussian process models for large geostatistical datasets.” arXiv:1406.7343.
[12] Fuentes, M. (2002). “Spectral methods for nonstationary spatial processes.” Biometrika, 89(1): 197-210. · Zbl 0997.62073
[13] Fuentes, M. (2007). “Approximate likelihood for large irregularly spaced spatial data.” Journal of the American Statistical Association, 102(477): 321-331. · Zbl 1284.62589
[14] Gelfand, A. E. and Vounatsou, P. (2003). “Proper multivariate conditional autoregressive models for spatial data analysis.” Biostatistics, 4(1): 11-15. · Zbl 1142.62393
[15] Gelman, A., Carlin, J. B., Stern, H. S., and Rubin, D. B. (2014). Bayesian Data Analysis, volume 2. Taylor & Francis. · Zbl 1279.62004
[16] Guinness, J. and Fuentes, M. (2016). “Circulant embedding of approximate covariances for inference from Gaussian data on large lattices.” Journal of Computational and Graphical Statistics. · doi:10.1080/10618600.2016.1164534
[17] Guinness, J., Fuentes, M., Hesterberg, D., and Polizzotto, M. (2014). “Multivariate spatial modeling of conditional dependence in microscale soil elemental composition data.” Spatial Statistics, 9: 93-108.
[18] Guyon, X. (1982). “Parameter estimation for a stationary process on a d-dimensional lattice.” Biometrika, 69(1): 95-105. · Zbl 0485.62107
[19] Handcock, M. S. and Stein, M. L. (1993). “A Bayesian analysis of kriging.” Technometrics, 35(4): 403-410.
[20] Hoff, P. D. (2007). “Extending the rank likelihood for semiparametric copula estimation.” The Annals of Applied Statistics, 1(1): 265-283. · Zbl 1129.62050
[21] Hoff, P. D. (2010). A First Course in Bayesian Statistical Methods. New York: Springer. · Zbl 1213.62044
[22] Jiang, J.-Q., Ashekuzzaman, S., Jiang, A., Sharifuzzaman, S., and Chowdhury, S. R. (2012). “Arsenic contaminated groundwater and its treatment options in Bangladesh.” International Journal of Environmental Research and Public Health, 10(1): 18-46.
[23] Jin, X., Banerjee, S., and Carlin, B. P. (2007). “Order-free co-regionalized areal data models with application to multiple-disease mapping.” Journal of the Royal Statistical Society: Series B (Statistical Methodology), 69(5): 817-838. · Zbl 07555376
[24] Jin, X., Carlin, B. P., and Banerjee, S. (2005). “Generalized hierarchical multivariate CAR models for areal data.” Biometrics, 61(4): 950-961. · Zbl 1087.62127
[25] Kim, H.-M., Mallick, B. K., and Holmes, C. (2005). “Analyzing nonstationary spatial data using piecewise Gaussian processes.” Journal of the American Statistical Association, 100(470): 653-668. · Zbl 1117.62368
[26] Komárek, M., Vaněk, A., and Ettler, V. (2013). “Chemical stabilization of metals and arsenic in contaminated soils using oxides – a review.” Environmental Pollution, 172: 9-22.
[27] Koopmans, L. H. (1995). The Spectral Analysis of Time Series. Academic Press. · Zbl 0289.62056
[28] Lindgren, F., Rue, H., and Lindström, J. (2011). “An explicit link between Gaussian fields and Gaussian Markov random fields: the stochastic partial differential equation approach.” Journal of the Royal Statistical Society: Series B (Statistical Methodology), 73(4): 423-498. · Zbl 1274.62360
[29] Manceau, A., Marcus, M., and Lenoir, T. (2014). “Estimating the number of pure chemical components in a mixture by X-ray absorption spectroscopy.” Journal of Synchrotron Radiation, 21(5): 1140-1147.
[30] Mardia, K. (1988). “Multi-dimensional multivariate Gaussian Markov random fields with application to image processing.” Journal of Multivariate Analysis, 24(2): 265-284. · Zbl 0637.60065
[31] Polizzotto, M. L., Kocar, B. D., Benner, S. G., Sampson, M., and Fendorf, S. (2008). “Near-surface wetland sediments as a source of arsenic release to ground water in Asia.” Nature, 454(7203): 505-508.
[32] Priestley, M. B. (1981). Spectral Analysis and Time Series, Vol. 2, Academic Press. · Zbl 0537.62075
[33] Ravenscroft, P., Brammer, H., and Richards, K. (2009). Arsenic Pollution: A Global Synthesis, volume 28. John Wiley & Sons.
[34] Reich, B. J. and Fuentes, M. (2012). “Nonparametric Bayesian models for a spatial covariance.” Statistical Methodology, 9(1): 265-274. · Zbl 1248.62170
[35] Ritter, C. and Tanner, M. A. (1992). “Facilitating the Gibbs sampler: the Gibbs stopper and the griddy-Gibbs sampler.” Journal of the American Statistical Association, 87(419): 861-868.
[36] Sain, S. R. and Cressie, N. (2007). “A spatial model for multivariate lattice data.” Journal of Econometrics, 140(1): 226-259. · Zbl 1418.62368
[37] Sain, S. R., Furrer, R., Cressie, N., et al. (2011). “A spatial analysis of multivariate output from regional climate models.” The Annals of Applied Statistics, 5(1): 150-175. · Zbl 1220.62152
[38] Sang, H. and Huang, J. Z. (2012). “A full scale approximation of covariance functions for large spatial data sets.” Journal of the Royal Statistical Society: Series B (Statistical Methodology), 74(1): 111-132. · Zbl 1411.62274
[39] Spiegelhalter, D. J., Best, N. G., Carlin, B. P., and Linde, A. (2014). “The deviance information criterion: 12 years on.” Journal of the Royal Statistical Society: Series B (Statistical Methodology), 76(3): 485-493. · Zbl 1411.62027
[40] Stein, M. L. (1999). Interpolation of Spatial Data: Some Theory for Kriging. Springer Science & Business Media. · Zbl 0924.62100
[41] Stein, M. L. (2014). “Limitations on low rank approximations for covariance matrices of spatial data.” Spatial Statistics, 8: 1-19.
[42] Stein, M. L., Chi, Z., and Welty, L. J. (2004). “Approximating likelihoods for large spatial data sets.” Journal of the Royal Statistical Society: Series B (Statistical Methodology), 66(2): 275-296. · Zbl 1062.62094
[43] Stroud, J. R., Stein, M. L., and Lysen, S. (2014). “Bayesian and Maximum Likelihood Estimation for Gaussian Processes on an Incomplete Lattice.” arXiv:1402.4281.
[44] Terres, M. A., Fuentes, M., Hesterberg, D., and Polizzotto, M. (2016). “Supplementary material of ‘Bayesian spectral modeling for multivariate spatial distributions of elemental concentrations in soil’.” Bayesian Analysis. · Zbl 1462.62709
[45] Tukey, J. W. (1967). “An introduction to the calculations of numerical spectrum analysis.” Spectral Analysis of Time Series, 25.
[46] Whittle, P. (1954). “On stationary processes in the plane.” Biometrika, 41(3/4): 434-449. · Zbl 0058.35601
[47] Yaglom, A. (1987). Correlation Theory of Stationary and Related Random Functions: Vol. 1: Basic Results. Springer-Verlag. · Zbl 0685.62078
[48] Zhang, H. (2004). “Inconsistent estimation and asymptotically equal interpolations in model-based geostatistics.” Journal of the American Statistical Association, 99(465): 250-261. · Zbl 1089.62538
[49] Zhang, Y., Hodges, J. S., and Banerjee, S. (2009). “Smoothed ANOVA with spatial effects as a competitor to MCAR in multivariate spatial smoothing.” The Annals of Applied Statistics, 3(4): 1805. · Zbl 1184.62126 · doi:10.1214/09-AOAS267
[50] Zimmerman, D. L. (1989). “Computationally exploitable structure of covariance matrices and generalized covariance matrices in spatial models.” Journal of Statistical Computation and Simulation, 32(1-2): 1-15. · Zbl 0726.62162