×

Scalable Bayesian inference for the inverse temperature of a hidden Potts model. (English) Zbl 1437.62324

Summary: The inverse temperature parameter of the Potts model governs the strength of spatial cohesion and therefore has a major influence over the resulting model fit. A difficulty arises from the dependence of an intractable normalising constant on the value of this parameter and thus there is no closed-form solution for sampling from the posterior distribution directly. There is a variety of computational approaches for sampling from the posterior without evaluating the normalising constant, including the exchange algorithm and approximate Bayesian computation (ABC). A serious drawback of these algorithms is that they do not scale well for models with a large state space, such as images with a million or more pixels. We introduce a parametric surrogate model, which approximates the score function using an integral curve. Our surrogate model incorporates known properties of the likelihood, such as heteroskedasticity and critical temperature. We demonstrate this method using synthetic data as well as remotely-sensed imagery from the Landsat-8 satellite. We achieve up to a hundredfold improvement in the elapsed runtime, compared to the exchange algorithm or ABC. An open-source implementation of our algorithm is available in the R package bayesImageS.

MSC:

62M05 Markov processes: estimation; hidden Markov models
62M40 Random fields; image analysis
62F15 Bayesian inference
62H35 Image analysis in multivariate analysis
PDF BibTeX XML Cite
Full Text: DOI arXiv Euclid

References:

[1] Alston, C. L., Mengersen, K. L., Robert, C. P., Thompson, J. M., Littlefield, P. J., Perry, D., and Ball, A. J. (2007). “Bayesian mixture models in a longitudinal setting for analysing sheep CAT scan images.” Computational Statistics & Data Analysis, 51: 4282-4296. · Zbl 1162.62428
[2] Andrieu, C. and Thoms, J. (2008). “A tutorial on adaptive MCMC.” Statistics and Computing, 18(4): 343-373.
[3] Baxter, R. J. (1973). “Potts model at the critical temperature.” Journal of Physics. C. Solid State Physics, 6(23): L445. · Zbl 0799.60098
[4] Besag, J. (1974). “Spatial interaction and the statistical analysis of lattice systems.” Journal of the Royal Statistical Society. Series B. Methodological, 36(2): 192-236. · Zbl 0327.60067
[5] Boland, A., Friel, N., and Maire, F. (2017). “Efficient MCMC for Gibbs Random Fields using pre-computation.” arXiv preprint arXiv:1710.04093 [stat.CO]. URL https://arxiv.org/abs/1710.04093. · Zbl 1433.62086
[6] Carpenter, B., Gelman, A., Hoffman, M., Lee, D., Goodrich, B., Betancourt, M., Brubaker, M., Guo, J., Li, P., and Riddell, A. (2017). “Stan: A Probabilistic Programming Language.” Journal of Statistical Software, 76(1): 1-32.
[7] Christen, J. A. and Fox, C. (2005). “Markov chain Monte Carlo Using an Approximation.” Journal of Computational and Graphical Statistics, 14(4): 795-810.
[8] Conrad, P. R., Marzouk, Y. M., Pillai, N. S., and Smith, A. (2016). “Accelerating Asymptotically Exact MCMC for Computationally Intensive Models via Local Approximations.” Journal of the American Statistical Association, 111(516): 1591-1607.
[9] Cook, S. R., Gelman, A., and Rubin, D. B. (2006). “Validation of Software for Bayesian Models Using Posterior Quantiles.” Journal of Computational and Graphical Statistics, 15(3): 675-692.
[10] Cooper, C. and Frieze, A. M. (1999). “Mixing properties of the Swendsen-Wang process on classes of graphs.” Random Structures & Algorithms, 15(3-4): 242-261. · Zbl 0945.82003
[11] Cucala, L. and Marin, J.-M. (2013). “Bayesian Inference on a Mixture Model With Spatial Dependence.” Journal of Computational and Graphical Statistics, 22(3): 584-597.
[12] Cucala, L., Marin, J.-M., Robert, C. P., and Titterington, D. M. (2009). “A Bayesian Reassessment of Nearest-Neighbor Classification.” Journal of the American Statistical Association, 104(485): 263-273. · Zbl 1388.62183
[13] Drovandi, C. C., Moores, M. T., and Boys, R. J. (2018). “Accelerating pseudo-marginal MCMC using Gaussian processes.” Computational Statistics & Data Analysis, 118: 1-17. · Zbl 1469.62057
[14] Drovandi, C. C., Pettitt, A. N., and Faddy, M. J. (2011). “Approximate Bayesian computation using indirect inference.” Journal of the Royal Statistical Society. Series C, Applied Statistics, 60(3): 317-337.
[15] Drovandi, C. C., Pettitt, A. N., and Lee, A. (2015). “Bayesian indirect inference using a parametric auxiliary model.” Statistical Science, 30(1): 72-95. · Zbl 1332.62088
[16] Eddelbuettel, D. and Sanderson, C. (2014). “RcppArmadillo: Accelerating R with high-performance C++ linear algebra.” Computational Statistics & Data Analysis, 71: 1054-63. · Zbl 1471.62055
[17] Everitt, R. G. (2012). “Bayesian Parameter Estimation for Latent Markov Random Fields and Social Networks.” Journal of Computational and Graphical Statistics, 21(4): 940-960.
[18] Feng, D. (2008). “Bayesian hidden Markov normal mixture models with application to MRI tissue classification.” Ph.D. thesis, University of Iowa.
[19] Feng, D. and Tierney, L. (2011). PottsUtils: Utility Functions of the Potts Models. R package version 0.2-2. URL http://CRAN.R-project.org/package=PottsUtils
[20] Flood, N. (2014). “Continuity of Reflectance Data between Landsat-7 ETM+ and Landsat-8 OLI, for Both Top-of-Atmosphere and Surface Reflectance: A Study in the Australian Landscape.” Remote Sensing, 6(9): 7952-7970.
[21] Friel, N. and Rue, H. (2007). “Recursive computing and simulation-free inference for general factorizable models.” Biometrika, 94(3): 661-672. · Zbl 1135.62078
[22] Garthwaite, P. H., Fan, Y., and Sisson, S. A. (2015). “Adaptive optimal scaling of Metropolis-Hastings algorithms using the Robbins-Monro process.” Communications in Statistics. Theory and Methods, 45(17): 5098-5111. · Zbl 1397.65019
[23] Gelman, A. (2017). “Correction to Cook, Gelman, and Rubin (2006).” Journal of Computational and Graphical Statistics, 26: 940.
[24] Geman, S. and Geman, D. (1984). “Stochastic relaxation, Gibbs distributions and the Bayesian restoration of images.” IEEE Transactions on Pattern Analysis and Machine Intelligence, 6: 721-41. · Zbl 0573.62030
[25] Geweke, J. (2004). “Getting it Right: Joint Distribution Tests of Posterior Simulators.” Journal of the American Statistical Association, 99: 799-804. · Zbl 1117.62344
[26] Green, P. J. and Richardson, S. (2002). “Hidden Markov models and disease mapping.” Journal of the American Statistical Association, 97: 1055-1070. · Zbl 1046.62117
[27] Grelaud, A., Robert, C. P., Marin, J.-M., Rodolphe, F., and Taly, J.-F. (2009). “ABC likelihood-free methods for model choice in Gibbs random fields.” Bayesian Analysis, 4(2): 317-336. · Zbl 1330.62126
[28] Gutmann, M. U. and Corander, J. (2016). “Bayesian Optimization for Likelihood-free Inference of Simulator-based Statistical Models.” Journal of Machine Learning Research, 17(1): 4256-4302. · Zbl 1392.62072
[29] Henderson, V., Storeygard, A., and Weil, D. N. (2011). “A Bright Idea for Measuring Economic Growth.” The American Economic Review, 101(3): 194-199.
[30] Huang, K. (2010). Introduction to Statistical Physics. Boca Raton: Chapman & Hall/CRC Press, 2nd edition.
[31] Huber, M. L. (2003). “A bounding chain for Swendsen-Wang.” Random Structures & Algorithms, 22(1): 43-59. · Zbl 1013.60084
[32] Huber, M. L. (2016). Perfect Simulation, volume 148 of Monographs on Statistics & Applied Probability. Boca Raton, FL: Chapman & Hall/CRC Press. · Zbl 1343.00022
[33] Järvenpää, M., Gutmann, M., Vehtari, A., and Marttinen, P. (2018). “Gaussian process modeling in approximate Bayesian computation to estimate horizontal gene transfer in bacteria.” The Annals of Applied Statistics, 12(4): 2228-2251. · Zbl 1411.62320
[34] Lee, A. and Łatuszyński, K. (2014). “Variance bounding and geometric ergodicity of Markov chain Monte Carlo kernels for approximate Bayesian computation.” Biometrika, 101(3): 655-671. · Zbl 1334.60149
[35] Li, S. Z. (2009). Markov Random Field Modeling in Image Analysis. Dordrecht: Springer, 3rd edition. · Zbl 1183.68691
[36] Liang, F. (2010). “A double Metropolis Hastings sampler for spatial models with intractable normalizing constants.” Journal of Statistical Computation and Simulation, 80(9): 1007-1022. · Zbl 1233.62117
[37] Liang, F., Jin, I. H., Song, Q., and Liu, J. S. (2016). “An Adaptive Exchange Algorithm for Sampling from Distributions with Intractable Normalizing Constants.” Journal of the American Statistical Association, 111(513): 377-393.
[38] Lyne, A.-M., Girolami, M., Atchadé, Y., Strathmann, H., and Simpson, D. (2015). “On Russian Roulette Estimates for Bayesian Inference with Doubly-Intractable Likelihoods.” Statistical Science, 30(4): 443-467. · Zbl 1426.62092
[39] McClain, C. R. (2009). “A Decade of Satellite Ocean Color Observations.” Annual Review of Marine Science, 1: 19-42.
[40] McGrory, C. A., Pettitt, A. N., Reeves, R., Griffin, M., and Dwyer, M. (2012). “Variational Bayes and the Reduced Dependence Approximation for the Autologistic Model on an Irregular Grid With Applications.” Journal of Computational and Graphical Statistics, 21(3): 781-796.
[41] McGrory, C. A., Titterington, D., Reeves, R., and Pettitt, A. N. (2009). “Variational Bayes for estimating the parameters of a hidden Potts model.” Statistics and Computing, 19(3): 329-340.
[42] Meeds, E. and Welling, M. (2014). “GPS-ABC: Gaussian Process Surrogate Approximate Bayesian Computation.” In Proc. 30th Conf. UAI, 593-602. Quebec City, Canada: AUAI Press.
[43] Minvielle, P., Doucet, A., Marrs, A., and Maskell, S. (2010). “A Bayesian approach to joint tracking and identification of geometric shapes in video sequences.” Image and Vision Computing, 28(1): 111-123.
[44] Mira, A., Møller, J., and Roberts, G. O. (2001). “Perfect slice samplers.” Journal of the Royal Statistical Society. Series B. Methodological, 63(3): 593-606. · Zbl 0993.65015
[45] Møller, J., Pettitt, A. N., Reeves, R., and Berthelsen, K. K. (2006). “An Efficient Markov Chain Monte Carlo Method for Distributions with Intractable Normalising Constants.” Biometrika, 93(2): 451-458. · Zbl 1158.62020
[46] Monahan, J. F. and Boos, D. D. (1992). “Proper Likelihoods for Bayesian Analysis.” Biometrika, 79(2): 271-278. · Zbl 0751.62012
[47] Moores, M. T., Drovandi, C. C., Mengersen, K., and Robert, C. P. (2015). “Pre-processing for approximate Bayesian computation in image analysis.” Statistics and Computing, 25(1): 23-33. · Zbl 1331.62158
[48] Moores, M. T. and Mengersen, K. (2014). “Bayesian approaches to spatial inference: modelling and computational challenges and solutions.” AIP Conference Proceedings, 1636: 112-117.
[49] Moores, M. T. and Mengersen, K. (2018). “bayesImageS: Bayesian methods for image segmentation using a Potts model.” R package version 0.5-3. URL https://CRAN.R-project.org/package=bayesImageS
[50] Moores, M. T., Nicholls, G. K., Pettitt, A. N., and Mengersen, K. (2018). Supplements to “Scalable Bayesian Inference for the Inverse Temperature of a Hidden Potts Model.” doi: https://doi.org/10.1214/18-BA1130SUPPA, https://doi.org/10.1214/18-BA1130SUPPB.
[51] Murray, I., Ghahramani, Z., and MacKay, D. J. C. (2006). “MCMC for Doubly-intractable Distributions.” In Proc. 22nd Conf. UAI, 359-366. Arlington, VA: AUAI Press.
[52] NASA (2011). “Landsat 7 Science Data Users Handbook.” Technical report, National Aeronautics and Space Administration, Greenbelt, MD. URL http://landsathandbook.gsfc.nasa.gov/
[53] Neal, R. M. (2005). “Taking Bigger Metropolis Steps by Dragging Fast Variables.” arXiv preprint arXiv:math/0502099 [math.ST]. URL https://arxiv.org/abs/math/0502099
[54] Pickard, D. K. (1987). “Inference for Discrete Markov Fields: The Simplest Nontrivial Case.” Journal of the American Statistical Association, 82(397): 90-96. · Zbl 0621.62091
[55] Potts, R. B. (1952). “Some generalized order-disorder transformations.” Mathematical Proceedings of the Cambridge Philosophical Society, 48: 106-9. · Zbl 0048.45601
[56] Prangle, D. (2016). “Lazy ABC.” Statistics and Computing, 26(1): 171-185. · Zbl 1342.62036
[57] Prangle, D., Blum, M. G. B., Popovic, G., and Sisson, S. A. (2014). “Diagnostic tools for approximate Bayesian computation using the coverage property.” Australian & New Zealand Journal of Statistics, 56(4): 309-329. · Zbl 1335.62058
[58] Propp, J. G. and Wilson, D. B. (1996). “Exact sampling with coupled Markov chains and applications to statistical mechanics.” Random Structures & Algorithms, 9(1-2): 223-252. · Zbl 0859.60067
[59] Pudlo, P., Marin, J.-M., Estoup, A., Cornuet, J.-M., Gautier, M., and Robert, C. P. (2016). “Reliable ABC model choice via random forests.” Bioinformatics, 32(6): 859-866.
[60] Reeves, R. and Pettitt, A. N. (2004). “Efficient Recursions for General Factorisable Models.” Biometrika, 91(3): 751-757. · Zbl 1162.62310
[61] Richards, F. J. (1959). “A Flexible Growth Function for Empirical Use.” Journal of Experimental Botany, 10(2): 290-301.
[62] Roy, D. P., Kovalskyy, V., Zhang, H. K., Vermote, E. F., Yan, L., Kumar, S. S., and Egorov, A. (2016). “Characterization of Landsat 7 to Landsat 8 reflective wavelength and normalized difference vegetation index continuity.” Remote Sensing of Environment, 185: 57-70.
[63] Ryan, C. M., Drovandi, C. C., and Pettitt, A. N. (2016). “Optimal Bayesian Experimental Design for Models with Intractable Likelihoods Using Indirect Inference Applied to Biological Process Models.” Bayesian Analysis, 11(3): 857-883. · Zbl 1359.62321
[64] Rydén, T. and Titterington, D. M. (1998). “Computational Bayesian Analysis of Hidden Markov Models.” Journal of Computational and Graphical Statistics, 7(2): 194-211.
[65] Sherlock, C., Golightly, A., and Henderson, D. A. (2017). “Adaptive, delayed-acceptance MCMC for targets with expensive likelihoods.” Journal of Computational and Graphical Statistics, 26(2): 434-444.
[66] Simoncelli, E. P. (1999). “Bayesian Multi-Scale Differential Optical Flow.” In Jähne, B., Haussecker, H., and Geissler, P. (eds.), Handbook of computer vision and applications, volume 2, chapter 14, 397-422. San Diego: Academic Press.
[67] Small, C. (2001). “Estimation of urban vegetation abundance by spectral mixture analysis.” International Journal of Remote Sensing, 22(7): 1305-1334.
[68] Strathmann, H., Sejdinovic, D., Livingstone, S., Szabo, Z., and Gretton, A. (2015). “Gradient-free Hamiltonian Monte Carlo with Efficient Kernel Exponential Families.” In Advances in Neural Information Processing Systems, volume 28, 955-963.
[69] Swendsen, R. H. and Wang, J.-S. (1987). “Nonuniversal critical dynamics in Monte Carlo simulations.” Physical Review Letters, 58: 86-88.
[70] Talts, S., Betancourt, M., Simpson, D., Vehtari, A., and Gelman, A. (2018). “Validating Bayesian Inference Algorithms with Simulation-Based Calibration.” arXiv preprint arXiv:1804.06788 [stat.ME]. URL https://arxiv.org/abs/1804.06788
[71] Tucker, C. J. (1979). “Red and photographic infrared linear combinations for monitoring vegetation.” Remote Sensing of Environment, 8(2): 127-150.
[72] USGS (2016). “Landsat 8 Data Users Handbook.” Technical Report LSDS-1574, United States Geological Survey, Sioux Falls, SD. Version 2.0. URL https://landsat.usgs.gov/landsat-8-l8-data-users-handbook
[73] Vermote, E., Justice, C., Claverie, M., and Franch, B. (2016). “Preliminary analysis of the performance of the Landsat 8/OLI land surface reflectance product.” Remote Sensing of Environment, 185: 46-56.
[74] Wilkinson, R. D. (2014). “Accelerating ABC methods using Gaussian processes.” In Proc. 17th Int. Conf. AISTATS, volume 33 of JMLR W&CP, 1015-1023. Reykjavik, Iceland: MIT Press.
[75] Winkler, G. (2003). Image Analysis, Random Fields and Markov chain Monte Carlo Methods: A Mathematical Introduction. Berlin Heidelberg: Springer-Verlag, 2nd edition. · Zbl 1008.68147
[76] Zhang, C., Shahbaba, B., and Zhao, H. (2017). “Precomputing Strategy for Hamiltonian Monte Carlo Method Based on Regularity in Parameter Space.” Computational Statistics, 32(1): 253-279. · Zbl 1417.65018
This reference list is based on information provided by the publisher or from digital mathematics libraries. Its items are heuristically matched to zbMATH identifiers and may contain data conversion errors. It attempts to reflect the references listed in the original paper as accurately as possible without claiming the completeness or perfect precision of the matching.