zbMATH — the first resource for mathematics

Spatial risk mapping for rare disease with hidden Markov fields and variational EM. (English) Zbl 1288.62158
Summary: Current risk mapping models for pooled data focus on the estimated risk for each geographical unit. A risk classification, that is, a grouping of geographical units with similar risk, is then necessary to draw easily interpretable maps, with clearly delimited zones in which protection measures can be applied. As an illustration, we focus on Bovine Spongiform Encephalopathy (BSE), a disease that threatened bovine production in Europe and led to drastic cow culling. This example features typical animal disease risk analysis issues: very low risk values, small numbers of observed cases and small population sizes, all of which increase the difficulty of an automatic classification.
We propose to handle this task in a spatial clustering framework using a non-standard discrete hidden Markov model prior designed to favor a smooth risk variation. The model parameters are estimated using an EM algorithm and a mean field approximation for which we develop a new initialization strategy appropriate for spatial Poisson mixtures. Using both simulated and our BSE data, we show that our strategy performs well in dealing with low population sizes and accurately determines high risk regions, both in terms of localization and risk level estimation.
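
As a rough illustration of the kind of procedure described above, the following Python sketch runs an EM algorithm with a mean field approximation on a Poisson mixture whose class labels follow a Markov field prior. It is not the authors' implementation. It assumes a standard Potts prior with a fixed interaction parameter `beta` in place of the paper's non-standard prior, and it starts from user-supplied risk levels instead of the new initialization strategy. The function name `mean_field_em`, its arguments and the toy data (six regions on a line) are all hypothetical.

```python
import math


def mean_field_em(counts, pops, neighbors, risks, beta=0.5, n_iter=50):
    """Mean-field EM sketch for a spatial Poisson mixture.

    Assumed model: counts[i] | class k ~ Poisson(pops[i] * risks[k]),
    with a standard Potts prior on classes (beta is fixed, not estimated).
    """
    n, K = len(counts), len(risks)
    risks = list(risks)
    # q[i][k] approximates the posterior probability that region i is in class k.
    q = [[1.0 / K] * K for _ in range(n)]
    for _ in range(n_iter):
        # E-step: update each region in turn, given its neighbors' current q.
        for i in range(n):
            logw = []
            for k in range(K):
                # Poisson log-likelihood, dropping terms that do not depend on k.
                loglik = counts[i] * math.log(risks[k]) - pops[i] * risks[k]
                # Potts term: reward classes that the neighbors also favor.
                prior = beta * sum(q[j][k] for j in neighbors[i])
                logw.append(loglik + prior)
            top = max(logw)  # subtract the max for numerical stability
            w = [math.exp(v - top) for v in logw]
            total = sum(w)
            q[i] = [v / total for v in w]
        # M-step: class risk = weighted cases / weighted population.
        for k in range(K):
            num = sum(q[i][k] * counts[i] for i in range(n))
            den = sum(q[i][k] * pops[i] for i in range(n))
            if den > 0:
                risks[k] = max(num / den, 1e-12)  # keep log(risk) finite
    labels = [max(range(K), key=lambda k: q[i][k]) for i in range(n)]
    return risks, q, labels


# Toy data (hypothetical): six regions on a line, the last three at higher risk.
counts = [0, 1, 0, 9, 11, 10]
pops = [1000] * 6
neighbors = [[1], [0, 2], [1, 3], [2, 4], [3, 5], [4]]
risks, q, labels = mean_field_em(counts, pops, neighbors, risks=[0.001, 0.01])
print(labels, risks)
```

Dropping the terms of the Poisson log-likelihood that do not depend on the class keeps the E-step cheap, and the M-step has a closed form because each class risk is a ratio of weighted cases to weighted population. With counts this small the result can depend strongly on the starting values, which is the difficulty the initialization strategy mentioned in the summary is meant to address.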

62P10 Applications of statistics to biology and medical sciences; meta analysis
92C50 Medical applications (general)
65C60 Computational problems in statistics (MSC2010)
62M05 Markov processes: estimation; hidden Markov models
[1] Abrial, D., Calavas, D., Jarrige, N. and Ducrot, C. (2005a). Poultry, pig and the risk of BSE following the feed ban in France - A spatial analysis. Vet. Res. 36 615-628.
[2] Abrial, D., Calavas, D., Jarrige, N. and Ducrot, C. (2005b). Spatial heterogeneity of the risk of BSE in France following the ban of meat and bone meal in cattle feed. Prev. Vet. Med. 67 69-82.
[3] Alfó, M., Nieddu, L. and Vicari, D. (2009). Finite mixture models for mapping spatially dependent disease counts. Biom. J. 51 84-97.
[4] Allepuz, A., Lopez-Quilez, A., Forte, A., Fernandez, G. and Casal, J. (2007). Spatial analysis of bovine spongiform encephalopathy in Galicia, Spain (2002-2005). Prev. Vet. Med. 79 174-185.
[5] Besag, J., York, J. and Mollié, A. (1991). Bayesian image restoration, with two applications in spatial statistics. Ann. Inst. Statist. Math. 43 1-59. · Zbl 0760.62029
[6] Biernacki, C. (2004). Initializing EM using the properties of its trajectories in Gaussian mixtures. Stat. Comput. 14 267-279.
[7] Biernacki, C., Celeux, G. and Govaert, G. (2003). Choosing starting values for the EM algorithm for getting the highest likelihood in multivariate Gaussian mixture models. Comput. Statist. Data Anal. 41 561-575. · Zbl 1429.62235
[8] Böhning, D., Dietz, E. and Schlattmann, P. (2000). Space-time mixture modelling of public health data. Stat. Med. 19 2333-2344.
[9] Celeux, G., Forbes, F. and Peyrard, N. (2003). EM procedures using mean field-like approximations for Markov model-based image segmentation. Pattern Recognition 36 131-144. · Zbl 1010.68158
[10] Clayton, D. and Bernardinelli, L. (1992). Bayesian methods for mapping disease risk. In Geographical and Environmental Epidemiology: Methods for Small Area Studies (P. Elliott, J. Cuzick, D. English and R. Stern, eds.) 205-220. Oxford Univ. Press, Oxford.
[11] Dempster, A. P., Laird, N. M. and Rubin, D. B. (1977). Maximum likelihood from incomplete data via the EM algorithm. J. R. Stat. Soc. Ser. B Stat. Methodol. 39 1-38. · Zbl 0364.62022
[12] Dice, L. R. (1945). Measures of the amount of ecologic association between species. Ecology 26 297-302.
[13] Fernández, C. and Green, P. J. (2002). Modelling spatially correlated data via mixtures: A Bayesian approach. J. R. Stat. Soc. Ser. B Stat. Methodol. 64 805-826. · Zbl 1067.62029
[14] Forbes, F. and Peyrard, N. (2003). Hidden Markov model selection based on mean field like approximations. IEEE Trans. on Pattern Analysis and Machine Intelligence 25 1089-1101.
[15] Forbes, F., Charras-Garrido, M., Azizi, L., Doyle, S. and Abrial, D. (2013). Supplement to “Spatial risk mapping for rare disease with hidden Markov fields and variational EM.” · Zbl 1288.62158
[16] Fraley, C. and Raftery, A. E. (2007). Bayesian regularization for normal mixture estimation and model-based clustering. J. Classification 24 155-181. · Zbl 1159.62302
[17] Green, P. J. and Richardson, S. (2002). Hidden Markov models and disease mapping. J. Amer. Statist. Assoc. 97 1055-1070. · Zbl 1046.62117
[18] Hossain, M. M. and Lawson, A. B. (2010). Space-time Bayesian small area disease risk models: Development and evaluation with a focus on cluster detection. Environ. Ecol. Stat. 17 73-95.
[19] Karlis, D. and Xekalaki, E. (2003). Choosing initial values for the EM algorithm for finite mixtures. Comput. Statist. Data Anal. 41 577-590. · Zbl 1429.62082
[20] Knorr-Held, L. and Rasser, G. (2000). Bayesian detection of clusters and discontinuities in disease maps. Biometrics 56 13-21. · Zbl 1060.62629
[21] Knorr-Held, L., Raßer, G. and Becker, N. (2002). Disease mapping of stage-specific cancer incidence data. Biometrics 58 492-501. · Zbl 1210.62173
[22] Knorr-Held, L. and Richardson, S. (2003). A hierarchical model for space-time surveillance data on meningococcal disease incidence. J. R. Stat. Soc. Ser. C. Appl. Stat. 52 169-183. · Zbl 1111.62347
[23] Kulldorff, M. (1997). A spatial scan statistic. Comm. Statist. Theory Methods 26 1481-1496. · Zbl 0920.62116
[24] Kulldorff, M. and Information Management Services Inc. (2009). SaTScan™ v8.0: Software for the spatial and space-time scan statistics. Available at .
[25] Kulldorff, M., Huang, L., Pickle, L. and Duczmal, L. (2006). An elliptic spatial scan statistic. Stat. Med. 25 3929-3943.
[26] Lawson, A. B. and Song, H.-R. (2010). Bayesian hierarchical modeling of the dynamics of spatio-temporal influenza season outbreaks. Spat. Spatiotemporal Epidemiol. 1 187-195.
[27] Lawson, A. B., Biggeri, A. B., Boehning, D., Lesaffre, E., Viel, J. F., Clark, A., Schlattmann, P. and Divino, F. (2000). Disease mapping models: An empirical evaluation. Disease Mapping Collaborative Group. Stat. Med. 19 2217-2241.
[28] MacNab, Y. C. (2011). On Gaussian Markov random fields and Bayesian disease mapping. Stat. Methods Med. Res. 20 49-68.
[29] McLachlan, G. and Peel, D. (2000). Finite Mixture Models. Wiley, New York. · Zbl 0963.62061
[30] Mollié, A. (1996). Bayesian mapping of disease. In Markov Chain Monte Carlo in Practice (W. Gilks, S. Richardson and D. J. Spiegelhalter, eds.) 359-379. Chapman & Hall, London. · Zbl 0849.62060
[31] Mollié, A. (1999). Bayesian and empirical Bayes approaches to disease mapping. In Disease Mapping and Risk Assessment for Public Health (A. Lawson, A. Biggeri and D. Bohning, eds.) 15-29. Wiley, New York. · Zbl 1072.62665
[32] Mollié, A. and Richardson, S. (1991). Empirical Bayes estimates of cancer mortality rates using spatial models. Stat. Med. 10 95-112.
[33] Pascutto, C., Wakefield, J. C., Best, N. G., Richardson, S., Bernardinelli, L., Staines, A. and Elliott, P. (2000). Statistical issues in the analysis of disease mapping data. Stat. Med. 19 2493-2519.
[34] Paul, M., Abrial, D., Jarrige, N., Rican, S., Garrido, M., Calavas, D. and Ducrot, C. (2007). Bovine spongiform encephalopathy and spatial analysis of the feed industry. Emerging Infectious Diseases 13 867-872.
[35] Richardson, S., Monfort, C., Green, M., Draper, G. and Muirhead, C. (1995). Spatial variation of natural radiation and childhood leukaemia incidence in Great Britain. Stat. Med. 14 2487-2501.
[36] Robertson, C., Nelson, T. A., MacNab, Y. C. and Lawson, A. B. (2010). Review of methods for space-time disease surveillance. Spat. Spatiotemporal Epidemiol. 1 105-116.
[37] Schlattmann, P. and Böhning, D. (1993). Mixture models and disease mapping. Stat. Med. 12 1943-1950.
This reference list is based on information provided by the publisher or from digital mathematics libraries. Its items are heuristically matched to zbMATH identifiers and may contain data conversion errors. It attempts to reflect the references listed in the original paper as accurately as possible without claiming the completeness or perfect precision of the matching.