Interoperability of statistical models in pandemic preparedness: principles and reality. (English) Zbl 07535199

Summary: We present interoperability as a guiding framework for statistical modelling to assist policy makers asking multiple questions using diverse datasets in the face of an evolving pandemic response. Interoperability provides an important set of principles for future pandemic preparedness, through the joint design and deployment of adaptable systems of statistical models for disease surveillance using probabilistic reasoning. We illustrate this through case studies for inferring and characterising spatial-temporal prevalence and reproduction numbers of SARS-CoV-2 infections in England.


MSC: 62-XX Statistics


Software: Julia; Turing

