Real-time estimation of COVID-19 infections: deconvolution and sensor fusion. (English) Zbl 07535200

Summary: We propose, implement, and evaluate a method to estimate the daily number of new symptomatic COVID-19 infections, at the level of individual U.S. counties, by deconvolving daily reported COVID-19 case counts using an estimated symptom-onset-to-case-report delay distribution. Importantly, we focus on estimating infections in real time (rather than retrospectively), which poses numerous challenges. To address these, we develop new methodology for both the distribution estimation and deconvolution steps, and we employ a sensor fusion layer (which fuses predictions from models trained to track infections based on auxiliary surveillance streams) to improve accuracy and stability.
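The deconvolution idea in the summary can be illustrated with a minimal sketch. It assumes reported cases y_t are modeled as sum_k p_k x_{t-k}, where x is the unknown infection series and p is the onset-to-report delay distribution. The function names `convolution_matrix` and `deconvolve`, the penalty weight `lam`, and the toy delay distribution are all hypothetical. The quadratic second-difference penalty is a simple stand-in chosen for brevity, not the estimator developed in the paper.

```python
import numpy as np

def convolution_matrix(delay_pmf, n):
    """Build the n x n lower-triangular matrix C with C[t, s] = delay_pmf[t - s]."""
    C = np.zeros((n, n))
    for k, p in enumerate(delay_pmf):
        C += p * np.eye(n, k=-k)  # k-th subdiagonal carries the lag-k delay probability
    return C

def deconvolve(cases, delay_pmf, lam=1.0):
    """Estimate infections x from cases y ~ C x by penalized least squares.

    Minimizes ||y - C x||^2 + lam * ||D x||^2, where D takes second differences.
    (Illustrative smoother only; an assumption, not the paper's method.)
    """
    cases = np.asarray(cases, dtype=float)
    n = len(cases)
    C = convolution_matrix(delay_pmf, n)
    D = np.diff(np.eye(n), n=2, axis=0)      # second-difference operator, shape (n-2, n)
    A = C.T @ C + lam * (D.T @ D)            # normal equations of the penalized problem
    x = np.linalg.solve(A, C.T @ cases)
    return np.clip(x, 0.0, None)             # infection counts cannot be negative

# Toy example: a smooth epidemic wave observed through a short reporting delay.
t = np.arange(60)
x_true = 100.0 * np.exp(-0.5 * ((t - 30) / 8.0) ** 2)
delay_pmf = np.array([0.5, 0.3, 0.2])        # hypothetical delay distribution
cases = convolution_matrix(delay_pmf, len(t)) @ x_true
x_hat = deconvolve(cases, delay_pmf, lam=1e-4)
```

In this noise-free toy setting a small `lam` recovers `x_true` closely. With noisy counts a larger `lam` trades fidelity for smoothness. The most recent days are the hardest to estimate because few of their cases have been reported yet, which is one reason real-time estimation is harder than retrospective estimation.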


62-XX Statistics


GitHub; EpiEstim

