
Seven principles for rapid-response data science: lessons learned from COVID-19 forecasting. (English) Zbl 07535203

Summary: In this article, we take a step back to distill seven principles out of our experience in the spring of 2020, when our 12-person rapid-response team used skills of data science and beyond to help distribute 340,000+ units of Covid PPE. This process included tapping into domain knowledge of epidemiology and medical logistics chains, curating a relevant data repository, developing models for short-term county-level death forecasting in the US, and building a website for sharing visualization (an automated AI machine). The principles are described in the context of working with Response4Life, a then-new nonprofit organization, to illustrate their necessity. Many of these principles overlap with those in standard data-science teams, but an emphasis is put on dealing with problems that require rapid response, often resembling agile software development. The technical work from this rapid response project resulted in a paper [N. Altieri et al., “Curating a COVID-19 data repository and forecasting county-level death counts in the United States”, Harvard Data Sci. Rev., Spec. Issue 1, 82 p. (2021; doi:10.1162/99608f92.1d4e0dae)]; see also this interview for more background [B. Yu and X.-L. Meng, “An interview with Bin Yu”, Harvard Data Sci. Rev., Spec. Issue 1, 12 p. (2021), https://hdsr.mitpress.mit.edu/pub/5pe5xcvb].

MSC:

62-XX Statistics

References:

[1] ALTIERI, N., BARTER, R. L., DUNCAN, J., DWIVEDI, R., KUMBIER, K., LI, X., NETZORG, R., PARK, B., SINGH, C. et al. (2021). Curating a Covid-19 data repository and forecasting county-level death counts in the United States. Harvard Data Science Review. https://hdsr.mitpress.mit.edu/pub/p6isyf0g. · doi:10.1162/99608f92.1d4e0dae
[2] COCKBURN, A. and HIGHSMITH, J. (2001). Agile software development, the people factor. Computer 34 131-133.
[3] Pedregosa, F., Varoquaux, G., Gramfort, A. et al. (2011). Scikit-learn: Machine learning in Python. J. Mach. Learn. Res. 12 2825-2830. · Zbl 1280.68189
[4] SCHULLER, G. D., YU, B., HUANG, D. and EDLER, B. (2002). Perceptual audio coding using adaptive pre-and post-filters and lossless compression. IEEE Trans. Speech Audio Process. 10 379-390.
[5] SEABOLD, S. and PERKTOLD, J. (2010). Statsmodels: Econometric and statistical modeling with python. In Proceedings of the 9th Python in Science Conference, Austin, TX 57 61.
[6] SINGH, C., NASSERI, K., TAN, Y. S., TANG, T. and YU, B. (2021). imodels: a python package for fitting interpretable models. Journal of Open Source Software 6 3192. · doi:10.21105/joss.03192
[7] Vovk, V., Gammerman, A. and Shafer, G. (2005). Algorithmic Learning in a Random World. Springer, New York. · Zbl 1105.68052
[8] YU, B. and MENG, X.-L. (2021). An interview with Bin Yu. Harvard Data Science Review
This reference list is based on information provided by the publisher or from digital mathematics libraries. Its items are heuristically matched to zbMATH identifiers and may contain data conversion errors. In some cases that data have been complemented/enhanced by data from zbMATH Open. This attempts to reflect the references listed in the original paper as accurately as possible without claiming completeness or a perfect matching.