Epidemics can and should be forecast, to improve decision making by governments, institutions and individuals. The goal of the Delphi group at Carnegie Mellon University is to make epidemiological forecasting as universally accepted and useful as weather forecasting is today. I will describe the statistical forecasting system we developed, which has won several forecasting competitions, including the most recent flu prediction competition run by the CDC. It combines Bayesian and frequentist components and pays careful attention to the details of the surveillance signal.
A critical open problem in epi-forecasting is nowcasting – real-time estimation of epidemic intensity. Google search query fractions and other real-time sources have been used to try to nowcast epidemic intensity, with mixed results. Better results might be achievable by combining many such proxies: electronic health records, internet search queries, social media, relevant retail purchases, and online information-seeking behavior. However, these sources are correlated, dynamically changing and sporadic; they differ in geographic and temporal resolution and in reporting delay; and they come with varying amounts of historical training data. Conventional statistical estimation techniques are inadequate for dealing with these challenges, which has led to a variety of ad-hoc approaches. I will describe a novel method for nowcasting that is derived from sensor fusion theory. It combines an arbitrary number of heterogeneous and sporadic sources to produce real-time estimates. Unlike regression-based methods, it is robust to missing data and follows a disciplined approach to combining sources of variable geographic resolution. I will demonstrate our method on a large set of both traditional and novel digital flu surveillance sources, and provide a retrospective analysis of its performance, including ablation experiments showing its robustness. In the interest of continued research and improved preparedness, we have been publishing our nowcasts in real time on our website.
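To give a flavor of why a fusion-style estimator can tolerate sporadic sources, here is a minimal sketch of textbook inverse-variance weighting. It is not the Delphi group's method. It assumes each source gives an unbiased reading of the same quantity with a known error variance and that source errors are independent, which is a simplification given that the real sources are correlated. The function name `fuse` and the example numbers are hypothetical.

```python
import math
from typing import Optional, Sequence, Tuple


def fuse(readings: Sequence[Optional[float]],
         variances: Sequence[float]) -> Tuple[float, float]:
    """Inverse-variance weighted average of the sources that reported.

    Assumption (not from the abstract): errors are independent across
    sources and each variance is known. Missing readings (None or NaN)
    are skipped, so the estimate degrades gracefully rather than failing.
    Returns (fused estimate, variance of the fused estimate).
    """
    if len(readings) != len(variances):
        raise ValueError("readings and variances must have the same length")

    weight_sum = 0.0
    weighted_total = 0.0
    for value, variance in zip(readings, variances):
        if value is None or math.isnan(value):
            continue  # source did not report this week
        if variance <= 0:
            raise ValueError("variances must be positive")
        weight = 1.0 / variance
        weight_sum += weight
        weighted_total += weight * value

    if weight_sum == 0.0:
        raise ValueError("no source reported a value")
    return weighted_total / weight_sum, 1.0 / weight_sum


# Hypothetical week: three proxy sources, the second one missing.
estimate, variance = fuse([2.0, None, 4.0], [1.0, 0.5, 1.0])
print(estimate, variance)  # 3.0 0.5
```

When the second source does report, it simply enters the weighted average with its own weight; nothing has to be refit, which is the property that a regression on a fixed set of inputs lacks.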
If there is time and interest, I will also discuss a second forecasting system we developed, based on the "wisdom of crowds" method, which has proven equally competitive.
Joint work with many people in the Delphi research group, including Ryan Tibshirani, David Farrow, Logan Brooks, Sangwon Hyun, and Aaron Rumack.