Learning from Data
Lessons from the Coronavirus Pandemic
The Coronavirus pandemic has drawn a level of media attention to data that is rarely seen. Data is pouring in from all parts of the world, and various models use that data to project likely outcomes in terms of cases, hospitalizations, and deaths. Government officials use those models to order social restrictions, dial those restrictions up and down, and avoid panic. A surprising amount of the information the media draws from the data and models is misleading. This provides a valuable lesson in how to use data and models in the business world. Each observation on the Coronavirus situation below is followed by the associated business lesson.
- Data on deaths is more reliable than data on cases.
- Choose reliable data sources, “garbage in, garbage out.”
- However, data on deaths is a lagging indicator.
- Are you seeking information on past performance or attempting to anticipate future conditions?
- It is probably less inaccurate to estimate cases from reported deaths and mortality rate assumptions than from reported/confirmed cases.
- Understand relationships between measures and be creative in identifying indirect ways to measure what is important.
- Authorities in some jurisdictions now report deaths by date of death rather than by date of report. This creates a highly misleading trend line, because deaths for the most recent week or more are understated until late reports arrive.
- Data that is known to be significantly incomplete should never be added to a trend line that begins with accurate data.
- Comparing countries (or states) on total cases or deaths should be done on a per capita basis.
- Frequency or seriousness of an event is dependent upon group size. Don’t get carried away by “big” numbers.
- Comparing states (or countries), even on a per capita basis, is not necessarily an indicator of how well leaders are managing the pandemic. For example, New York state is much more densely populated than Wyoming. Even without restrictions, Wyoming would have lower transmissibility than New York.
- Comparing performance across locations or units must take into account other variables that impact performance.
- The models being reported on differ less from one another than the assumptions they use. A simple model can be constructed in Excel using current data (chosen with the above points in mind), assumptions regarding transmissibility and mortality rates, and the herd immunity formula.
- Don’t put all your effort into model development when the quality of the inputs is more important.
- For example, the table below from an Excel file enables one to see how the virus will play out in the US under various assumptions.
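Two of the calculations in the list above, estimating cases from deaths and comparing regions per capita, can be sketched in a few lines of Python. This is only an illustration: the function names are hypothetical, and every number is invented for the example rather than taken from real reporting.

```python
def estimate_cases_from_deaths(deaths: float, mortality_rate: float) -> float:
    """Back out implied infections from deaths and an ASSUMED mortality rate."""
    if not 0 < mortality_rate <= 1:
        raise ValueError("mortality_rate must be a fraction in (0, 1]")
    return deaths / mortality_rate


def per_100k(count: float, population: float) -> float:
    """Express a raw count as a rate per 100,000 people."""
    if population <= 0:
        raise ValueError("population must be positive")
    return count * 100_000 / population


# Hypothetical inputs, for illustration only.
implied_cases = estimate_cases_from_deaths(deaths=1_000, mortality_rate=0.01)

# Region A has ten times the raw count of Region B, yet half the per capita rate.
region_a_rate = per_100k(count=5_000, population=10_000_000)
region_b_rate = per_100k(count=500, population=500_000)

print(f"Implied cases: {implied_cases:,.0f}")
print(f"Region A: {region_a_rate:.1f} per 100k, Region B: {region_b_rate:.1f} per 100k")
```

Because deaths lag infections, an estimate like this describes the infections that led to those deaths, not the number of people infected today.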
In the Excel model itself, the variables highlighted in yellow can be changed. For any assumption regarding the Mortality Rate, outcomes can be viewed across a range of assumptions regarding the R0 (transmissibility).
Single point forecasts from any model are dangerous. Models such as this enable visibility of a range of outcomes for critical variables across a range of assumptions. This enables you to prepare for the uncertainty that is the reality of our most challenging situations.
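The same idea can be sketched outside Excel. The Python below is not the Excel file described above. It is a minimal stand-in that assumes infections stop once the standard herd immunity threshold, 1 - 1/R0, is reached, which ignores overshoot, timing, and interventions. The population, mortality rate, and R0 values are placeholder assumptions to be replaced with your own.

```python
from typing import Dict, List


def herd_immunity_threshold(r0: float) -> float:
    """Share of the population that must be immune for spread to decline: 1 - 1/R0."""
    if r0 <= 1.0:
        return 0.0  # With R0 at or below 1, an outbreak does not sustain itself.
    return 1.0 - 1.0 / r0


def scenario_table(
    population: float, mortality_rate: float, r0_values: List[float]
) -> List[Dict[str, float]]:
    """Build one outcome row per R0 assumption for a fixed mortality rate assumption."""
    rows = []
    for r0 in r0_values:
        share = herd_immunity_threshold(r0)
        infections = population * share  # ASSUMPTION: spread stops at the threshold
        rows.append(
            {
                "r0": r0,
                "share_infected": share,
                "infections": infections,
                "deaths": infections * mortality_rate,
            }
        )
    return rows


if __name__ == "__main__":
    # Placeholder assumptions: a round population figure and an illustrative mortality rate.
    for row in scenario_table(330_000_000, 0.005, [1.5, 2.0, 2.5, 3.0]):
        print(
            f"R0={row['r0']:.1f}  infected={row['share_infected']:.0%}  "
            f"deaths={row['deaths']:,.0f}"
        )
```

Printing one row per R0 value, rather than a single number, keeps the range of outcomes in view. Changing the mortality rate argument and rerunning plays the same role as editing a highlighted cell.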
To request a copy of this Excel file, email email@example.com.