April 27, 2020

What's curve got to do with it?

If you have access to the internet and interact on any social media platform, we’re sure you’re well aware of how engaged Oklahomans are in responding to the impact Coronavirus is having on our state. We’re also eager to better understand the sometimes conflicting data interpretations that leave our heads spinning around.

We’ve been hearing a lot about flattening the curve, using data to make decisions about when to reopen Oklahoma, and the importance of consulting with public health experts to interpret scientific models. 

These words are music to our ears as Metriarch support staff, as we’ve been tracking how data is being harvested at the community level and made available to the public by state and local health departments. We salute the many government agencies that have been so committed to pulling this critical data together, and can’t even imagine what their day-to-day lives look like right now. They are rockstars.

Live footage from health departments across the state:

Interestingly, we’ve also noticed a whole lotta discrepancies in how people, including some public figures, have been interpreting the data, and we’ve had to take a few steps back to consider what we’re really looking at. If you’ve been confused about what the heck all of this data means, you’re not alone. Even those of us who work with data every day have had to take a second, or third, look as data models fly at us from every direction.

Making sense of the numbers

Lucky for everyone, most of the charts and graphs that are guiding policy decisions can be interpreted with simple statistics you probably picked up in high school. 

Did you fall asleep or skip class? We got you. The vast majority of data visualizations (AKA graphs) that track the trajectory (path) of the virus can be understood with just a tiny bit of unpacking. Below are some pointers to help you demystify the data that is determining the collective fate of Oklahomans.

Here's how to read the curve:

How can we use this curve to visualize data about Coronavirus? We can use a bell curve when we have all of the information we are measuring (real or projected) to illustrate a trend over time. When the trend line goes up, we’ve got more cases. When it goes down, we’ve got fewer cases. 

When the curve gets flat, we are still having new infections of the Coronavirus; the number of new cases has just stopped climbing. If it’s flat like a fried egg or a pancake, that’s a really good thing, because it means that prevalence plateaued at a relatively low case number, and stayed there. When it plateaus while it’s high, that means that the rate of transmission is still relatively high, just not climbing.

Additionally, curve flattening isn’t as straightforward as we would hope. There is a good chance that the curve is plateauing because of limited access to testing, and other data-collection challenges that complicate the accuracy of projections, rather than because of the actual prevalence of the virus. Leaving any mitigating factors to the side, the bell curve is supposed to tell us if the situation is getting better or worse.

Where things get magical (Or perhaps mathical?): If you have a string of data that is all collected the same way, and there are enough data points, you can make a projection, or predictive model. A bell curve can only be used to visualize predictive data when it uses math to ASSUME that certain variables will behave in certain ways, and can therefore fill in the blanks with the rest of the data. In the case of COVID-19, we assume that the public will, at minimum, follow social distancing orders to arrive at a flatter curve.

Can we predict the future?

A “prediction” is a forecast that can (and often will) change as new, real data is collected. The projections that health experts are using today are largely based on patterns that were seen in China, but we don’t know for sure how the virus will react in a new population.

As better quality data comes out (newer, more specific, more complete, more accurate, more numerous), predictive models in the United States will become more accurate. Better data increases the likelihood, or probability, that any given forecast turns out to be right, but predictions are never 100% accurate. Better data just makes it more likely that the forecast matches the outcome.

Here’s one way of thinking about it: Have you ever checked out a weather forecast several days ahead of time, prepared for your week of bad weather at predicted times, only to get blindsided by a torrential downpour when it was supposed to be sunny?

Maybe something unexpected happened with an impending storm? Maybe the source where you checked the weather had some funky models? More likely than those explanations, however, is that you checked the weather report too far ahead of time, so the prediction wasn’t quite right.  As the week progressed, new data would have come out that would have given a better forecast.

How predictive models tell data what to do:

These days, weather predictions are pretty consistent (we’ve had a lot of practice modelling the weather), but there can still be some variance among reporting authorities depending on which predictive models they use. This is true for all industries. The potential for different predictive models is especially relevant to COVID-19 forecasting, because different states are tracking different metrics.

All of the predictive models should be using similar assumptions and parameters pulled from what we’ve learned from cases around the world, and will become more sophisticated when we get newer, more local US data. 

Even as better data increases predictive power, there can still be profound differences in case predictions across the nation. If you’re wondering how this is possible, it’s time to reunite you with the predictive model that YOU created when you were in grade school:

You've done predictive modeling.

Remember Algebra? What about finding the “slope” of a line? Well, that, my friends, is a predictive tool that helped you create a predictive model with the data you had available, and the model would change when you added new X and Y data points. The equation that you wrote out predicted the rate of change over time.
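Want to see that grade-school model in action? Here is a minimal sketch in Python. The function names (fit_line, predict) and every number in it are made up for illustration; this is not real case data or anyone's official method. It fits a straight line to a few points, predicts the next one, and then shows how the slope shifts once a new data point shows up.

```python
def fit_line(points):
    """Least-squares slope and intercept for a list of (x, y) points."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def predict(points, x):
    """Use the fitted line to guess y at a new x."""
    slope, intercept = fit_line(points)
    return slope * x + intercept


# Hypothetical data: (day, count). Not real numbers.
points = [(1, 10), (2, 20), (3, 30)]
print(fit_line(points))    # slope 10.0, intercept 0.0
print(predict(points, 4))  # 40.0

# A new observation arrives that doesn't follow the old trend,
# so the fitted slope (and every prediction after it) changes.
points.append((4, 25))
print(fit_line(points))    # slope 5.5, intercept 7.5
print(predict(points, 5))  # 35.0
```

Same tool, new data, different forecast. That is the whole idea behind a prediction changing as real numbers come in.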

Slap a bunch of calculus and social science on top of that, and you’ve got yourself a more sophisticated predictive tool, or set of tools that becomes a “predictive model.” Is that an oversimplification? Maybe, but the devil is in the details, not the overarching concept of what a model actually is. 

The predictive model that lots of epidemiologists use to track virus spread is called the SEIR model. This modeling method tracks the flow of individuals through four stages: susceptible (S), exposed (E), infectious (I) and recovered (R). Oklahoma State Department of Health (OSDH) is tracking some of these metrics in their weekly briefs, but not all of the data that is needed for SEIR modeling is available.
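For the curious, here is a rough sketch of what a bare-bones SEIR model can look like in Python. It is a simplified, one-day-at-a-time version, and the rates (beta for contact, sigma for how fast exposed people become infectious, gamma for recovery) are placeholder assumptions chosen for illustration. They are not OSDH figures and not fitted to any real outbreak.

```python
def simulate_seir(population, initial_infectious, beta, sigma, gamma, days):
    """Step a simple SEIR model forward one day at a time.

    beta, sigma and gamma are assumed, illustrative daily rates.
    Returns a list of (S, E, I, R) tuples, one per day.
    """
    s = float(population - initial_infectious)
    e = 0.0
    i = float(initial_infectious)
    r = 0.0
    history = [(s, e, i, r)]
    for _ in range(days):
        new_exposed = beta * s * i / population  # S -> E
        new_infectious = sigma * e               # E -> I
        new_recovered = gamma * i                # I -> R
        s -= new_exposed
        e += new_exposed - new_infectious
        i += new_infectious - new_recovered
        r += new_recovered
        history.append((s, e, i, r))
    return history


# Two hypothetical scenarios that differ only in the contact rate.
more_contact = simulate_seir(1_000_000, 10, beta=0.5, sigma=0.2, gamma=0.1, days=300)
less_contact = simulate_seir(1_000_000, 10, beta=0.2, sigma=0.2, gamma=0.1, days=300)

peak_more = max(i for _, _, i, _ in more_contact)
peak_less = max(i for _, _, i, _ in less_contact)
print(round(peak_more), round(peak_less))  # the lower contact rate gives the lower peak
```

Notice that every one of those rates is an assumption somebody had to make. Change the assumptions and the curve changes with them, which is why missing data matters so much.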

This is understandable since they can’t model with data that doesn’t currently exist, and won’t exist until we have serological testing available (testing for past exposure). What is less understandable, however, is why some public officials are referencing snippets of data during their briefings when more comprehensive data exists.

Here’s a good question: How do you feel about others making predictions that will affect your life without using all of the information available? 

 

Now let's have some fun with probability and predictions!

Imagine you’re in the passenger seat of your thrill-seeking friend’s car. They hate math, and don’t think angle and speed have anything to do with auto wrecks. Fake news. 

As they increase the car speed while rounding a mountainside, you inch closer and closer (probability increasing!) to getting flung off the edge! Instinct kicks in, and you start to scream.

Your shrieking works. They slow down juuuust enough to keep you stuck to the road. Seeing the close call, your friend self-congratulates, “I’m a Pro! Good thing I avoided that cliff!” while you sit there thinking hard about your choice of friends.

What would happen if your friend suddenly felt feisty and decided to speed back up? Increase your probability of flying off a cliff? Maybe they only increased the speed a little? Would that be okay?  Do you really like this person enough to take a chance at a Thelma-and-Louise-style plunge off of the side of the cliff?

Our guess is, no. This person is no longer your friend, and you are getting out of the vehicle.  It’s too risky. You don’t want to gamble on your life with someone who doesn’t know the odds, someone who doesn’t know how likely it is that they could harm you.  

This might be a helpful analogy when you consider policies about reopening the Oklahoma economy if officials don’t understand the possible outcomes of their decisions.  

Pro Tip: To use this analogy one last time: you can replace “increase speed,” with “start socializing” in your mental model,  as these are the variables in this equation that increase the probability of harm.  

Having people socialize again is basically the same thing as pressing your foot on the gas again.  If the thing that is causing harm increases (speed/socializing), the predictive model will change and the curve will begin to turn up again. 
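To put some toy numbers behind the gas pedal, here is a small sketch in Python. It uses a stripped-down version of an epidemic model (it skips the "exposed" stage for simplicity), and the contact rates, recovery rate, and population are all hypothetical values picked for illustration only. The schedule below "drives fast" for 30 days, slows down for 30 days, then speeds back up.

```python
def infectious_over_time(contact_rates, recovery_rate=0.1,
                         population=1_000_000, initial_infectious=10):
    """Track the number of infectious people, one day per contact rate.

    A simplified model with assumed, illustrative parameters.
    """
    s = float(population - initial_infectious)
    i = float(initial_infectious)
    curve = [i]
    for beta in contact_rates:
        new_infections = beta * s * i / population
        recoveries = recovery_rate * i
        s -= new_infections
        i += new_infections - recoveries
        curve.append(i)
    return curve


# Hypothetical schedule: normal socializing, then distancing, then reopening.
schedule = [0.4] * 30 + [0.08] * 30 + [0.4] * 10
curve = infectious_over_time(schedule)

print(round(curve[30]))  # end of the "fast" stretch
print(round(curve[60]))  # after 30 days of slowing down: lower than day 30
print(round(curve[70]))  # after 10 days of speeding back up: higher than day 60
```

In this toy setup, easing off the gas bends the curve down, and pressing it again bends the curve right back up.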

Breaking Down Oklahoma Data

Now let’s think about predictions and probability as they relate to reopening Oklahoma. Do you think there is enough of a decrease shown in the data below to reopen? Why or why not? Let’s apply the logic outlined above as we analyze the data.

 


The chart below was captured from Governor Stitt's Twitter account on 4/22/20.

https://twitter.com/govstitt/status/1253162788823937030?s=21

The visualization below is interactive and based on data scraped from the Oklahoma State Department of Health website by the Coronadatascraper Project and members of the Metriarch support team. Click on your county to see what the trend looks like in your community, and click on any data point to learn about what we mean by “seven day total.”

Seven day totals for all Oklahoma counties

What does the trend look like for your county?  Do you see any peaks? A plateau? How defined is the peak? How steep is the curve? Does it look like an egg or a pancake? Use your data skills to analyze what you are looking at.

Pro Tip: There seem to be little dips and increases each day (a thing called periodicity), which is likely due to unpredictable or unpreventable factors, like when people seek treatment (weekend vs weekday) or how a hospital administration reports their records. Combining 7 days (one week) into each data point helps level this out some.
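Here is a quick sketch of how a seven day total can be computed, written in Python. The function name and the daily numbers are invented for illustration (they are not Oklahoma data), and the fake data is built with a deliberate weekend dip so you can see the leveling effect.

```python
def seven_day_totals(daily_cases, window=7):
    """Sum each run of `window` consecutive days.

    The first total covers days 1-7, the next covers days 2-8, and so on.
    """
    totals = []
    for end in range(window, len(daily_cases) + 1):
        totals.append(sum(daily_cases[end - window:end]))
    return totals


# Hypothetical reports: two identical weeks, with fewer cases reported
# on the last two days of each week (a weekend dip).
daily = [10, 12, 11, 13, 12, 4, 3] * 2

print(daily)                    # bounces up and down day to day
print(seven_day_totals(daily))  # [65, 65, 65, 65, 65, 65, 65, 65]
```

Because every window contains exactly one of each weekday, the weekend dip cancels out and what is left is the underlying trend (here, perfectly flat like a pancake).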

Now let’s play with some interactive Oklahoma data from your county. Flip horizontally if on a smart phone and click on your county to see rates from your neck of the woods:

Daily increases of coronavirus cases for all Oklahoma Counties

Hover over content in this dashboard to learn more about daily increases in total COVID-19 cases since March 24.

 

To view county-level numbers, click on a county in the map portion of the view to filter the dashboard to the selected county. To clear, click anywhere in the white space around the map.

How do things look for your county? When was your county’s peak day? How do policy decisions related to reopening Oklahoma affect your county based on the data provided?

Statewide, it appears that we reached a peak around April 4th, but experienced the 6th and 7th highest increases in the days leading up to the announcement on the state’s reopening plan.

While the use of data is said to have factored into decisions about reopening, specifically data that shows that we have reached our peak, we know that there is nothing to say we couldn’t reach another peak, especially if we stop social distancing. Until the future unfolds, we’ve got to remember that a peak we may see for our state or county might be just that: “a” peak. Not “the” peak.
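If you want to try this kind of check yourself, here is a small Python sketch. The cumulative totals are invented for illustration (not Oklahoma's numbers), and the function names are our own. It turns running totals into daily increases, then ranks the days from biggest increase to smallest.

```python
def daily_increases(cumulative):
    """Turn running totals into day-over-day increases."""
    return [today - yesterday
            for yesterday, today in zip(cumulative, cumulative[1:])]


def rank_days(increases):
    """Positions in `increases`, ordered from largest increase to smallest."""
    return sorted(range(len(increases)),
                  key=lambda d: increases[d], reverse=True)


# Hypothetical cumulative case counts, one entry per day.
cumulative = [100, 120, 152, 197, 225, 243, 258, 291, 330]

increases = daily_increases(cumulative)
print(increases)             # [20, 32, 45, 28, 18, 15, 33, 39]
print(rank_days(increases))  # [2, 7, 6, 1, 3, 0, 4, 5]
```

In this made-up series the biggest jump is early on (position 2), but the second and third biggest jumps are the two most recent days. The early spike is the peak so far, and the recent numbers are a reminder that it may not be the last one.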

Oklahoma's Gonna Be Okay

As much as we’d all love a little certainty right now, we don’t know what our actual curve will look like until this whole experience is said and done. What we do know is that the projections we are using to gauge our return to normal life are only viable if we continue to follow the orders (social distancing, safer at home, shelter in place) that caused the decline in COVID-19 cases in the first place.

We’re proud of the foresight shown by Oklahoma’s metropolitan mayors to keep our cities safer, and we are elated to see Oklahoma State Department of Health’s weekly updates for COVID-19 reporting, which will bolster the public’s ability to stay informed about pandemic trends.

Even if we can’t gather in person to show our commitment to keeping our communities safe, Oklahomans are showing up for one another, whether that be howling at the moon, organizing via Zoom, or any of the countless creative ways we’ve grown closer together despite the distance. Growing strength in our community is a trend we can get behind.