Gimme Shelter, Gimme Testing Data

On Day 2 of our Alameda County Shelter-in-place order, I am creating graphs, mostly for my sanity. Today’s topic is data, in particular, Covid-19 testing data. If you’re a data geek like me, this is for you.

I have blathered on for days (or is it weeks? I’ve lost track… days seem like weeks) that our biggest problem right now is lack of testing. We don’t know what we don’t know. Because the U.S. didn’t roll out testing capacity early on, people who feel sick or at risk for Covid haven’t been able to get tested. We’ve heard that for weeks and are still hearing it. Because people who know they’re sick can’t get tested, we have no idea who is sick and how many would test positive. Without knowing that, everyone has to STOP moving. That’s the problem right now.

Yes, it’s definitely a problem that hospitals are starting to become overwhelmed and might become swamped. It’s definitely a problem that travel is cancelled and that there is a black market for toilet paper and sanitizers. (Anybody know where we can get some ramen? That turns out to be a big concern in our house.) It’s an even bigger problem that we don’t know how long this will last, and we won’t know until there’s a robust testing structure in place. South Korea put in an excellent testing structure early on, and they seem to be moving into a better part of the pandemic curve. We can learn something from their experience, and we can learn something looking at data.

The Most Important Data Is Under-Reported

The problem has been a lack of good data, and good testing data is still hit and miss. In a world that’s used to hitting the “refresh” button every minute and seeing numbers update, having data that is only reported every few days or not at all is killer to the psyche. Up until about a week ago, data on how many people were being tested was nearly impossible to find. This was partly because few had been tested; I might also speculate that some didn’t want the public to know just how few that was.

I can illustrate this by looking at Daily Case data compared with Daily Testing data. Here is the number of cases in California, shown per day and total to date. By the way, note that the red bars (daily cases) are linked to the numbers on the left side and the purple line (cases to date) is linked to the numbers on the right side. Showing the two series on different axes is important because if you put cumulative and daily on the same axis, the cumulative total would make the daily increases too small to see. You would have no sense of the underlying infection curve.

Graph of California Covid cases
Graph by kajmeister based on data sources in COVID Tracking project.
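For fellow data geeks, here is a rough sketch of why the two series need separate axes. It is not the code behind my graph, and the daily counts are made-up numbers for illustration, not real California data. The helper name bar_height and the 100-unit chart height are my own assumptions too. The idea is just to compare how tall each daily bar would be if it shared an axis with the running total versus having an axis of its own.

```python
from itertools import accumulate

# Hypothetical daily new-case counts, invented for illustration only.
daily_cases = [2, 3, 5, 8, 12, 20, 31, 45, 60, 85]

# Cases to date: a running total of the daily counts.
cumulative_cases = list(accumulate(daily_cases))

# Each axis is scaled to the largest value plotted against it.
daily_axis_max = max(daily_cases)
cumulative_axis_max = max(cumulative_cases)


def bar_height(value, axis_max, chart_height=100):
    """Height of a bar, in chart units, on an axis that tops out at axis_max."""
    return chart_height * value / axis_max


for day, (new, total) in enumerate(zip(daily_cases, cumulative_cases), start=1):
    shared = bar_height(new, cumulative_axis_max)  # daily bar on the cumulative axis
    own = bar_height(new, daily_axis_max)          # daily bar on its own axis
    print(f"day {day:2d}: new={new:3d}  to date={total:4d}  "
          f"shared-axis height={shared:5.1f}  own-axis height={own:5.1f}")
```

On a shared axis, the early daily bars are only a sliver of the chart height, while on their own axis the same bars fill the space and the shape of the curve is visible. The gap only gets worse as the running total keeps growing.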