Flood of Coronavirus Data Overwhelms Washington's Disease-Reporting System, Leading to Lag in Data

Even as the novel coronavirus has spread from Washington's cities to its small towns and rural communities, Department of Health (DOH) officials acknowledged Tuesday they're wrestling with another thorny problem from the pandemic: A flood of case data is overwhelming the disease-reporting system.

As a result, statewide public reporting of new coronavirus cases has ground to a halt. As of Tuesday, Washington's last data update posted to the DOH website came from lab results reported three days earlier -- as of midnight March 28.

The lag, in turn, has partially blinded both health officials and the public to the latest information about the disease's spread statewide.

Washington state Secretary of Health John Wiesman acknowledged that timely and accurate information is critical in formulating public health strategies, but he tempered concerns about the data problems, saying they've had no adverse impact on the state's response -- yet.

"Our team is on it. It's a top priority for us, and we understand the importance of having the data both for our use and for transparency to the public," Wiesman said in a phone interview with The Seattle Times.

"But it's not changing our strategy or what public health actions we need to take at the moment," he said. "Obviously, we don't want this to go on for very long -- that wouldn't be good."

Wiesman and Dr. Kathy Lofy, the state's health officer, each said Tuesday the data obtained to date has given officials a solid understanding that the coronavirus outbreak has spread statewide, beyond just the population centers in King and Snohomish counties. Social-distancing strategies, such as those ordered by Gov. Jay Inslee, remain the best way to mitigate the spread, they said.

Officials typically only see significant trends by analyzing at least a week's worth of data -- not the two or so days that they're missing now, Lofy added. The longer periods of data coincide with the virus's incubation period, which is typically five to seven days, but can be up to two weeks.

"It's just been since Sunday, so the past two days that we haven't received [the data]," Lofy said. "Obviously, it's really important to us. But it's the longer-term trend that we're really looking for" when making policy decisions.

"Overwhelmed the system"

For more than a week, late, incomplete and inaccurate data reports have hampered the health department's daily coronavirus case updates, leading to a steady stream of complaints in public forums, questions from journalists about numbers that don't add up and differences in the number of positive tests and confirmed cases.

Officials blame the troubles largely on a volume issue. As Washington's coronavirus testing has ramped up dramatically over the past month -- with 21 labs in Washington now testing patient samples -- thousands of new data reports, with positive and negative test results, have flooded into the Washington Disease Reporting System (WDRS).

The system, however, wasn't set up to take negative test results, Wiesman said. Before the coronavirus pandemic, the reporting system usually would receive "just positive results -- say from someone who got E. coli in one county, or a case of measles in another," he said.

"With coronavirus, we're now importing data from all of the negative testing," Wiesman said. "Early on in the outbreak, when the testing was limited, that wasn't a problem. But now we're getting as many as 6,000 to 7,000 test results -- both positive and negative -- each day. That has kind of overwhelmed the system."

Negative results, which make up about 93% of the data, are streaming in without corresponding information for the origin of those results, so they cannot be assigned to any particular county, Wiesman said.

"Sometimes, these are getting sent in multiple times from the same labs on the same day," he added.

Ferreting out duplicate reports is a manual process that often requires validating data through local public health departments, he said. As data volume has increased, that manual process has greatly slowed the state's ability to regularly update or provide complete, accurate data, Wiesman said.

"One day last week, we had 2,000 duplicates alone that we had to go through by hand to ensure we had reliable numbers," he said.

To fix the problem, the state has enlisted Conduent, the vendor of the system's Maven software, to essentially re-write its proprietary code on the fly. The goal will be to separate positive and negative testing data, possibly by creating a new tool for the negative cases, said Jennifer McNamara, DOH's chief information officer.

Other states, most notably New York, also use the same reporting system, McNamara said. "We have been in touch with our colleagues in New York, and they are running into the same challenges that we are," she said. "So they've been sharing with us some ways to minimize disruptions."

One of New York's suggestions was to try to streamline reporting by individual labs by limiting the number and timing of their daily reports, she said.

"We're hopeful that by tomorrow, we'll be able to report numbers that we can be comfortable with in terms of accuracy of the data," McNamara said Tuesday.

Once the fixes are made, Wiesman said he expects the state will publish updated data every day by 3 p.m., listing one full day of data reported from the previous day. The state's latest data update reported 4,896 positive cases statewide and 195 deaths -- figures that are now at least three days stale.

Along with the positive and negative test results, the state also intends to include daily death tolls, patient gender and age information and hospitalization data when it becomes available, he said.

Already, the DOH site has made several upgrades, including new visualizations showing confirmed cases, the epidemiological curve, cumulative case and death counts, testing numbers and demographic information, he said.

Flattening the curve

Even when the fixes are made, Wiesman noted "every data set has limitations."

Dr. Jared Roach of the Seattle-based Institute for Systems Biology said the state data is prone to inconsistencies because it's being reported daily.

For instance, some confirmed cases may have been reported on the date of a positive result, while others were reported on the day the test was taken. Some cases were reported in the county where a patient tested positive, while others were reported in the patient's county of residence.

Researchers relying on state data build that uncertainty into their models, Roach said. Scientists evaluating the "flattening of the curve," for instance -- evidence that social-distancing and other measures are slowing infection rates -- create upper and lower bounds that show how many hospital beds might be needed as the pandemic progresses.

Models that estimate when Washington's outbreak will peak, like the one from the Institute for Health Metrics and Evaluation (IHME) at the University of Washington, base their trajectories on death data, not positive or negative test results, Lofy said. That means IHME's model, which estimates Washington's cases will surge to an apex about April 19, likely wouldn't be affected by a lag in reported test results, she said.

"Most people who die are probably getting tested for COVID-19," she said, "so we think the testing [for deaths] is probably fairly complete" to date, with less of a lag issue for daily positive and negative test results.

But Lofy acknowledged Tuesday that lapses in available testing, which have haunted surveillance of the pandemic's spread in Washington and across the U.S., have stymied understanding -- and continue to.

"We've seen dramatic increases in numbers of cases detected just recently" across the state, she said. "It's hard to tell if we're seeing a surge in COVID-19 activity, or if we haven't been testing enough. So the testing capacity issues we've struggled with certainly does make it more difficult to interpret the data we have now that we have more broad testing."