Crowdsourcing COVID-19: How data-driven groups speed pandemic response

COVID-WEB lab manager Matt Metzger prepares a wastewater sample for COVID-19 testing at the University of California, Berkeley, on Dec. 23, 2020. The lab is one of dozens of regional or local efforts worldwide to track the disease through wastewater monitoring.

Hannah Salkever

February 5, 2021

Matt Metzger is hard at work running samples in Hildebrand Hall on the campus of the University of California, Berkeley. In a mere five months, this modest, 1,200-square-foot “pop-up” testing lab has been transformed into one of the country’s few high-throughput facilities for measuring the COVID-19 virus in sewage water.

Behind him sits a stack of boxes with more samples awaiting assays. A vacuum pump hums in the background. 

Each sample that Mr. Metzger pours for testing contains wastewater from a Bay Area collection site: water treatment plants, nursing homes, even San Quentin State Prison. 

Why We Wrote This

Where does creativity thrive? In the pandemic, the U.S. is seeing many examples of innovation welling up from small groups tapping data in fresh or nimble ways.

A small testing machine, about the size of a microwave oven, is able to tag COVID-19 genetic material with fluorescence and monitor the concentration of virus in each sample. 

“We can turn around samples in 24 hours. That’s about as fast as you can do right now,” says Mr. Metzger, who manages the lab, which now includes three full-time employees. “The lab is really gaining traction with public health officials.” 

The project, called COVID-WEB, is allowing people like nursing home administrators and college chancellors around the Bay Area to ramp up testing when most needed and to identify precise groups for quarantine. Some health officials say this targeted early warning system has helped the region outperform the rest of the country and world in reducing outbreaks during the pandemic. 

This success story is part of a growing number of grassroots responses to the pandemic, in which self-forming groups or broad networks of contributors often prove more nimble than previous generations of top-down actions by public health agencies. From volunteer data collection to tapping health information from wearable devices like Apple Watches, these efforts are creating a new blueprint for pandemic response.

And in turn, it’s part of a broader societal trend of data-driven innovation emanating from the power of crowds and focused groups of individuals.  

“The implications of end users being increasingly able to innovate for themselves – rather than waiting for companies or organizations to develop innovations for them – is that users can get more exactly what they want,” says Eric von Hippel, a professor at the Massachusetts Institute of Technology and an expert in open innovation.

This past year, what many scientists and citizens have wanted is rapid response to a health crisis. 

Tim Pine and Al Sanchez, health and safety employees at the University of California, Berkeley, remove a wastewater autosampler from a sewer drain, where it had been collecting samples over a 24-hour period. When analyzed by the COVID-WEB project, the samples can be an early warning of coronavirus infections in specific communities including colleges.
Irene Yi/UC Berkeley

While the U.S. Centers for Disease Control and Prevention remains a focal point of U.S. federal government work on the pandemic, projects like COVID-WEB are making a significant contribution – often on fast timetables and serving their own localities first. 

“Joyful creativity”

The impetus currently is a sobering global emergency, but Dr. von Hippel describes this type of innovation generally as a form of “joyful creativity” driven by two major trends. One is the rise of cheap and increasingly powerful digital design tools. The other is parallel gains in the ability of people to find one another and collaborate over the internet.

Millions of Americans today fight traffic by pooling data in the Waze traffic app, noting delays and sharing new routes. Thousands of people along the West Coast of the United States have purchased networked PurpleAir sensors, creating a hyper-local real-time air quality map that became a critical tool for Californians during the summer wildfire season. User review sites like Yelp are another example.

In truth, the shift toward distributed innovation goes back long before the computer era. Harvard University researcher Yochai Benkler has tracked the transition from the model of a lone inventor at a lab bench toward networks of researchers coalescing around innovative ideas and trends. Modern technology has accelerated the shift. Today the Linux operating system, created by a collaborative of thousands of programmers, is the software that powers most of the world’s supercomputers and corporate servers.

COVID-WEB also symbolizes a wider collaboration. It’s one of dozens of recently formed regional or local efforts worldwide to track the disease through wastewater monitoring.

“When we test wastewater, we get information about a really large number of people with a very small number of samples, and we get information about asymptomatic infections,” says Kara Nelson, who leads COVID-WEB and is a professor of civil and environmental engineering at UC Berkeley. 

Equally important, she explains, is the ability to test the wastewater at specific facilities – subsets of a community like prisons or nursing homes. Early positive tests can trigger aggressive individual testing and head off a full-blown outbreak. 

Where agencies such as the CDC can be hampered by legacy systems including paper-based data collection, the local campaigns tend to rely on current digital platforms. 

“There is a network of upwards of 1,000 researchers around the world that communicate through a Slack workspace and webinars and pre-publications of research. They are sharing information at a scale I’ve never seen before,” Dr. Nelson says. 

Faster tracking

In some instances, the bottom-up efforts are actually viewed as more reliable and trustworthy than state or federal data sources. Witness the genesis and trajectory of the COVID Tracking Project.

During the pandemic’s initial spread last March, two journalists at The Atlantic, Robinson Meyer and Alexis Madrigal, built a spreadsheet tracker of COVID-19 testing rates. At the same time, data scientist and venture capitalist Jeff Hammerbacher created his own tracking spreadsheet. The two efforts joined forces and made a call for volunteers. 

Today the project counts hundreds of volunteers and frequently updates data before national, state, and local governments publish their own findings. Johns Hopkins University’s Bloomberg School of Public Health manages a similar project that digs deep into county-level and regional data sources and explanations to create more accurate analysis of COVID-19 status from the bottom up, tapping hundreds of sources of information from around the globe. 

“If we find a U.S. state that is running too high, we then drill down to the county level and we can help them understand where they need to do their precision interventions,” says Lori Post, director of the Institute for Public Health and Medicine at Northwestern University’s Buehler Center for Health Policy and Economics. The institute creates its own COVID-19 dashboard with CDC and Johns Hopkins data. 

Dr. Post says it’s the grassroots gathering and sharing of data that has made the Buehler Center’s dashboard possible. Algorithms built by the Northwestern team analyze seven-day moving coronavirus averages. 
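A seven-day moving average of the kind mentioned here smooths out day-of-week reporting swings by averaging each day with the six days before it. The short Python sketch below illustrates the general idea only; it is not the Northwestern team's code, and the function name and the daily counts are hypothetical.

```python
from collections import deque


def seven_day_moving_average(daily_counts, window=7):
    """Return trailing moving averages, one per day once a full window exists."""
    averages = []
    recent = deque(maxlen=window)  # holds at most the last `window` days
    running_total = 0
    for count in daily_counts:
        if len(recent) == window:
            # The oldest day is about to drop out of the window.
            running_total -= recent[0]
        recent.append(count)
        running_total += count
        if len(recent) == window:
            averages.append(running_total / window)
    return averages


# Hypothetical daily new-case counts, for illustration only.
daily_new_cases = [120, 135, 90, 160, 150, 170, 145, 180, 200, 155]
smoothed = seven_day_moving_average(daily_new_cases)
print(smoothed)
```

Ten days of counts yield four averaged values, because the first six days do not yet have a full week behind them.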

“We don’t have a top-down system when it comes to COVID surveillance. It’s bottom up, and it’s journalists and universities that are doing it,” explains Dr. Post, a demographer who has run public health surveillance projects for over two decades. “We have one person that does almost all the work. I have APIs [software] that pull the data, so we run a completely automated process that is less labor-intensive and more accurate.”

Volunteering their data

Automated surveillance is another testing ground for sifting data. Though anathema to some for privacy reasons, with individuals’ willing participation it raises tantalizing possibilities of gathering real-time data with an accuracy that was previously unimaginable. One app-based research program called DETECT analyzes participant data from wearable devices like Fitbit fitness trackers, which can monitor things like heart rates and sleep patterns.

Participants download an app that automatically collects data from wearables. The app also asks people to answer questions about how they are feeling and sleeping, symptoms they might be experiencing, and results of diagnostic tests. 

The goal is to detect the early emergence of viral illnesses, explains principal investigator Jennifer Radin of the Scripps Research Translational Institute in San Diego, California.

“Traditional viral illness surveillance is typically delayed by one to three weeks,” she says. “Being able to have a real-time picture ... can potentially speed up our ability to respond to outbreaks and help us prevent larger spread better than we currently do.” 

Already, some early research has found promise in using wearable data.
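One simple way to picture how wearable data could hint at an emerging illness is to compare each day's resting heart rate with the same person's recent baseline. The Python sketch below does that under stated assumptions: a 14-day baseline, a two-standard-deviation threshold, and made-up readings. It is not DETECT's actual method, and the function name and parameters are hypothetical.

```python
from statistics import mean, stdev


def flag_elevated_days(resting_hr, baseline_days=14, threshold_sd=2.0):
    """Return indices of days whose reading sits well above the prior baseline.

    Assumption: a day is "elevated" when it exceeds the mean of the previous
    `baseline_days` readings by more than `threshold_sd` standard deviations.
    """
    flagged = []
    for day in range(baseline_days, len(resting_hr)):
        baseline = resting_hr[day - baseline_days:day]
        mu = mean(baseline)
        sigma = stdev(baseline)
        if sigma > 0 and resting_hr[day] > mu + threshold_sd * sigma:
            flagged.append(day)
    return flagged


# Hypothetical resting heart rates (beats per minute) for one participant.
readings = [60, 61, 62, 60, 61, 62, 60, 61, 62, 60, 61, 62, 60, 61, 61, 68, 70]
print(flag_elevated_days(readings))
```

Comparing people with their own history, rather than with a population average, is the design choice illustrated here; a real study would also need to account for exercise, sleep, and other factors that shift heart rate.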

Taken as a whole, the distributed, data-driven efforts point toward a future where pandemic tracking is more resilient, widespread, and accurate on a global scale. The CDC is exploring a national wastewater epidemiology program. And public health experts are reconsidering what the future might look like.

“Researchers are exploring how all these novel data streams might fit together,” says Dr. Radin of Scripps Research. “There is more potential collaboration that will help identify and prevent the next pandemic. The faster we can detect things, the better we will be able to react in the future.”