Since you’re reading this blog, you’ve probably heard of network science and big data by now. Network science is the field of research in which scientists leverage the vast amounts of data now available to understand the world’s myriad networks, be they social, genetic or even transportation-based (e.g., the network of airline flights across the globe). Understanding them can lead to a variety of outcomes, from preventing the spread of an epidemic through the population to predicting the next American Idol.
Through this work, an entirely new field is emerging and a different breed of researcher will have to grow up with it, said professor Alessandro Vespignani. “There’s no way back. When you invent the microscope, then you have to use it, even if the optics aren’t perfect yet.”
Last month, Vespignani and a group of researchers from Penn State, Harvard and other institutions around the globe published a review paper in the journal PLOS Computational Biology called “Digital Epidemiology.”
This is the name they’ve given to the emerging field, which has been developing over the last five years as a result of the data influx coming from new media and digital electronic devices.
“The web is so pervasive in the lives of everybody today. It’s not anymore a new technology,” said Vespignani. “What is new is the science we can do through it.” That “future breed” of researchers, say the authors, will need to be familiar not only with advanced computational techniques and analytical processes, but also with classical epidemiology.
The idea for the paper, which also includes NIH program manager Patricia Mabry in the authors list, came about after the Data Science & Epidemiology workshop, organized by first author Marcel Salathe, last October. The workshop was held at Penn State’s Center for Infectious Disease Dynamics.
“This was one of the moments in which we discussed where we are in the field and what we can do more of,” said Vespignani. “There’s a wave of activity. And one way or another, these new capacities, technologies and approaches are going to really make a revolution in the end.”
The field of epidemiology is well established and uses techniques that are in many cases 100 years old. And while we cannot throw out those methods entirely, the “gold mine of data” on which we now sit invites the exploration of new questions that could not previously be asked. In classical studies, 500 or 1,000 subjects make for a substantial cohort. When scientists probe online chat rooms, Twitter feeds and cell phone data, they can access populations of millions of people. What these data lack in controllability and “cleanliness,” they make up for with the power of size, said Vespignani.
But what does it all look like, exactly? How can you do an epidemiological study without going door-to-door to ask people things like whether or not they will get their flu vaccination this year or how many cigarettes they smoke? Vespignani and the other authors envision large-scale, web-based epidemiological studies in which volunteers answer questions about things like obesity and cigarette smoking, providing data about people’s perceptions and behaviors. Co-author John Brownstein, an associate professor at Harvard Medical School and director of the Computational Epidemiology Group at Children’s Hospital in Boston, has a project called “Flu Near You,” which tracks the flu virus by asking participants to answer a short online survey once a week.
When they track the spread of infectious diseases across the globe, Vespignani’s group uses the networks of airline flights and cell tower activity to map out human mobility patterns. When you use a social network to track things like the spread of health knowledge, you’re just looking at a different kind of map, Vespignani explained. “The model might use the same metaphor,” he said, “but it’s in a different space – a social space, instead of a geographical one.”
The emerging field of digital epidemiology may not be perfect yet and still needs some fine-tuning, but in the end, Vespignani and his colleagues believe that it is the future of epidemiology.