DeDeo is not the only researcher grappling with these challenges. Across every discipline, data sets are getting bigger and more complex, whether one is dealing with medical records, genomic sequencing, neural networks in the brain, astrophysics, historical archives, or social networks. Alessandro Vespignani, a physicist at Northeastern University who specializes in harnessing the power of social networking to model disease outbreaks, stock market behavior, collective social dynamics, and election outcomes, has collected many terabytes of data from social networks such as Twitter, nearly all of it raw and unstructured. “We didn’t define the conditions of the experiments, so we don’t know what we are capturing,” he said.