By Bill Ibelle
Entire mountain ranges of data are growing all around us, and they will either bury us or help us climb to new heights of understanding. It all depends on how we respond.
This was the focus of a four-hour “hackathon” Wednesday night, sponsored by Northeastern to explore the intersection between public policy and Big Data analysis. The event, “Data Science, Journalism, and the Future of Justice,” was part of HUBweek, a series of more than 100 events that brought together the brightest minds in government, private industry, and academia to celebrate innovation in Boston.
“Our goal was to explore the convergence of data science in a variety of fields,” said assistant professor of journalism John Wihbey, who hosted the event. He noted that the three panelists specialize in radically different disciplines, yet all see Big Data as essential in their field.
“In the old days, all you needed was three examples and a quote, and you could report a trend,” said Todd Wallack, a member of The Boston Globe Spotlight team and a two-time Pulitzer Prize nominee. “But today’s readers expect you to back up your stories with a thorough analysis of hard data.”
As an example, Wallack described the Spotlight team’s efforts to combine traditional reporting and Big Data analysis to expand a story about teacher-on-student sexual abuse at prestigious private high schools. By the time they were done, it had grown into a mega-story that involved more than 100 private schools in New England.
“Readers want to know the data,” he said. “They want to know how many and exactly which schools.”
Assistant professor Dan O’Brien, an expert in urban studies, analyzes Big Data to discover trends in urban crime and health in order to help cities develop more effective public policy. Computer science assistant professor Michelle Borkin, a specialist in data visualization, transforms complex data analysis into visual patterns that help guide the work of astronomers and surgeons.
“We are approaching 40 zettabytes of data,” she said to illustrate the sheer enormity of the data deluge. “That’s 40 with 20 zeros following it.”
With this in mind, more than 60 participants launched into the hackathon itself, a two-hour endeavor to make sense of one of three mountains of raw data. Each data set came from the Boston Police Department and focused on one of three topics: crime rates, homicides, or police stop-and-frisk incidents.
A team led by journalism student Abby Skelton, MA’18, determined that women are less likely to be stopped and frisked than men of the same race. They also found that the size of this gap varies greatly by race. White women are 29 percent less likely to be stopped than white men, while that figure is 21 percent for black women and just 5 percent for Hispanic women.
Another group, led by mechanical engineering alum Aditya Agrawal, E’16, used the same database to determine that while the number of white and black males who were stopped was about the same, action was taken against a much lower percentage of the black males. Their conclusion: Blacks are far more likely to be stopped without adequate cause.
But given the two-hour time limit, the point of the exercise was the process, not the results.
“This was an opportunity for data geeks to meet one another, make contacts, and be energized by working in a room full of people who share their interests,” said O’Brien, who counts himself as a data geek as well. “We want to encourage a collaboration between disciplines and agencies. Our goal is to get people excited about using Big Data to gain a deeper understanding of how cities work.”