Researchers challenge federal law in attempt to prevent ‘Big Data’ discrimination

Ever wonder if online job-hunting sites are ranking you as less qualified than others based on your gender, race, or ethnicity?

Northeastern’s Alan Mislove, associate professor, and Christo Wilson, assistant professor, both in the College of Computer and Information Science, have been researching whether the proprietary algorithms that these sites use to analyze user profile data, web-browsing choices, and other online information may lead to such discrimination.

But they’ve hit a roadblock: A section of the U.S. Computer Fraud and Abuse Act, or CFAA, makes it a crime to violate a website’s terms of service. Those terms, fashioned by each individual site and open to change at any time, prohibit activities that computer researchers rely on to determine if sites are treating all users equally. These activities include creating dummy online identities, setting up multiple accounts, and using automated methods to collect publicly available data such as search results and ads.

On Thursday, in a case brought by the American Civil Liberties Union, Mislove, Wilson, and several other plaintiffs (the University of Michigan’s Christian Sandvig, the University of Illinois’ Karrie Karahalios, and the media company First Look Media) challenged a section of the anti-hacking law as unconstitutional.


Alan Mislove, associate professor. Photo by Mary Knox Merrill/Northeastern University

“This section of the CFAA chills important research, which is protected by the First and Fifth Amendments, the right to free speech and the right to due process,” says Rachel Goodman, staff attorney with the ACLU Racial Justice Program. It differs from other sections of the CFAA, which apply to accessing information with the intent to cause harm, such as bullying or scalping behavior, she says. “This research is necessary for us to understand the potential for discrimination online,” she adds, citing marginalized communities, including people of color and women, as particularly vulnerable.

Cracking open the ‘black box’

Mislove and Wilson have dedicated their careers to investigating the sophisticated online algorithms that are increasingly influencing our lives but remain “black boxes,” as Wilson puts it. “We, as external observers, have very little visibility into these algorithms and how they are affecting us,” he says.

“Personalization,” as such algorithmic results are called, runs rampant on the internet. “If you and I were to run the same search on Google, we would each get different results,” says Mislove. The same happens on Netflix, regarding movie offerings, and Expedia, regarding travel offerings. What pops up on each of our screens is based on our individual search and ordering histories and other personal information, including public records, collected and disseminated by data brokers.


Christo Wilson, assistant professor. Photo by Liz Linder Photography

Mislove and Wilson’s research pulls back the curtain on web personalization. They have quantified the extent of personalization in web searches, including on Google, and identified the search conditions under which it occurs. They have studied how e-commerce sites use algorithms to steer customers to more expensive products or to customize prices, for example by showcasing a more expensive laptop to a wealthier customer. They’ve also unlocked details of what drives Uber’s surge pricing, and how users can avoid it.

“These algorithms are all around us daily, and we don’t realize it,” says Wilson. They even extend into government, including the criminal justice system. “Judges can use risk-assessment software to evaluate the risk of a defendant, a process that can influence a sentencing or bail decision,” he says. “In criminal justice, you expect to be able to challenge your accuser.” Here, he says, there is no accuser to see.

Only by conducting research to understand how these processes work, say the pair, can they help the public navigate the web with its own best interests in mind.

“Challenging this law is all about enabling us to do this kind of research without fear of criminal prosecution,” says Mislove.