Face value

Facebook is home to nearly 3 billion photos. Every minute, YouTube grows by another 100 hours of video. And, according to IHS Research, some 30 million surveillance cameras pepper our public spaces, collecting nearly 4 billion hours of footage each week. Needless to say, there’s a lot of image data that’s ripe for the picking.

Content like this helped break criminal cases such as the 2013 Boston Marathon bombing. But if we want to carry on with similar successes, we’ll need ever more sophisticated algorithms to parse the data deluge.

For his part, Northeastern University assistant professor Raymond Fu is working to improve the current state of the art in biometrics software, which automatically distinguishes between different categories of people as well as between individuals themselves.

Fu’s research earned him one of two Young Investigator awards from the International Neural Network Society in 2014. “This is a real honor and inspires me to keep up the good work,” said Fu, a machine-learning expert who holds joint appointments in the College of Engineering and the College of Computer and Information Science.

Backed by funding from Samsung Research of America, the research and development arm of the international electronics company, Fu has recently begun developing visual recognition software for use on social media networks such as Facebook and Twitter.

“When people share facial images on social networks, those images are in the wild. So you have unconstrained data—meaning it’s not collected in the lab under controlled conditions,” said Fu. “It can be from multiple cameras, multiple resources, so the data has a lot of variables.”

To circumvent this problem, his algorithm ranks the information in all of the images and quickly tosses out any outliers. “If something is very different from the rest of the images, our algorithm can rule it out and mitigate noise,” he explained.
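The article does not say how the ranking works, so the following is only a minimal sketch of the general idea of ranking images and discarding outliers. It assumes each image has already been reduced to a numeric feature vector, and it substitutes a simple median-distance rule for whatever Fu's software actually does. The function name `filter_outliers` and the cutoff parameter `k` are hypothetical.

```python
import math
import statistics


def filter_outliers(features, k=3.0):
    """Rank feature vectors by distance from the group's median vector
    and set aside the ones that sit unusually far away.

    Illustrative assumption only: the article does not describe the
    actual ranking method, so a median/MAD rule stands in for it.
    Returns (kept, dropped) as lists of indices into `features`.
    """
    if not features:
        return [], []

    # Component-wise median serves as a robust "typical" descriptor.
    center = [statistics.median(column) for column in zip(*features)]

    # Euclidean distance of every vector from that center.
    dists = [math.dist(vec, center) for vec in features]

    # Robust spread: median absolute deviation (MAD) of the distances.
    med = statistics.median(dists)
    mad = statistics.median(abs(d - med) for d in dists)
    threshold = med + k * mad

    # Rank from most to least typical, then split at the threshold.
    ranked = sorted(range(len(features)), key=lambda i: dists[i])
    kept = [i for i in ranked if dists[i] <= threshold]
    dropped = [i for i in ranked if dists[i] > threshold]
    return kept, dropped
```

Called on a handful of similar vectors plus one that is very different, the function would return the similar ones in ranked order and list the odd one out separately. A median-based rule is used here because, unlike a mean, it is not pulled toward the very outliers it is meant to detect.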

The software can “learn” a person’s unique face, and it can also draw on the vast stores of image data online to help understand society or inform investigations. For instance, Fu’s algorithms could help identify what types of people turn out at a protest, he said, by recognizing general characteristics rather than individual ones: Are the people photographed at demonstrations such as Occupy Wall Street carrying cameras and notebooks, and thus likely journalists? Are there more uniformed police officers than protesters?

Of course, advertisers and corporations could also use this data for less-noble pursuits, such as targeting their products at particular groups or individuals. “There is always a trade-off between privacy and services,” said Fu. “Everything I’m doing uses data that’s publicly available. We’re trying to provide the best models for analyzing it.”

It’s up to the rest of us—you and me and our representatives—to determine how we should use those models.