Samuel Scarpino’s passion for artificial intelligence and why Northeastern is a global leader in the field

by Cynthia McCormick Hibbert

December 1, 2022

Photo by Matthew Modoono/Northeastern University

Professor Samuel Scarpino has returned to Northeastern University from a one-year stint as vice president of pathogen surveillance at The Rockefeller Foundation with the mission of reinforcing Northeastern’s place as a world leader in artificial intelligence and life sciences.

Since November, Scarpino has served as director of AI plus Life Sciences in the Institute for Experiential AI at Northeastern. Previously he was assistant professor in the Network Science Institute.

Scarpino sat down recently with News@Northeastern to answer questions about what makes Northeastern a commanding presence in the field of AI. His comments have been edited for clarity and brevity.

What about Northeastern makes the university a leader in the AI field?

The real strength of Northeastern is the high degree of importance it places on trans-disciplinary collaboration and partnerships between business and academia.

Part of the reason that Northeastern is uniquely positioned to lead in this field is the word “experiential,” which means co-ops, but it also means humans in the loop. We can do so much more with AI plus humans than we can with AI alone or humans alone. The idea is that AI will pick up on things that we might miss. Humans may also pick up on things that AI might miss.

Can you give us an example?

The United Kingdom has procedures where mammograms will get scanned by AI systems. Some of those mammograms also will be reviewed by expert panels.


A computer can read 10,000 mammograms without having to take a break. I don’t remember off the top of my head how many an expert pathologist can read before they have to take a break, but it’s not very many, right?

Former U.K. Prime Minister Theresa May said AI could reduce cancer deaths by 10% annually.

AI isn’t a silver bullet, but I would certainly sign up to prevent 10% of cancer deaths. For the ultimate clinical diagnosis, the clinician would have both the AI reading and potentially the input of the expert diagnostic panel.

You’ve talked about the importance of AI in predicting disease states and the tragedy of the existence of anonymized health data not being put to clinical use. How could it be better used? And what are the privacy implications?

I’m not an expert on privacy. Part of the reason I joined the Institute for Experiential AI at Northeastern is that it has experts on privacy and ethics.

The institute’s multidisciplinary approach is a big reason why I expect Northeastern to be a leader in the field of AI. You don’t want to end up with data you can’t use or in a bunch of regulatory or ethical trouble.

Northeastern understood it’s an interdisciplinary problem and went about solving that from the beginning.

An example of the problems with unusable data is a Centers for Disease Control and Prevention household study of pertussis that began in 2015 and ran for several years. The nasal swabs were never sequenced.

Those bacteria could hold clues about asymptomatic transmission of pertussis. Sitting in minus-80-degree freezers in Atlanta is very probably the answer to one of the biggest open questions about one of the biggest child-killing diseases left on the planet. And it can never be used unless somebody goes back to these 4,000 households and gets permission to use the data.

You don’t want data to be open and abused. You also don’t want it to be abused in the sense that the cost, effort, energy and privacy invasion—which could include blood draws and spinal taps—that went into collecting data means it can’t actually be used for anything going forward.

Dr. Naveen Rao, senior vice president of The Rockefeller Foundation Health Initiative, said that the foundation is “grateful for the passion, leadership and innovation” you brought to its Pandemic Prevention Initiative and looks forward to “continuing our partnership to strengthen pandemic prevention and response across the global health ecosystem.” How important is the global reach of Northeastern campuses to your work?

A big part of what Northeastern is going to do is generate network effects, leveraging Boston, leveraging the Roux Institute in Portland, Maine, and London and the West Coast of the United States.

If I’m working on data for European Union citizens, I often have to store and operate on that data in the EU. And by the very nature of our broad global footprint, we have an opportunity to work with organizations and datasets that we couldn’t reach as easily if we were exclusively housed in Boston.

At The Rockefeller Foundation, my team’s focus was on building technology products that closed the gap between data and action. For example, we built a holiday gathering risk assessment tool in 2021.

We also did quite a bit of work around wastewater surveillance, including a joint partnership with NASA and Emory Ghana to design surveillance settings in un-sewered settings.

What are some projects at the top of your list?

One of the initial things I’m looking to do at Northeastern is to help establish a research data commons, where datasets are accessible under agreed-upon rules about how the data will be used.

You would log into a portal and under its terms of service you would have to agree to privacy constraints, ethical constraints. All of those things are effectively done ahead of time, which means that instead of spending three to six months with the lawyers negotiating what it is we can and can’t do, that is pre-packaged as part of your participation in the research data commons.

There are all these siloed datasets in health and life sciences right now, sitting in universities. This is an organized way to network them.
