Northeastern computer engineer troubleshoots the world’s fastest supercomputers, where system failure can cost millions

by Tanner Stening

August 18, 2021

From battling the coronavirus to modeling the forces responsible for the creation of galaxies, supercomputers are helping to solve some of the most pressing problems in the world today.

But these mammoth high-performance computing systems, some of which require football-field-size floor space and tens if not hundreds of miles of cabling to store and operate, are prone to numerous kinds of system failures, glitches, and bugs. These problems, which are notoriously hard to predict, can be costly—causing lost money and productivity, says Devesh Tiwari, an assistant professor of electrical and computer engineering at Northeastern.

Tiwari has been working on how to best identify these large-system vulnerabilities and recently earned a Rising Star in Dependability Award at the 51st annual International Conference on Dependable Systems and Networks for his work on improving the reliability and cost-effectiveness of supercomputers.

Drawing on his experience as a staff scientist at the Oak Ridge National Laboratory in Tennessee, which houses Summit, the world’s second most powerful supercomputer and the most powerful in the United States, Tiwari developed methods for rooting out hardware failures, predicting future ones, and optimizing data storage.

Ever-larger computer systems that perform increasingly complex tasks, and that rely on enormous amounts of power to operate, need to be reliable, Tiwari says. Reliability, in computing parlance, is a measure of, among other things, how well a system can withstand threats and be repaired after a hardware failure.

Improving reliability while reducing costs has been a “famous problem” over the last couple of decades, and one that many in the field are working to solve using federal funds, Tiwari says. These improvements have implications across a range of sectors that rely on these sophisticated computer systems, from weather modeling and medical research to national security and military operations.

“You have all of these really large supercomputers that are trying to solve really important problems,” Tiwari says. “This is why reliability is so important.”

The “rising star” award, given to a researcher within a decade of starting their field work, recognizes Tiwari for work that he says is largely theoretical, but that has been successfully applied to real-world supercomputers—bridging a longstanding divide between theory and practice that he says has stymied collaboration between academic researchers and systems administrators for years.

“Theory work is generally not welcomed by practitioners,” Tiwari says. “What I did was show that my work impacted real systems—even though it was theoretical.”

Most advanced nations are competing to build the fastest supercomputer, Tiwari says. That progress is tracked by TOP500, a website that ranks the world’s best-performing supercomputers by processing speed, measured in “petaflops,” or quadrillions of floating-point operations per second.

Currently, Japan’s Fugaku is the most powerful supercomputer in the world. The U.S. takes the second and third spots, followed by China in fourth. But China could be the first nation to operationalize a so-called “exascale” supercomputer, Tiwari says, which would in theory be faster than Fugaku.

But so-called “petascale” supercomputers, which are the fastest systems that presently exist, can draw up to 20 megawatts of power, Tiwari says. A single computer failure within such a system can create a drag on the whole machine, potentially costing millions of dollars in power consumption within days.

These failures are of the utmost importance to catch and prevent from recurring, given the costs involved, Tiwari says.

“That’s a lot of money that you could have invested somewhere else,” Tiwari says. “Such energy can power a whole village in a developing country, like in India or Taiwan.”
