SOTU language analysis reveals history’s ‘twists and turns’

“Luck,” “lows,” and “lesbian.” “Dodge,” “dusted,” and “drowning.” “Tesla,” “eBay,” and “Instagram.”

Ben Schmidt was up late Tuesday night, tweeting out these and more than 70 other words that had never appeared in a State of the Union address until President Barack Obama delivered his annual report to Congress that evening.
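For readers curious how such a list of first-time words could be produced, here is a minimal sketch in Python. It is not Schmidt's code or Bookworm's interface: the function names `tokenize` and `first_time_words` are hypothetical, the tokenizer is deliberately crude, and the sample texts are toy stand-ins rather than real addresses.

```python
import re

def tokenize(text):
    """Lowercase the text and split it into simple word tokens."""
    return re.findall(r"[a-z]+(?:'[a-z]+)?", text.lower())

def first_time_words(earlier_speeches, new_speech):
    """Return, in sorted order, the words in new_speech found in no earlier speech."""
    seen = set()
    for speech in earlier_speeches:
        seen.update(tokenize(speech))
    return sorted(set(tokenize(new_speech)) - seen)

# Toy stand-in texts (assumption: not quotations from any real address).
earlier = ["The state of the union is strong.", "Freedom and union."]
new = "The union is strong on Instagram and eBay."

print(first_time_words(earlier, new))
```

A real analysis would also have to decide how to treat plurals, hyphenated words, and transcription differences between older and newer texts, all of which this sketch ignores.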

“Some of the words were surprising, but many others were tactical and say something important about the state of American society today,” said Schmidt, an assistant professor of history and a core faculty member in the NU Lab for Texts, Maps, and Networks, Northeastern’s center for digital humanities and computational social science.

Schmidt analyzed the language of Obama’s presidential address—and that of every SOTU speech dating back to 1790—using Bookworm, a simple and powerful way to visualize trends in digitized texts. He and a Harvard colleague created the platform for text analysis in 2011, and have since used it to examine the language used in everything from newspaper articles to more than 500 episodes of The Simpsons.

Schmidt’s latest project was made possible in part by a grant from the National Endowment for the Humanities. The grant enabled him to enhance Bookworm through a collaboration with the HathiTrust Digital Library, which holds 3.9 billion pages of digitized materials.

He discussed the project in an interactive article in The Atlantic, for which he used Bookworm to comb through all 224 State of the Union addresses and rank the frequency with which each president used each word. In an interactive companion piece, he and Mitch Fraas of the University of Pennsylvania used natural language processing algorithms to identify more than 16,000 mentions of 1,410 different places that presidents have referenced since the very first State of the Union more than 200 years ago. Foreign policy historians including Gretchen Heefner, an assistant professor of history at Northeastern, provided historical context, explaining how the speeches reflect America’s changing role in the world.

The findings, Schmidt wrote in The Atlantic, “reveal how the words presidents use reflect the twists and turns of American history.”

The word “freedom,” he said, was used sparingly until Franklin D. Roosevelt placed the “Four Freedoms” at the center of his 1941 address. Since then, the word has gained popularity, particularly among Republican presidents. George W. Bush, for example, said “freedom” more than 70 times, while Obama has used the word fewer than 10 times, including once Tuesday night.

Like “freedom,” “college” features an unmistakable partisan tilt. Democratic presidents, such as Obama, Bill Clinton, and John F. Kennedy, have used the word far more than Republican presidents, such as Bush, Ronald Reagan, and Gerald Ford. Bush, for example, said “college” five times, while Obama has used the word on more than 50 occasions, including 12 times Tuesday night.
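Comparisons like these rest on counting how often each president used a given word. The sketch below shows one way that tally might be done in Python; it is an illustration under stated assumptions, not the method behind the Atlantic piece. The function names `word_counts_by_president` and `rank_presidents` are hypothetical, the presidents "A" and "B" and their texts are toy data, and the ranking uses raw counts rather than rates adjusted for speech length.

```python
from collections import Counter, defaultdict
import re

def word_counts_by_president(speeches):
    """Tally words per president from an iterable of (president, text) pairs."""
    counts = defaultdict(Counter)
    for president, text in speeches:
        counts[president].update(re.findall(r"[a-z]+", text.lower()))
    return counts

def rank_presidents(counts, word):
    """Return (president, count) pairs for `word`, highest count first."""
    pairs = [(president, counter[word]) for president, counter in counts.items()]
    return sorted(pairs, key=lambda pair: pair[1], reverse=True)

# Toy data (assumption: placeholder presidents and invented texts).
speeches = [
    ("A", "freedom freedom college"),
    ("B", "college college college freedom"),
    ("A", "freedom"),
]
counts = word_counts_by_president(speeches)

print(rank_presidents(counts, "freedom"))
print(rank_presidents(counts, "college"))
```

Because presidents delivered different numbers of addresses of different lengths, a fuller comparison would likely divide each count by the total words spoken.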

Obama’s speech on Tuesday evening also included several words that had not been spoken at a State of the Union in more than 100 years. According to Schmidt, Obama said “vacations” for the first time since Millard Fillmore used the word in 1851 and uttered “masterful” for the first time since Theodore Roosevelt’s 1901 address.

Meanwhile, Obama’s references to China (three) and the Middle East (two) dovetailed with the rhetorical choices of recent presidents, whose language has reflected their interest in particular countries and regions. According to Schmidt and Heefner’s account in The Atlantic, the Middle East became a fixture of presidential addresses following the 1979 Iran hostage crisis. China, whose “mentions in the State of the Union follow the sine wave of American interest,” once again became a frequent topic following Nixon’s 1972 visit.

“In many ways, the places Obama mentioned continued the trend of the past few decades,” Schmidt explained. “His speech focused on a band of countries in the Middle East, while there was little discussion of Africa, which has been characteristic of presidential address in general and Obama’s in particular.”