Random acts of the Twitterverse

A visualization of how #GOP spreads across the Twitter network…looks a bit like what I’d imagine “preaching to the choir” might.

Last week a coworker tried to explain the ins and outs of Twitter to me with little success. I get the point, really I do — it’s just that I find the information overload issue impossible to circumnavigate. “You just have to ignore some of it,” she said.

This idea of a finite attention span forms part of the foundation for predicting the spread of ideas on Twitter, according to new research from Alessandro Vespignani, Northeastern’s Sternberg Family Distinguished Professor, and his collaborators.

In an article published on the Nature Scientific Reports website last month, Vespignani and his team showed that the social network’s structure, coupled with our limited cognitive capacity, has a larger impact on the spread of ideas than the relative importance of those ideas or the people Tweeting about them.

“Imagine that all the information is equal,” said Vespignani. “Well, what you’d expect is that the information will be selected equally so that all the messages will have the same lifetime and the same number of users or spectators.” But this is not the case.

The team designed a computational model that mimics the so-called “Twitterverse” based on real-world data from 1.3 million hashtags and 120 million retweets from 12.5 million users. They stripped away all of the external factors, such as mainstream media and world events, that could change the “value” of each Tweet. The model included two assumptions: first, that the network in the virtual Twitterverse looks like the real one and, second, that our attention span is finite (mine is clearly much shorter than the average user’s; just looking at the feed breaks my brain). The propagation of Tweets across this modeled network matched perfectly with the real-world propagation of Tweets.
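For the curious, here is a toy sketch of what a model built on those two assumptions might look like. It is not the team’s actual code: the ring-shaped network, the function names (`build_ring_network`, `simulate`), and the parameters (`attention`, `p_new`) are all made up for illustration. Every Tweet has identical “value”; each user only remembers the last few items in their feed, and either posts something new or retweets something picked at random from that short feed.

```python
import random
from collections import Counter, deque


def build_ring_network(n_users, k_followers):
    """Hypothetical stand-in network: user u is followed by the next k users on a ring.

    The real study used the observed Twitter network; this is only a placeholder.
    """
    return {
        u: [(u + i) % n_users for i in range(1, k_followers + 1)]
        for u in range(n_users)
    }


def simulate(followers, steps, attention=5, p_new=0.3, seed=0):
    """Spread equal-value memes through a network of users with finite attention.

    followers: dict mapping each user to the list of users who see their posts
    attention: how many recent items a user's feed holds (assumed parameter)
    p_new:     chance of posting a brand-new meme instead of retweeting (assumed)
    Returns a Counter mapping meme id -> number of times it was posted.
    """
    rng = random.Random(seed)
    users = list(followers)
    # A bounded deque drops the oldest item once the feed is full: finite attention.
    feeds = {u: deque(maxlen=attention) for u in users}
    posts = Counter()
    next_meme = 0

    for _ in range(steps):
        user = rng.choice(users)
        if not feeds[user] or rng.random() < p_new:
            # Invent a new meme; it is no better or worse than any other.
            meme = next_meme
            next_meme += 1
        else:
            # Retweet something chosen uniformly at random from the short feed.
            meme = rng.choice(list(feeds[user]))
        posts[meme] += 1
        # The post lands in every follower's feed, possibly pushing older items out.
        for follower in followers[user]:
            feeds[follower].append(meme)

    return posts


network = build_ring_network(n_users=200, k_followers=4)
counts = simulate(network, steps=5000)
print("distinct memes:", len(counts))
print("most-posted memes:", counts.most_common(3))
```

Even in a sketch like this, nothing distinguishes one meme from another except luck and where it happens to land in the network, so any differences in how often memes get posted come from chance and structure alone.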

Visualizations of Tweets related to the Arab Spring and the March 2011 earthquake in Japan

When they changed the parameters, for example by adding relative values to the virtual Tweets or using a random network instead of the social network structure, the model strayed from actual data.

An article on the Atlantic’s technology blog says “the research suggest that it doesn’t fully matter who you are or how many connections you have, but what you’re saying relative to the existing conversation is what really matters in spreading knowledge.” Before talking to Vespignani, I tweeted this (!), but actually it’s kind of wrong. In fact, it doesn’t seem to matter who you are, how many connections you have, or what you’re saying. It’s much more random than any of that.

It’s like a neutral evolution, Vespignani said, wherein something evolves not because it’s better fit to survive, but because of random circumstances. “And in some cases things that are very fit may just disappear.”

That doesn’t mean that information doesn’t have value, he said. “But actually this is a model that tells you you don’t have to invoke those properties to explain what you see in terms of the lifetime of ideas. What we believe is that most of what we see in the social networks in terms of rumors, ideas, or the spreading of knowledge, is due to the stochasticity in social networks.”

But why do we care about all this in the first place? Vespignani, who is the local PI for a collaborative NSF grant with Indiana University to explore the spread of knowledge, can envision a time when we’ll use the Tweets leading up to events to predict their occurrence. For example, could we have predicted the Arab Spring or the outcome of an election based on what was taking place on Twitter just prior? Vespignani believes so: once we understand how the system works, we will be able to use it to predict outcomes.