The Structure of Online Diffusion Networks

In this talk, given at the Interdisciplinary Workshop on the Evolution of Social Norms, Sharad Goel discusses the structure of online diffusion networks. We found it interesting because it relates to our recent research on information epidemics. Goel offers an interesting perspective on the nature of information diffusion: it occurs in less than 7% of cases, and when it does occur it normally happens in bursts rather than through gradual propagation.

The researchers at Yahoo! looked into how URLs propagate through Twitter conversations. How do links become viral? They took a dataset of about 1 million tweets and analyzed it to find out how propagation and informational cascades work.

They found that in 93% of cases no diffusion occurred. Only a fraction of diffusions become cascades and turn into epidemics. However, even though the number of these cascades is small, their structure can grow into a globally distributed, tree-like network. The long tail of “infected” nodes is significant enough that the majority of nodes can be overtaken by the epidemic.

The researchers then decided to create their own “viral” URLs and tried to propagate them through the Twitter network. They defined “viral” conservatively: a product is “viral” when a relatively high percentage of its adopters is sufficiently far from the seed (at least 2 generations away). They found that only 1% of URLs have at least 50% of their adopters far from the seed, and only 1 in 10,000 have at least 90% of their adopters far from the seed. Therefore, in the vast majority of cases the events don't even come close to satisfying the “viral” criteria defined in this conservative way. The rare large “viral” events rarely look like tree-like propagation structures. Rather, they are characterized by bursts: for example, a URL is broadcast to several people who then broadcast it further, and then it is re-broadcast to a large number of followers simultaneously.
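To make the conservative “viral” criterion concrete, here is a minimal Python sketch of how the share of adopters far from the seed could be computed for a single cascade. It is not the researchers' code. The adjacency-dictionary representation, the function names `generations_from_seed` and `far_fraction`, and the choice to leave the seed itself out of the adopter count are all assumptions made for illustration.

```python
from collections import deque


def generations_from_seed(children, seed):
    """Return {node: generation}, where the seed is generation 0.

    `children` maps each node to the list of nodes that adopted from it
    (a hypothetical representation, not taken from the talk).
    """
    depth = {seed: 0}
    queue = deque([seed])
    while queue:
        node = queue.popleft()
        for child in children.get(node, []):
            if child not in depth:  # keep the first (shortest) path only
                depth[child] = depth[node] + 1
                queue.append(child)
    return depth


def far_fraction(children, seed, min_generations=2):
    """Fraction of adopters at least `min_generations` away from the seed.

    Assumption: the seed is not counted as an adopter.
    """
    depth = generations_from_seed(children, seed)
    adopters = [d for node, d in depth.items() if node != seed]
    if not adopters:
        return 0.0
    far = sum(1 for d in adopters if d >= min_generations)
    return far / len(adopters)


# A pure broadcast: everyone adopts directly from the seed.
broadcast = {"seed": ["a", "b", "c", "d"]}

# A chain: each adopter passes the URL on to one more person.
chain = {"seed": ["a"], "a": ["b"], "b": ["c"]}

print(far_fraction(broadcast, "seed"))  # no adopter is 2+ generations away
print(far_fraction(chain, "seed"))      # 2 of the 3 adopters are far from the seed
```

Under these assumptions, the broadcast example fails the criterion no matter how many people it reaches, while the short chain passes the 50% threshold. This is the same distinction the talk draws between “popular” and “viral”.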

Goel also emphasizes the difference between something that's “viral” and something that's “popular”. Viral is something that has the capacity to propagate itself a few nodes away from the initial seed. When a video is watched a million times because it was broadcast to a large audience, it's not necessarily a viral video. Rather, viral events are the very rare events in which the message itself contains sufficient incentives for the participants to propagate it through the network.

He also draws a distinction between viral information contagion and biological epidemiological models. Goel argues that the spread of disease is less costly than information contagion. In order to communicate a certain piece of information to another person, we have to make the effort of contacting them, recommending something to them, engaging in interaction or even persuading them to use something. He also shows with his models that the diffusion of information in online networks occurs in a different way than disease contagion in biological networks. Goel proposes that one of the ways to make a product go “viral” is to make its diffusion less time-consuming and less expensive for users, as well as to embed incentives into its structure that compel individuals to propagate it further.