News Analysis Using Text Network Visualization of RSS Feeds


Text network analysis can be used to better understand the news picture of the day and to analyze the current media discourse.

We frequently skim headlines to learn what’s happening in the world, and while we get a general idea of the major news, we sometimes lose track of the relations between the stories. Using text network analysis we can bring those relations to light.

We can also see where news sources position themselves: how different news outlets cover the picture of the day, what their agendas are, and how those agendas are expressed through different media.

Below we present a case study of different news sources, using as an example the news landscape of the 14th of April 2018: the day the US, the UK and France launched an attack on Syria. We base our analysis on the news sources’ RSS feeds, creating the text network graphs from the headlines.

We will use the InfraNodus text network visualization tool, which is based on the methodology presented in our paper on text network analysis.


1. Visualization of the Major News Sources

We will use the RSS import functionality of InfraNodus to import the 30 most recent news items (which cover the whole day) from the top 5 newspapers:

The Guardian
The New York Times
The Washington Post
The Wall Street Journal
The Financial Times
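
To make the import step concrete, here is a minimal sketch of pulling the most recent headlines out of an RSS 2.0 feed with Python's standard library. It is not how InfraNodus implements its RSS import. The sample feed, the function name latest_headlines and the assumption that the feed lists its newest items first are all ours, for illustration only.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample feed. A real feed would be downloaded from the
# newspaper's RSS URL; a tiny one is inlined to keep the sketch self-contained.
SAMPLE_RSS = """<rss version="2.0">
  <channel>
    <title>Example World News</title>
    <item><title>First headline</title></item>
    <item><title>Second headline</title></item>
    <item><title>Third headline</title></item>
  </channel>
</rss>"""


def latest_headlines(rss_xml, limit=30):
    """Return the titles of the first `limit` <item> elements of an RSS 2.0 feed.

    Assumes the feed lists its newest items first, as most news feeds do.
    """
    root = ET.fromstring(rss_xml)
    titles = []
    for item in root.iter("item"):
        title = item.findtext("title")
        if title:
            titles.append(title.strip())
        if len(titles) == limit:
            break
    return titles


print(latest_headlines(SAMPLE_RSS, limit=2))
```

The resulting list of headlines is the raw text from which a graph like the ones in this article can be built.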

InfraNodus represents words as graph nodes. After the stopwords (such as “the”, “a”, “in” etc.) are removed and the remaining words are converted into their lemmas, lemmas that occur next to each other in the text are connected by an edge. The more often certain words are used in the same context, the more clearly they stand out as a group from the rest of the network. Based on this information, we can color-code the different communities of words (or topics) using the community detection algorithm of Blondel et al. (2008).
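
The graph-building step described above can be sketched roughly as follows. This is a simplified illustration, not the InfraNodus implementation: the stopword list is a tiny hand-picked sample, lemmatization is skipped (words are only lowercased), and the function names are hypothetical. Community detection is also left out. It would be run on the weighted edges this sketch produces.

```python
import re
from collections import Counter

# A tiny, hand-picked stopword list for illustration; real lists are much longer.
STOPWORDS = {"the", "a", "an", "in", "on", "of", "to", "and", "for", "at", "by", "with"}


def tokenize(headline):
    """Lowercase a headline, keep alphabetic words and drop stopwords.

    Lemmatization is deliberately omitted in this sketch.
    """
    words = re.findall(r"[a-z]+", headline.lower())
    return [w for w in words if w not in STOPWORDS]


def cooccurrence_edges(headlines):
    """Count how often two words appear next to each other across headlines.

    Returns a Counter mapping an alphabetically ordered word pair to its weight,
    so the graph is undirected.
    """
    edges = Counter()
    for headline in headlines:
        tokens = tokenize(headline)
        for left, right in zip(tokens, tokens[1:]):
            if left != right:
                edges[tuple(sorted((left, right)))] += 1
    return edges


headlines = [
    "US launches strike on Syria",
    "Russia condemns the strike on Syria",
]
for pair, weight in cooccurrence_edges(headlines).most_common():
    print(pair, weight)
```

Pairs that recur across many headlines get heavier edges, which is what lets groups of words used in the same context stand out from the rest of the network.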

You can see the visualization below (made at 16.30 CET 14/04/2018):

The majority of the publications are talking about the Syria strike and President Trump.

A closer inspection of the different clusters (topical communities) inside the graph indicates that there are other important topical clusters as well:


The pane at the bottom left identifies other important news clusters: “battle-china-sebastian” and “russia-rise-back”. If we click on the keywords in this community, InfraNodus will automatically filter the news down to the items that contain those keywords in their headlines. In this case we are referring to the news piece from The Washington Post, which talks about the rising tensions with Russia as a result of the Syria attack.

Furthermore, if we analyze what the newspapers wrote about Russia (by clicking on the keyword in the graph above), we will see that most of the articles about the reaction to the attack (condemning it) come from the RSS feed of The Washington Post. In other words, The Washington Post gave more emphasis to Russia's reaction in its RSS newsfeed than the other newspapers did.
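
The keyword filtering described above can be approximated outside the tool with a few lines of Python. This is only a sketch: the source names and headlines are made-up placeholders, the function names are hypothetical, and the match is on exact word forms rather than on lemmas as in the graph.

```python
import re
from collections import Counter


def mentions(headline, keyword):
    """True if the keyword occurs as a whole word in the headline (case-insensitive)."""
    return keyword.lower() in re.findall(r"\w+", headline.lower())


def sources_mentioning(items, keyword):
    """Count, per source, the headlines that mention the keyword.

    `items` is a list of (source, headline) pairs.
    """
    return Counter(source for source, headline in items if mentions(headline, keyword))


# Made-up placeholder data, not the headlines analyzed in this article.
items = [
    ("Paper A", "Russia condemns strike"),
    ("Paper A", "Tensions with Russia rise"),
    ("Paper B", "Markets steady after strike"),
]
print(sources_mentioning(items, "russia"))
```

A source that dominates the counts for a keyword is giving that topic more emphasis in its feed than the others are.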


2. Comparison Between English-language and French Media’s Discourse

Text network analysis of RSS newsfeeds can also be used to compare how media in different countries tend to cover the same events.

Below is a visualization of the RSS news feeds of the leading French media:
Le Monde (centre-left)
Libération (left-oriented)
Le Parisien (centrist)
Le Figaro (right-oriented)
BFM TV (business-oriented)

We observe a similar picture: most of the news in the RSS feeds of the French media is focused on “frappé” (strike) and “Syria”, with much less attention to Trump compared with the English-language RSS feeds. Instead, the French feeds focus on the technicalities of the strike (“naval croisière”) and on the fact that France had a chance to test its new weapon (which, it could then also be assumed, was one of the reasons the strike was made). They also placed much more emphasis on reiterating that chemical weapons were presumed to have been found, which justified the strike.


The French media RSS feeds also gave much more (relative) attention to the death of the director Milos Forman, which was in the news that same day.


3. Differences Between Newspaper Coverage

It may also be interesting to analyze how different media outlets approach their coverage. Below we visualize the top 30 items from the RSS newsfeed (World section) of The Guardian:

and of The Washington Post:

and of The New York Times:

We can see from the above that both The New York Times and The Washington Post gave a lot of attention to Syria and the strike (17% and 19% respectively for the top Syria topic), while The Guardian’s news feed seemed to be much more diverse (its top topic is the book on Trump, and Syria accounts for less than 14%). The Guardian gives a lot of attention to Syria but also to other events of the day, which means that the main story does not overshadow the rest.

At the same time, the way both The Guardian and The New York Times cover the war in Syria seems to be more one-sided than the coverage of The Washington Post, which has a variety of articles on the subject of Syria, presenting different points of view.

Finally, we can also compare

The Guardian and The New York Times Coverage:

The Guardian and The Washington Post Coverage:

and The Washington Post and The New York Times Coverage:


We can see that the coverage of The Washington Post and The New York Times differs the most (42% of the terms used are different).

We can add Russia Today into the mix and compare it with The Guardian’s coverage:
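
The article does not state how the share of differing terms is calculated, so the following is only one plausible way to measure it: the fraction of all distinct terms used by two feeds that appear in one feed but not in the other (one minus the Jaccard similarity of the two vocabularies). The function names and the sample headlines are hypothetical.

```python
import re


def vocabulary(headlines):
    """Collect the set of lowercase alphabetic terms used in a list of headlines."""
    terms = set()
    for headline in headlines:
        terms.update(re.findall(r"[a-z]+", headline.lower()))
    return terms


def term_difference(headlines_a, headlines_b):
    """Share of distinct terms that occur in only one of the two feeds.

    Assumed definition: |A symmetric-difference B| / |A union B|.
    """
    a, b = vocabulary(headlines_a), vocabulary(headlines_b)
    union = a | b
    if not union:
        return 0.0
    return len(a ^ b) / len(union)


# Made-up placeholder headlines, not the feeds compared in this article.
feed_a = ["Strike on Syria"]
feed_b = ["Syria strike condemned"]
print(f"{term_difference(feed_a, feed_b):.0%} of the terms differ")
```

A value close to 0% means two feeds use nearly the same vocabulary, while a value close to 100% means they barely overlap.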


and here’s a comparison between The New York Times and Russia Today:


The coverage of Russia Today clearly differs quite significantly from that of both The Guardian and The New York Times.

However, a closer look at the discourse network of Russia Today’s coverage (and at the graphs above) shows that the graph of the Russia Today RSS feed is much more tightly connected around the central topic of Syria. In this way Russia Today lives up to its slogan “question more”, providing a more diverse set of opinions on the main topic of Syria. At the same time, such a highly centralized graph structure indicates that the feed does not include other subjects, unlike those of The Guardian or The New York Times. This makes its overall world coverage less diverse, focussing excessively on the Syrian conflict and its repercussions for Russia.