Multidimensional Analysis on the Political Spectrum of Media Ideologies
for Master Seminar: Computational Social Science 

11.2022-02.2023

Professorship of Computational Social Science and Big Data,
Technical University of Munich

Tutor: Prof. Jürgen Pfeffer

Munich, Bavaria, Germany


The contemporary information and communication landscape is one in which many people have grown used to encountering only ideas they already agree with. Over the last few decades, this has made news media and their target readerships more polarized and biased. Modern media outlets have a natural incentive to appear partisan, since partisanship helps retain readers, which in turn generates more revenue. Once such filter bubbles form, people tend to get caught up in them unconsciously, becoming limited in their ability to understand each other and easily manipulated in how they think and vote. The entire ecosystem of public opinion platforms has become so polarized that many people can find themselves no longer able to tell what is true.

To help free people from filter bubbles, we performed this multidimensional analysis to make the positions of different media outlets transparent. This could help the general public access a wider variety of information, identify different perspectives more easily, and avoid being manipulated by ideological bias.

In the long term, the results could benefit society by helping different groups of people understand each other better, solve real problems, learn the truth, and make wiser decisions.

Hence, two research questions were raised:

  • What positions do media outlets hold on the different dimensions of the political spectrum, each of which concerns a different social issue?
  • Do these positions shift over time, reflecting how different social groups’ attitudes towards different social issues have changed?

The study used 5,000 articles on the topic “Climate Change” for each month from 2018 to 2022, drawn from two media outlets: The Daily Mail and Mail on Sunday (London), and The New York Times, which were considered right-leaning and left-leaning on average, respectively. All articles were obtained from Nexis Uni®.

Poster for the final presentation of the seminar Computational Social Science

Two pre-trained NLP frameworks, SentenceTransformers (BERT-based) and TextBlob (NLTK-based), were adopted for this project. For each article, sentence embedding and sentiment analysis were performed first to obtain its vector representation in a 384-dimensional space, as well as its polarity and subjectivity scores. For the entire article corpus, a list of significant subject topic terms was obtained from the statistics of the metadata accompanying the articles. This was followed by a series of unsupervised representation learning steps, consisting of sentence embedding, Principal Component Analysis and K-Means clustering, which enabled the system to automatically determine the sub-topic clusters and compute their vector representations.
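
The clustering stage described above can be illustrated with a minimal sketch. It assumes scikit-learn and NumPy are available, and it uses synthetic 384-dimensional vectors in place of real sentence embeddings of the topic terms. The function name `cluster_subtopics`, the number of components and clusters, and the choice to average the original embeddings per cluster are illustrative assumptions, not details confirmed by the project.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.decomposition import PCA


def cluster_subtopics(embeddings, n_components=2, n_clusters=3, seed=0):
    """Reduce embeddings with PCA, cluster them with K-Means, and return
    the cluster labels plus one representative vector per cluster.

    Assumption: the representative vector is the mean of the original
    (unreduced) embeddings assigned to that cluster.
    """
    reduced = PCA(n_components=n_components, random_state=seed).fit_transform(embeddings)
    labels = KMeans(n_clusters=n_clusters, n_init=10, random_state=seed).fit_predict(reduced)
    centroids = np.vstack(
        [embeddings[labels == k].mean(axis=0) for k in range(n_clusters)]
    )
    return labels, centroids


# Synthetic stand-in for topic-term embeddings: three well-separated
# groups of 20 vectors each in a 384-dimensional space. In the project,
# these vectors would come from a SentenceTransformers model instead.
rng = np.random.default_rng(0)
centers = rng.normal(size=(3, 384)) * 5.0
embeddings = np.vstack(
    [center + rng.normal(scale=0.1, size=(20, 384)) for center in centers]
)

labels, centroids = cluster_subtopics(embeddings)
print(centroids.shape)  # one 384-dimensional vector per sub-topic cluster
```

Averaging in the original space rather than the PCA space keeps the cluster vectors comparable with the article embeddings, which is what a later projection step would need.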

Afterwards, because sentence embeddings capture semantic differences through canonical distance metrics, we projected the vector of each article onto the feature vectors of the different sub-topic clusters to obtain the relative sentiment components of each article along each sub-topic dimension. These data serve well for various downstream tasks, including time-series analysis and visualization.
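
As a rough illustration of the projection step, the following pure-Python sketch distributes an article's polarity score across sub-topic dimensions. The text does not specify exactly how sentiment and projection were combined, so the weighting used here (polarity multiplied by the non-negative cosine similarity between the article vector and each sub-topic vector) is an assumption, as are the function names and the toy 3-dimensional vectors.

```python
import math


def dot(u, v):
    """Dot product of two equal-length sequences of floats."""
    return sum(a * b for a, b in zip(u, v))


def cosine_similarity(u, v):
    """Cosine of the angle between u and v."""
    norm_u = math.sqrt(dot(u, u))
    norm_v = math.sqrt(dot(v, v))
    if norm_u == 0.0 or norm_v == 0.0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot(u, v) / (norm_u * norm_v)


def subtopic_sentiment(article_vec, polarity, subtopic_vecs):
    """Return a per-sub-topic sentiment component for one article.

    Assumption: each component is the article's polarity weighted by how
    strongly the article aligns with the sub-topic vector, with negative
    similarities clipped to zero.
    """
    components = {}
    for name, axis in subtopic_vecs.items():
        relevance = max(0.0, cosine_similarity(article_vec, axis))
        components[name] = polarity * relevance
    return components


# Toy example with hypothetical sub-topic names and 3-dimensional vectors.
article_vec = [3.0, 4.0, 0.0]
subtopic_vecs = {
    "energy": [1.0, 0.0, 0.0],
    "policy": [0.0, 2.0, 0.0],
    "weather": [0.0, 0.0, 1.0],
}
components = subtopic_sentiment(article_vec, 0.5, subtopic_vecs)
print(components)
```

Computed per article and grouped by publication month, components of this kind would form the series that a time-series plot can compare across outlets.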

As the final visualization in the poster shows, the two chosen media outlets did convey different sentiments on multiple sub-topics, and their positions shifted over time.
