The following group of projects seeks to (semi-)automatically identify slanted news coverage, i.e., media bias, in news articles. Fundamentally, we aim to approach the issue of media bias by combining the expertise of two academic disciplines: computer science and the social sciences. Specifically, we combine automated and efficient text analysis methods, such as natural language processing (NLP), with manual and effective media bias analysis concepts, such as frame analysis.
- news-please is an open source, easy-to-use news crawler that extracts structured information from almost any news website. It can recursively follow internal hyperlinks and read RSS feeds to fetch both recent and old, archived articles.
- NewsBird is a news aggregator that implements matrix-based news aggregation (MNA). In MNA, users explore different perspectives in news coverage by visually inspecting a two-dimensional matrix, which, for example, shows the main media perspective within one country about another country.
- Giveme5W1H is a system that extracts answers to the journalistic five W and one H (5W1H) questions from news articles, i.e., who did what, when, where, why, and how.
- XCoref is an end-to-end cross-document coreference resolution (CDCR) system that resolves mentions of entities, events, and more abstract concepts with diverse word choice and labeling across a set of related articles.
- DA-RoBERTa is a new state-of-the-art transformer-based model adapted to the media bias domain. It identifies sentence-level bias and outperforms all previous models.
- newsalyze is the first system to automatically identify and then communicate person-targeting forms of bias in news articles reporting on policy issues. It does so by 1) applying XCoref to resolve mentions that refer to person entities, 2) extracting frames by identifying how semantic concepts are portrayed in a given news text, e.g., positively or negatively, and 3) visualizing how the identified persons are portrayed in a set of related news articles. A prototype was presented by Felix Hamborg in his Ph.D. thesis “Towards Automated Frame Analysis: Natural Language Processing Techniques to Reveal Media Bias in News Articles”.
- Domain-adaptive Pre-training Approach for Language Bias Detection in News introduces DA-RoBERTa, a new state-of-the-art transformer-based model adapted to the media bias domain that identifies sentence-level bias. The model addresses the challenging task of detecting biased word choices, which are linguistically complex, in a setting that lacks representative gold-standard corpora.
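
The matrix idea behind MNA, as described for NewsBird above, can be illustrated with a minimal sketch. This is not NewsBird's implementation: the toy articles, the field names `published_in` and `mentions`, and the helper `build_matrix` are assumptions made purely for illustration. Each cell groups the articles published in one country (row) that report on another country (column).

```python
from collections import defaultdict

# Hypothetical toy articles; the field names are assumptions for illustration only.
articles = [
    {"published_in": "US", "mentions": "RU", "title": "A"},
    {"published_in": "US", "mentions": "RU", "title": "B"},
    {"published_in": "DE", "mentions": "RU", "title": "C"},
    {"published_in": "RU", "mentions": "US", "title": "D"},
]


def build_matrix(articles, row_key="published_in", col_key="mentions"):
    """Group article titles into cells indexed by (row value, column value)."""
    matrix = defaultdict(list)
    for article in articles:
        cell = (article[row_key], article[col_key])
        matrix[cell].append(article["title"])
    return dict(matrix)


matrix = build_matrix(articles)

# Print one line per publishing country with the article count per mentioned country.
rows = sorted({row for row, _ in matrix})
cols = sorted({col for _, col in matrix})
for row in rows:
    print(row, [len(matrix.get((row, col), [])) for col in cols])
```

In a real aggregator, each cell would presumably be summarized (for example, by its most representative article) rather than just counted, so that a user can compare perspectives across cells.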