Projects in REVISE

ChARM: Chat Control, Age Verification and Resilience for Minors

The project explores the technical protection of minors on the internet and the challenges of detecting illegal content and behavior such as CSAM and cyber grooming, particularly in the context of the planned EU regulations. The debate around measures such as scanning online communication raises questions about privacy and about the accuracy of the technologies involved, which still exhibit high error rates. ATHENE provides an environment for developing and evaluating technological solutions that improve the protection of minors and comply with EU requirements. In the ChARM project, the state of the art is analyzed, demonstrators are built, and new protection methods are developed to better inform policymakers, businesses, and society.
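
The accuracy question can be made concrete with a back-of-the-envelope calculation. The sketch below uses purely illustrative numbers, not project findings, and applies Bayes' rule to show why even a seemingly accurate scanner produces mostly false alarms when the targeted content is rare in the scanned stream:

    # Illustrative base-rate arithmetic (all numbers are assumptions, not
    # project results): when illegal content is rare, false positives from
    # scanning benign messages swamp the true detections.

    def positive_predictive_value(sensitivity: float, specificity: float,
                                  prevalence: float) -> float:
        """Probability that a flagged message is actually illegal (Bayes' rule)."""
        true_pos = sensitivity * prevalence
        false_pos = (1.0 - specificity) * (1.0 - prevalence)
        return true_pos / (true_pos + false_pos)

    # Hypothetical scanner: catches 99% of illegal content, wrongly flags 1%
    # of benign messages; assume 1 in 100,000 messages is actually illegal.
    ppv = positive_predictive_value(sensitivity=0.99, specificity=0.99,
                                    prevalence=1e-5)
    print(f"Share of flags that are correct: {ppv:.4%}")
    # ~0.1%, i.e. on the order of a thousand false alarms per true hit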


CRISIS: Cross-Domain Disinformation Analysis

Nowadays, news is increasingly spread and consumed via social media. Since most posts do not undergo any verification before publication, there is a substantial risk that they contain false information. The CRISIS project aims to examine social media posts for disinformation, where information appears in the form of text, images, videos, and audio recordings. Various machine learning methods are to be employed in order to

  1. trace the dissemination paths of (dis)information and identify themes and trends (Social Media Analytics),
  2. recognize maliciously "recycled" content, trace it back to its original source, and match circulating information with previously conducted fact-checks (Semantic Similarity Analysis; see the sketch after this list), and
  3. support the manual fact-checking process by preselecting media, or specific sections of media, that are particularly worth checking (Check-Worthiness Analysis).
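
As a minimal sketch of the semantic similarity step (item 2), the snippet below matches a claim against an archive of fact-checks using sentence embeddings. It assumes the sentence-transformers library and a generic pretrained model; the project's actual models, data, and thresholds are not specified here.

    # Minimal semantic similarity matching with sentence embeddings
    # (toy claim and archive; model choice is an assumption).
    from sentence_transformers import SentenceTransformer, util

    model = SentenceTransformer("all-MiniLM-L6-v2")

    claim = "5G towers are secretly spreading a virus."
    fact_checks = [
        "Fact-check: there is no link between 5G radiation and viral infections.",
        "Fact-check: the moon landing footage is authentic.",
    ]

    # Embed the claim and the fact-check archive, then rank by cosine similarity.
    claim_emb = model.encode(claim, convert_to_tensor=True)
    fc_embs = model.encode(fact_checks, convert_to_tensor=True)
    scores = util.cos_sim(claim_emb, fc_embs)[0]

    best = int(scores.argmax())
    print(f"Closest fact-check (score {float(scores[best]):.2f}): {fact_checks[best]}")

In a real pipeline, matches above a calibrated similarity threshold would be surfaced to fact-checkers rather than decided automatically.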

The results will be integrated into a demonstrator that will support practitioners, such as journalists and fact-checkers, in investigating and identifying disinformation.


CYNTRA – Towards an Effective Multi-Label Classification and Model Auditing Ecosystem for Combatting Textual Online Hate Speech

Online hate speech poses growing challenges for law enforcement agencies and hate speech reporting centers. The integration of the revised EU voluntary code of conduct into the Digital Services Act framework strengthens these organizations' role as trusted flaggers. Simultaneously, major social media platforms are reducing moderation capacities – case volumes are increasing while manual processing reaches its limits.

CYNTRA develops a comprehensive ecosystem for analyzing textual online hate speech through AI-based multi-label classification. Conducting empirical case studies in Germany and the United Kingdom, the project compares Romano-Germanic and Anglo-American approaches to hate speech classification. The ecosystem comprises enhanced datasets with expert annotations, adjustable classification models including large language model prompting, and a user-centered dashboard for case analysis and system auditing. The objective is to prioritize reports more efficiently and continuously adapt AI models to evolving legal and linguistic requirements.
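
A minimal sketch of the multi-label setting is shown below, using a toy scikit-learn pipeline with hypothetical labels rather than CYNTRA's expert annotations or legal taxonomy. The key point is that each report can receive several labels at once, one binary decision per category:

    # Toy multi-label classification: one binary classifier per label, so a
    # report can carry several categories simultaneously. Labels and training
    # posts are hypothetical; the real system uses expert-annotated data and
    # stronger models, including LLM prompting.
    from sklearn.feature_extraction.text import TfidfVectorizer
    from sklearn.linear_model import LogisticRegression
    from sklearn.multioutput import MultiOutputClassifier
    from sklearn.pipeline import make_pipeline
    from sklearn.preprocessing import MultiLabelBinarizer

    posts = [
        "You are a worthless idiot.",
        "Someone should burn their houses down.",
        "You idiots should all be attacked.",
        "Lovely weather today.",
    ]
    labels = [["insult"], ["incitement"], ["insult", "incitement"], []]

    mlb = MultiLabelBinarizer()
    y = mlb.fit_transform(labels)  # one indicator column per label

    clf = make_pipeline(
        TfidfVectorizer(ngram_range=(1, 2)),
        MultiOutputClassifier(LogisticRegression(max_iter=1000)),
    )
    clf.fit(posts, y)

    pred = clf.predict(["What an idiot, someone attack him."])
    print(mlb.inverse_transform(pred))  # e.g. [('incitement', 'insult')]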


DREAM: Deepfake REcognition and Artificial Media

The DREAM project studies methods for recognizing and identifying synthesized or manipulated media content generated with artificial intelligence. A special focus is placed on detecting manipulations across media types, namely images, videos, and audio, that are created with the intent to impersonate. So-called deepfakes can automatically replace faces appearing in images or videos with the face of any person using deep learning. Images can be generated by text-to-image synthesis methods such as DALL-E, Stable Diffusion, or Midjourney. For videos, face-swapping or facial-reenactment techniques such as "lip-sync attacks" can be used. For audio data, the voice of a specific target person is imitated, e.g. through voice conversion or text-to-speech synthesis, so that words can be put into that person's mouth. To gain a better understanding of multimodal manipulations, the project also involves generating these types of fake media.
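
As a deliberately simplified illustration of one classic detection cue, not DREAM's actual method, the snippet below computes a frequency-domain statistic: many generators leave unusual high-frequency patterns that such statistics can surface before a learned classifier makes the final call. The file name is hypothetical.

    # Naive frequency-domain cue for synthetic-image screening (illustrative
    # only; real detectors learn such cues rather than thresholding them).
    import numpy as np
    from PIL import Image

    def high_freq_energy_ratio(path: str) -> float:
        """Share of spectral energy outside the low-frequency band of an image."""
        img = np.asarray(Image.open(path).convert("L"), dtype=np.float64)
        spectrum = np.abs(np.fft.fftshift(np.fft.fft2(img))) ** 2
        h, w = spectrum.shape
        cy, cx = h // 2, w // 2
        r = min(h, w) // 8  # "low frequency" = central square after fftshift
        low = spectrum[cy - r:cy + r, cx - r:cx + r].sum()
        return float(1.0 - low / spectrum.sum())

    # In practice such statistics feed a trained classifier; a naive rule
    # might compare the score against those of known-real reference images.
    score = high_freq_energy_ratio("suspect_image.png")  # hypothetical file
    print(f"High-frequency energy ratio: {score:.3f}")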


TRACE: Tracing and Recognizing AI-generated Content and Evidence

TRACE is a research project on advanced forensics for generative AI and deepfakes. As synthetic audio, video, images and text become increasingly realistic, simply detecting that something is fake is no longer enough. In real investigations such as political deepfakes, financial fraud with fake executives in video calls, or AI-generated robocalls in elections, investigators need to know how the content was created, which tools were used and what source material may have been involved.

TRACE focuses on model provenance and AI fingerprinting. It looks for subtle, consistent traces left by generative models to identify which systems or methods produced a given piece of media. The project aims to profile the generative process itself, uncover model-specific artifacts, infer manipulation techniques and generate metadata that can be used as robust forensic evidence.
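
The fingerprinting idea can be sketched in the spirit of classic sensor-noise attribution; the snippet below is an illustration under simplifying assumptions, not TRACE's pipeline. A noise residual is estimated per image, residuals of media known to come from one generator are averaged into a reference fingerprint, and new media is attributed to the fingerprint with the highest correlation.

    # Fingerprint-style attribution sketch: residual extraction, fingerprint
    # averaging, and normalized correlation (all simplifying assumptions).
    import numpy as np
    from scipy.ndimage import gaussian_filter

    def residual(img: np.ndarray) -> np.ndarray:
        """Noise residual: image minus a smoothed version of itself."""
        return img - gaussian_filter(img, sigma=1.5)

    def fingerprint(images: list[np.ndarray]) -> np.ndarray:
        """Average residual over media known to come from one generator."""
        return np.mean([residual(i) for i in images], axis=0)

    def correlation(a: np.ndarray, b: np.ndarray) -> float:
        a, b = a - a.mean(), b - b.mean()
        return float((a * b).sum() / (np.linalg.norm(a) * np.linalg.norm(b)))

    # Hypothetical usage with per-generator reference sets:
    # fingerprints = {name: fingerprint(imgs) for name, imgs in refs.items()}
    # best = max(fingerprints,
    #            key=lambda n: correlation(residual(new_img), fingerprints[n]))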

Within the REVISE program and alongside ATHENE projects like DREAM, TRACE adds an explanatory layer to classic deepfake detection. It bridges technical forensics and legal requirements, supports fact-checking and law enforcement, and improves the analysis of disinformation patterns and early warning systems. The goal is to strengthen media security and digital trust by making AI-generated content more transparent, traceable and accountable.


TXAITD: Trustworthy and Explainable AI-generated Text Detection

The unprecedented capabilities of recent large language models such as ChatGPT and Bard have led to their increased use as writing assistants. However, since these models produce text that is often very difficult to distinguish from human-written content, they are also increasingly used for malicious purposes, such as automatically writing assignments, composing AI-generated scientific papers, spreading fake news, and conducting social engineering attacks. To combat these issues, this project focuses on developing Trustworthy and Explainable AI-generated Text Detection (TXAITD) technology. TXAITD aims to identify passages created by language models, providing explanations and fine-grained localization of AI usage within texts. This approach helps differentiate between benign uses of AI as a writing assistant and malicious activities. Unlike previous systems, TXAITD makes the detection process explainable and trustworthy, empowering human users and decision-makers to make informed judgments about digital content and thereby enhancing societal and personal security. Ultimately, the project contributes to the secure digital transformation and to the regulation of AI in text composition.
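
One common detection signal, sketched below using an off-the-shelf GPT-2 reference model from the transformers library, scores each sentence by its perplexity: unusually low perplexity is a cue for machine-generated text, and per-sentence scores provide the fine-grained localization described above. This is an illustrative cue, not TXAITD's full method.

    # Sentence-level perplexity under a reference language model as one
    # (insufficient on its own) signal for AI-generated text.
    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained("gpt2")
    model = AutoModelForCausalLM.from_pretrained("gpt2")
    model.eval()

    def sentence_perplexity(sentence: str) -> float:
        ids = tokenizer(sentence, return_tensors="pt").input_ids
        with torch.no_grad():
            loss = model(ids, labels=ids).loss  # mean token cross-entropy
        return float(torch.exp(loss))

    document = [
        "The results, presented in Table 3, align with prior findings.",
        "In conclusion, the aforementioned methodology demonstrates efficacy.",
    ]
    for s in document:
        print(f"{sentence_perplexity(s):8.1f}  {s}")
    # Raw scores are not proof; a real detector calibrates them and, as the
    # project emphasizes, explains why a passage was flagged.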