VICTOR VECTORS @ DIPROMATS 2024: Propaganda Detection with LLM Paraphrasing and Machine Translation

Abstract
Identifying propaganda in social media posts is an important task that can help to better understand the strategies applied by policymakers and stakeholders when trying to convey their message to the general public. We describe our participation in DIPROMATS 2024 Task 1 on the automated detection and characterization of propaganda techniques and narratives from diplomats of major powers. We show an efficient way to use Large Language Models (LLMs) to paraphrase a sample of the training instances in order to balance the class distribution in the datasets provided by the shared task. Our submission ranked 1st in Subtask 1a in English (ICM score of 0.2123) and 1st in the bilingual evaluation (ICM score of 0.2048). We also achieved top-3 rankings in Spanish and in Subtasks 1b and 1c.
Keywords
Data augmentation, LLMs, Paraphrasing, Propaganda detection, Unbalanced data