Cited by USTHB at ArAIEval’23 Shared Task: Disinformation Detection System based on Linguistic Feature Concatenation

SemEval-2023 Task 3: Detecting the Category, the Framing, and the Persuasion Techniques in Online News in a Multi-lingual Setup

Jakub Piskorski, Nicolas Stefanovitch, Giovanni Da San Martino et al.

Proceedings of the 17th International Workshop on Semantic Evaluation (SemEval-2023), Year: 2023, Issue: unknown

Published: Jan. 1, 2023

We describe SemEval-2023 Task 3 on Detecting the Category, the Framing, and the Persuasion Techniques in Online News in a Multilingual Setup: the dataset, the task organization process, the evaluation setup, the results, and the participating systems. The task focused on news articles in nine languages: six known to the participants upfront (English, French, German, Italian, Polish, and Russian) and three additional ones revealed at the testing phase (Spanish, Greek, and Georgian). The task featured three subtasks: (1) determining the genre of the article (opinion, reporting, or satire), (2) identifying one or more frames used in an article from a pool of 14 generic frames, and (3) identifying the persuasion techniques in each paragraph of the article, using a taxonomy of 23 persuasion techniques. This was a very popular task: a total of 181 teams registered to participate, and 41 eventually made an official submission on the test set.

Language: English

Hierarchical graph-based integration network for propaganda detection in textual news articles on social media

Pir Noman Ahmad, J. Guo, Nagwa M. AboElenein et al.

Scientific Reports, Year: 2025, Issue: 15(1)

Published: Jan. 13, 2025

During the Covid-19 pandemic, the widespread use of social media platforms has facilitated the dissemination of information, fake news, and propaganda, serving as a vital source of self-reported symptoms related to Covid-19. Existing graph-based models, such as Graph Neural Networks (GNNs), have achieved notable success in Natural Language Processing (NLP). However, utilizing GNN-based models for propaganda detection remains challenging because of the challenges of mining distinct word interactions and storing nonconsecutive and broad contextual data. In this study, we propose a Hierarchical Graph-based Integration Network (H-GIN) designed for detecting propaganda in text within a defined domain using multilabel classification. H-GIN is extracted to build a bi-layer graph with inter-intra-channel, Residual-driven Enhancement and Processing (RDEP), and Attention-driven Multichannel feature Fusing (ADMF) with suitable labels at two classification levels. First, RDEP procedures facilitate information interactions between distant nodes. Second, by employing these guidelines, ADMF standardizes the Tri-Channels 3-S (sequence, semantic, and syntactic) layer, enabling effective propaganda detection through related and unrelated information propagation of news representations into a classifier from the existing ProText, Qprop, and PTC datasets, thereby ensuring its availability to the public. The H-GIN model demonstrated exceptional performance, achieving an impressive 82% accuracy and surpassing current leading models. Notably, the model's capacity to identify previously unseen examples across diverse openness scenarios in the ProText dataset was particularly significant.

Language: English

ArAIEval Shared Task: Persuasion Techniques and Disinformation Detection in Arabic Text

Maram Hasanain, Firoj Alam, Hamdy Mubarak et al.

Published: Jan. 1, 2023

We present an overview of the ArAIEval shared task, organized as part of the first ArabicNLP 2023 conference co-located with EMNLP 2023. ArAIEval offers two tasks over Arabic text: (1) persuasion technique detection, focusing on identifying persuasion techniques in tweets and news articles, and (2) disinformation detection in binary and multiclass setups over tweets. A total of 20 teams participated in the final evaluation phase, with 14 and 16 teams participating in Task 1 and Task 2, respectively. Across both tasks, we observe that fine-tuning transformer models such as AraBERT is at the core of the majority of the participating systems. We provide a description of the task setup, including the datasets construction and the evaluation setup. We also provide a brief overview of the participating systems. All datasets and evaluation scripts from the shared task are released to the research community. We hope this will enable further research on such important tasks within Arabic NLP.

Language: English

Data Augmentation Using Transformers and Similarity Measures for Improving Arabic Text Classification

Dania Refai, Saleh M. Abu-Soud, Mohammad J. Abdel‐Rahman et al.

IEEE Access, Year: 2023, Issue: 11, pp. 132516-132531

Published: Jan. 1, 2023

The performance of learning models heavily relies on the availability and adequacy of training data. To address the dataset adequacy issue, researchers have extensively explored data augmentation (DA) as a promising approach. DA generates new data instances through transformations applied to the available data, thereby increasing dataset size and variability. This approach has enhanced model performance and accuracy, particularly in addressing class imbalance problems in classification tasks. However, few studies have explored DA for the Arabic language, relying on traditional approaches such as paraphrasing or noising-based techniques. In this paper, we propose a new Arabic DA method that employs a recent powerful modeling technique, namely AraGPT-2, for the augmentation process. The generated sentences are evaluated in terms of context, semantics, diversity, and novelty using the Euclidean, cosine, Jaccard, and BLEU distances. Finally, the AraBERT transformer is used on sentiment classification tasks to evaluate the classification performance of the augmented Arabic dataset. The experiments were conducted on four Arabic sentiment datasets: AraSarcasm, ASTD, ATT, and MOVIE. The selected datasets vary in size, label number, and unbalanced classes. The results show that the proposed methodology enhanced Arabic sentiment text classification on all datasets, with F1-score increases of 7%, 8%, 11%, and 13%.

Language: English

Multilingual Multifaceted Understanding of Online News in Terms of Genre, Framing, and Persuasion Techniques

Jakub Piskorski, Nicolas Stefanovitch, Nikos Nikolaidis et al.

Published: Jan. 1, 2023

Jakub Piskorski, Nicolas Stefanovitch, Nikolaos Nikolaidis, Giovanni Da San Martino, Preslav Nakov. Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers). 2023.

Language: English

Toxic language detection: A systematic review of Arabic datasets

Imene Bensalem, Paolo Rosso, Hanane Zitouni et al.

Expert Systems, Year: 2024, Issue: 41(8)

Published: Jan. 30, 2024

The detection of toxic language in the Arabic language has emerged as an active area of research in recent years, and reviewing the existing datasets employed for training the developed solutions has become a pressing need. This paper offers a comprehensive survey of Arabic datasets focused on online toxic language. We systematically gathered a total of 54 available datasets and their corresponding papers and conducted a thorough analysis, considering 18 criteria across four primary dimensions: availability details, content, annotation process, and reusability. This analysis enabled us to identify existing gaps and make recommendations for future research works. For the convenience of the research community, the list of the analysed datasets is maintained in a GitHub repository.

Language: English

MAGENTA: Generating and Detecting Arabic Machine-Generated Text in Multiple Domains

Saad Yaquine, Amine Hmimou, Paolo Rosso et al.

Communications in Computer and Information Science, Year: 2025, Issue: unknown, pp. 151-159

Published: Jan. 1, 2025

Language: English

Corpus Analysis of COVID-19 Related Loneliness on Twitter

Chereen Shurafa, Wajdi Zaghouani

Communications in Computer and Information Science, Year: 2025, Issue: unknown, pp. 80-93

Published: Jan. 1, 2025

Language: English

A comprehensive survey on Arabic text augmentation: approaches, challenges, and applications

Ahmed Adel ElSabagh, Shahira Shaaban Azab, Hesham A. Hefny et al.

Neural Computing and Applications, Year: 2025, Issue: unknown

Published: Feb. 7, 2025

Language: English

Effective Yet Ephemeral Propaganda Defense: There Needs to Be More than One-Shot Inoculation to Enhance Critical Thinking

Nicolas Hoferer, Kilian Sprenkamp, Dorian Quelle et al.

Published: April 23, 2025

Language: English
