AugmenToxic: Leveraging Reinforcement Learning to Optimize LLM Instruction Fine-Tuning for Data Augmentation to Enhance Toxicity Detection DOI
Arezo Bodaghi, Benjamin C. M. Fung, Ketra Schmitt

et al.

ACM Transactions on the Web, Journal Year: 2024, Volume and Issue: unknown

Published: Oct. 29, 2024

Addressing the challenge of toxic language in online discussions is crucial for the development of effective toxicity detection models. This pioneering work focuses on addressing imbalanced datasets by introducing a novel approach to augment data. We create a balanced dataset by instruction fine-tuning Large Language Models (LLMs) using Reinforcement Learning with Human Feedback (RLHF). Recognizing the challenges of collecting sufficient samples from social media platforms for building a dataset, our methodology involves sentence-level text data augmentation through paraphrasing existing samples using optimized generative LLMs. Leveraging a generative LLM, we utilize the Proximal Policy Optimizer (PPO) as the RL algorithm to fine-tune the model further and align it with human feedback. In other words, we start by fine-tuning an LLM with an instruction dataset specifically tailored to the paraphrasing task while maintaining semantic consistency. Next, we apply PPO and a reward function to further fine-tune (optimize) the instruction-tuned LLM. This RL process guides the model in generating responses. We use the Google Perspective API as an evaluator to assess the generated responses and assign rewards/penalties accordingly. This approach guides the LLMs in transforming minority class samples into augmented versions. The primary goal is to create a balanced and diverse dataset to enhance the accuracy and performance of classifiers in identifying instances of the minority class. Utilizing two publicly available datasets, we compared various techniques with our proposed method for generating samples, demonstrating that our method outperforms all others in producing a higher number of samples. Starting with an initial 16,225 prompts, our method successfully generated 122,951 samples with a score exceeding 30%. Subsequently, we developed classifiers and applied cost-sensitive learning to the original dataset. The findings highlight the superior performance of classifiers trained on data generated by the proposed method. These results underline the importance of employing a data-agnostic mechanism for augmenting data, thereby enhancing robustness.
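The evaluator-and-threshold step described in this abstract can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function names `reward` and `keep_augmented`, the reward shaping (score minus threshold), the 0-1 score scale, and the toy scores are all assumptions, and the toy dictionary stands in for an external evaluator such as the Google Perspective API.

```python
from typing import Callable, Iterable, List

# Threshold taken from the abstract ("score exceeding 30%"), on an assumed 0-1 scale.
TOXICITY_THRESHOLD = 0.30


def reward(score: float, threshold: float = TOXICITY_THRESHOLD) -> float:
    """Hypothetical reward shaping: positive above the threshold, negative below it."""
    return score - threshold


def keep_augmented(
    paraphrases: Iterable[str],
    score_fn: Callable[[str], float],
    threshold: float = TOXICITY_THRESHOLD,
) -> List[str]:
    """Keep only the paraphrases whose evaluator score exceeds the threshold."""
    return [text for text in paraphrases if score_fn(text) > threshold]


# Toy stand-in for an external evaluator; these scores are made up for illustration.
toy_scores = {"paraphrase a": 0.82, "paraphrase b": 0.12, "paraphrase c": 0.31}
kept = keep_augmented(toy_scores, toy_scores.get)
```

Passing the scorer in as a callable keeps the filter independent of any particular evaluator, so a real API client could replace the toy dictionary without changing the filtering logic.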

Language: English

A Survey of Explainable Artificial Intelligence for Smart Cities DOI Open Access
Abdul Rehman Javed, Waqas Ahmed, Sharnil Pandya

et al.

Electronics, Journal Year: 2023, Volume and Issue: 12(4), P. 1020 - 1020

Published: Feb. 18, 2023

The emergence of Explainable Artificial Intelligence (XAI) has enhanced the lives of humans and envisioned the concept of smart cities using informed actions, enhanced user interpretations and explanations, and firm decision-making processes. XAI systems can unbox the potential of black-box AI models and describe them explicitly. This study comprehensively surveys the current and future developments in XAI technologies for smart cities. It also highlights the societal, industrial, and technological trends that initiate the drive towards XAI for smart cities and presents the key to enabling XAI in detail. The paper also discusses the concept of XAI for smart cities, various XAI technology use cases, challenges, applications, possible alternative solutions, and research enhancements. Research projects and activities, including standardization efforts toward developing XAI for smart cities, are outlined. The lessons learned from state-of-the-art research are summarized, and technical challenges are discussed to shed new light on future possibilities. The presented study is a first-of-its-kind, rigorous, and detailed study to assist researchers in implementing XAI-driven systems, architectures, and applications.

Language: English

Citations

117

Modern Smart Cities and Open Research Challenges and Issues of Explainable Artificial Intelligence DOI
Siva Raja Sindiramutty, Chong Eng Tan, Wee Jing Tee

et al.

Advances in computational intelligence and robotics book series, Journal Year: 2024, Volume and Issue: unknown, P. 389 - 424

Published: Jan. 18, 2024

This chapter's purpose is to review modern smart cities and the open research challenges and issues of explainable artificial intelligence (XAI). With the advent of XAI, people's lives have been improved, and the idea of smart urban cities has been created. Although XAI has anticipated advantages, the adoption of AI differs between cities, in part because of challenges that can prevent its use in cities. The chapter will explore the importance of XAI in smart cities, the current state of the art, various applications of XAI in smart cities, open issues, case studies and examples, and evaluations and analysis of XAI models in smart city applications. The chapter will also cover developing novel XAI models with ontologies, assurance of ML algorithms, scalability, etc.

Language: English

Citations

17

Graph convolution networks for social media trolls detection use deep feature extraction DOI Creative Commons
Muhammad Asif, Muna Al‐Razgan, Yasser A. Ali

et al.

Journal of Cloud Computing Advances Systems and Applications, Journal Year: 2024, Volume and Issue: 13(1)

Published: Feb. 6, 2024

This study presents a novel approach to identifying trolls and toxic content on social media using deep learning. We developed a machine-learning model capable of detecting toxic images through their embedded text content. Our approach leverages GloVe word embeddings to enhance the model's predictive accuracy. We also utilized Graph Convolutional Networks (GCNs) to effectively analyze the intricate relationships inherent in social media data. The practical implications of our work are significant, despite some limitations in the model's performance. While the model accurately identifies toxic content more than half of the time, it struggles with precision, correctly identifying positive instances less than 50% of the time. Additionally, its ability to detect all positive cases (recall) is limited, capturing only 40% of them. The F1-score, which is a measure of the balance between precision and recall, stands at around 0.4, indicating a need for further refinement to enhance the model's effectiveness. This research offers a promising step towards more effective monitoring and moderation of social media platforms.
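The reported figures can be checked against the standard F1 definition, the harmonic mean of precision and recall. The sketch below is only a consistency check: the abstract gives recall as 40% and precision as "less than 50%", so the value 0.45 used here is an assumption, and the 0.50 case serves as an upper bound.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall; 0.0 when both are zero."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


recall = 0.40              # reported in the abstract
precision_assumed = 0.45   # assumption: the abstract only says "less than 50%"

f1 = f1_score(precision_assumed, recall)   # roughly 0.42
f1_upper = f1_score(0.50, recall)          # roughly 0.44, the bound at 50% precision
```

With recall fixed at 0.40 and precision below 0.50, F1 stays under about 0.44, which agrees with the "around 0.4" stated in the abstract.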

Language: English

Citations

11

Data Augmentation-based Novel Deep Learning Method for Deepfaked Images Detection DOI Open Access
Farkhund Iqbal, Ahmed Abbasi, Abdul Rehman Javed

et al.

ACM Transactions on Multimedia Computing Communications and Applications, Journal Year: 2023, Volume and Issue: 20(11), P. 1 - 15

Published: April 13, 2023

Recent advances in artificial intelligence have led to deepfake images, enabling users to replace a real face with a genuine-looking one. Deepfake images have recently been used to malign public figures, politicians, and even average citizens. Fake but realistic images have been used to stir political dissatisfaction, blackmail, propagate false news, and even carry out bogus terrorist attacks. Thus, distinguishing real images from fakes has become more challenging. To avoid these issues, this study employs transfer learning and a data augmentation technique to classify deepfake images. For experimentation, 190,335 RGB-resolution images and image augmentation methods are used to prepare the dataset. The experiments use the following deep learning models: convolutional neural network (CNN), Inception V3, visual geometry group (VGG19), and VGG16 with a transfer learning approach. Essential evaluation metrics (accuracy, precision, recall, F1-score, confusion matrix, and AUC-ROC curve score) are used to test the efficacy of the proposed approach. Results revealed that the proposed approach achieves accuracy, F1-score, and AUC-ROC scores of 90% and 91%, with our fine-tuned model outperforming the other DL models in recognizing deepfakes.

Language: English

Citations

18

Analysis of criminal spatial events in India using exploratory data analysis and regression DOI
Urvashi Gupta, Rohit Sharma

Computers & Electrical Engineering, Journal Year: 2023, Volume and Issue: 109, P. 108761 - 108761

Published: May 19, 2023

Language: English

Citations

18

A survey and comparative study on negative sentiment analysis in social media data DOI
Jayanta Paul, Ahel Das Chatterjee, Devtanu Misra

et al.

Multimedia Tools and Applications, Journal Year: 2024, Volume and Issue: 83(30), P. 75243 - 75292

Published: Feb. 15, 2024

Language: English

Citations

4

Identification and Correction of Grammatical Errors in Ukrainian Texts Based on Machine Learning Technology DOI Creative Commons
Vasyl Lytvyn, Петро Пукач, Victoria Vysotska

et al.

Mathematics, Journal Year: 2023, Volume and Issue: 11(4), P. 904 - 904

Published: Feb. 10, 2023

A machine learning model for correcting errors in Ukrainian texts has been developed. It was established that the neural network has the ability to correct simple sentences written in Ukrainian; however, the development of a full-fledged system requires spell-checking using dictionaries and the checking of rules, including those based on the result of parsing dependencies or other features. In order to save computing resources, a pre-trained neural network of the BERT (Bidirectional Encoder Representations from Transformers) type was used. Such neural networks have half as many parameters as other pre-trained models and show satisfactory results in correcting grammatical and stylistic errors. Among the ready-made neural network models, the pre-trained mT5 (a multilingual variant of T5, the Text-to-Text Transfer Transformer) model showed the best performance according to the BLEU (bilingual evaluation understudy) and METEOR (metric for evaluation of translation with explicit ordering) metrics.

Language: English

Citations

8

Classifying toxicity in the Arabic Moroccan dialect on Instagram: a machine and deep learning approach DOI Open Access
Rabia Rachidi, Mohamed Amine Ouassil, Mouaad Errami

et al.

Indonesian Journal of Electrical Engineering and Computer Science, Journal Year: 2023, Volume and Issue: 31(1), P. 588 - 588

Published: May 17, 2023

People crave interaction and connection with other people. Therefore, social media became the center of society's life. Among the brightest social media platforms nowadays, with a massive number of daily users, is Instagram, which owes this to its distinctive features. The excessive revealing of personal life has put users at risk of getting bullied and harassed through toxic reviews from other users. Numerous studies have targeted social media to fight its harmful side effects. Nevertheless, most of the datasets that were already available were in English; Arabic Moroccan dialect ones were not. In this work, an Arabic Moroccan dialect dataset has been extracted from the Instagram platform. Furthermore, feature extraction techniques have been applied to the collected dataset to increase classification accuracy. Afterward, we developed models using machine learning and deep learning algorithms to detect and classify toxicity. For the models' evaluation, we used the following metrics: accuracy, precision, F1-score, and recall. The experimental results gave modest scores of around 70% to 83%. These results imply the need for improvement and point to the lack of preprocessing libraries that handle Arabic.

Language: English

Citations

8

Extraction of use case diagram elements using natural language processing and network science DOI Creative Commons
Maryam Imtiaz Malik, Muddassar Azam Sindhu, Rabeeh Ayaz Abbasi

et al.

PLoS ONE, Journal Year: 2023, Volume and Issue: 18(6), P. e0287502 - e0287502

Published: June 23, 2023

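Software engineering artifact extraction from natural language requirements without human intervention is a challenging task. Out of these artifacts, the use case plays a prominent role in software design and development. In the literature, most of the approaches are either semi-automated, necessitate formalism, or make use of restricted natural language for the extraction of use cases from textual requirements. In this paper, we resolve the challenge of automated artifact extraction from natural language requirements. We propose an automated approach to generate use cases, actors, and their relationships from natural language requirements. Our proposed approach involves no human intervention or formalism. To automate the proposed approach, we have used Natural Language Processing and Network Science. Our proposed approach provides promising results for the extraction of use case elements, and we validate it using several literature-based case studies. The proposed approach significantly improves the results in comparison to an existing approach. On average, the proposed approach achieves around 71.5% accuracy (F-Measure), whereas the baseline method achieves around 16% accuracy (F-Measure) on average. The evaluation on the literature-based case studies shows its significance, and it reduces human effort.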

Language: English

Citations

6

The Construction of a Digital Dissemination Platform for the Intangible Cultural Heritage Using Convolutional Neural Network Models DOI Creative Commons

Zhurong Liu

Heliyon, Journal Year: 2024, Volume and Issue: 11(1), P. e40986 - e40986

Published: Dec. 6, 2024

Language: English

Citations

2