DIVERSE: A Dataset of YouTube Video Comment Stances with a Data Programming Model
Iain J. Cruickshank, Lynnette Hui Xian Ng, Amir Soofi et al.

2021 IEEE International Conference on Big Data (Big Data), Journal Year: 2024, Volume and Issue: unknown, P. 2080 - 2089

Published: Dec. 15, 2024

Language: English

ChatGPT for Text Annotation? Mind the Hype! (Open Access)
Étienne Ollion, Rubing Shen, Ana Macanovic et al.

Published: Oct. 4, 2023

In the past months, researchers have enthusiastically discussed the relevance of zero- or few-shot classifiers like ChatGPT for text annotation. Should these models prove to be performant, they would open up new continents for research, and beyond. To assess the merits and limits of this approach, we conducted a systematic literature review. Reading all the articles doing zero- or few-shot text annotation in the human and social sciences, we found that these few-shot learners offer enticing, yet mixed results on text annotation tasks. The performance scores can vary widely, with some being average and some being very low. Besides, zero- or few-shot models are often outperformed by models fine-tuned with human annotations. Our findings thus suggest that, to date, the evidence about their effectiveness remains partial, but also that their use raises several important questions about the reproducibility of results, about privacy and copyright issues, and about the primacy of the English language. While we definitely believe that there are numerous ways to harness this powerful technology productively, we also need to harness it without falling for the hype.

Language: English

Citations: 15

Can Large Language Models Aid in Annotating Speech Emotional Data? Uncovering New Frontiers [Research Frontier]
Siddique Latif, Muhammad Usama, Muhammad Ibrahim Malik et al.

IEEE Computational Intelligence Magazine, Journal Year: 2025, Volume and Issue: 20(1), P. 66 - 77

Published: Jan. 17, 2025

Language: English

Citations: 0

A systematic review of automated hyperpartisan news detection (Creative Commons)
M. Maggini, Davide Bassi, P Piot et al.

PLoS ONE, Journal Year: 2025, Volume and Issue: 20(2), P. e0316989 - e0316989

Published: Feb. 21, 2025

Hyperpartisan news consists of articles with strong biases that support specific political parties. The spread of such news increases polarization among readers, which threatens social unity and democratic stability. Automated tools can help identify hyperpartisan news in the daily flood of articles, offering a way to tackle these problems. With recent advances in machine learning and deep learning, there are now more methods available to address this issue. This literature review collects and organizes the different methods used in previous studies on hyperpartisan news detection. Using the PRISMA methodology, we reviewed and systematized approaches and datasets from 81 articles published from January 2015 to 2024. Our analysis includes several steps: differentiating hyperpartisan news detection from similar tasks, identifying text sources, labeling methods, and evaluating models. We found some key gaps: there is no clear definition of hyperpartisanship in Computer Science, and most datasets are in English, highlighting the need for more datasets in minority languages. Moreover, the tendency is that deep learning models perform better than traditional machine learning, but Large Language Models' (LLMs) capacities in this domain have been limitedly studied. This paper is the first to systematically review hyperpartisan news detection, laying a solid groundwork for future research.

Language: English

Citations: 0

Machine-assisted quantitizing designs: augmenting humanities and social sciences with artificial intelligence (Creative Commons)
Andres Karjus

Humanities and Social Sciences Communications, Journal Year: 2025, Volume and Issue: 12(1)

Published: Feb. 28, 2025

Language: English

Citations: 0

Automatic theme and motif identification in large-scale English literary corpora using deep learning approaches

Dandan Yang

Journal of Computational Methods in Sciences and Engineering, Journal Year: 2025, Volume and Issue: unknown

Published: May 8, 2025

The identification of themes and motifs in literary texts is a fundamental aspect of literary analysis, traditionally performed through manual annotation and expert interpretation. However, the increasing availability of large-scale English literary corpora presents new challenges and opportunities for automated analysis. This paper proposes a deep learning (DL)-based framework for automatically detecting themes and motifs in extensive literary collections. The dataset comprises diverse literary sources, including classic literature, modern fiction, and poetry, ensuring a broad representation of thematic structures. A rigorous preprocessing pipeline is applied, involving stop word removal and tokenization to refine the textual data. For feature extraction, Word2Vec is utilized to capture semantic relationships between words. The core novelty of this research lies in the implementation of the Duelist Algorithm-optimized Bi-directional Long Short-Term Memory (DAO-BiLSTM) model, which enhances the model's ability to detect and classify recurring literary elements with high accuracy. The proposed method achieves an accuracy of 96.24%, recall of 97.32%, precision of 95.6%, and F1-score of 94.7%, demonstrating superior performance over existing methods. The model is implemented in Python 3.9 using TensorFlow in a high-performance computing environment, ensuring efficient processing of large-scale data. Experimental results illustrate the effectiveness of the approach in identifying complex themes and motifs across various genres. These findings highlight the potential of DL in augmenting literary analysis, enabling large-scale, data-driven exploration that complements traditional human-driven methodologies.

Language: English

Citations: 0

Advanced Computational Methods for News Classification: A Study in Neural Networks and CNN integrated with GPT (Creative Commons)
Fahim Sufi

Journal of Economy and Technology, Journal Year: 2024, Volume and Issue: unknown

Published: Sept. 1, 2024

Language: English

Citations: 3

Developing a Natural Language Understanding Model to Characterize Cable News Bias (Creative Commons)
Seth Benson, Iain J. Cruickshank

IEEE Access, Journal Year: 2024, Volume and Issue: 12, P. 31798 - 31807

Published: Jan. 1, 2024

Media bias has been extensively studied by both social and computational sciences. However, current work still has a large reliance on human input and subjective assessment to label biases. This is especially true for cable news, which has a continued presence in American media but a lack of text-based bias identification research. To address these issues, we develop an unsupervised machine learning method to characterize the bias of cable news programs without any human input. This method relies on the analysis of what topics are mentioned through Named Entity Recognition and how those topics are discussed through Stance Analysis in order to cluster programs with similar biases together. Applying our method to 2020 cable news transcripts, we find that programs tend to cluster together consistently over time and roughly correspond to the cable news network of the program. This method reveals the potential for future tools to more objectively assess media bias and characterize unfamiliar media environments, and the empirical results give insight into the nature of cable news programs.

Language: English

Citations: 2

Chain of Stance: Stance Detection with Large Language Models
Junxia Ma, Changjiang Wang, Hanwen Xing et al.

Lecture notes in computer science, Journal Year: 2024, Volume and Issue: unknown, P. 82 - 94

Published: Oct. 31, 2024

Language: English

Citations: 2

Human Evaluation in Large Language Model Testing
H. M Dharmendra, G. Raghunandan, A. N. Sindhu et al.

Advances in computational intelligence and robotics book series, Journal Year: 2024, Volume and Issue: unknown, P. 553 - 574

Published: Sept. 20, 2024

LLMs excel in language tasks, but testing them effectively is tricky. Automated metrics help, but human evaluation is crucial for aspects like clarity, relevance, and ethics. This chapter explores the methods and challenges of human evaluation in LLM testing, including factors like fairness and user experience. The authors discuss a sample evaluation method and highlight ongoing efforts for robust testing to ensure responsible LLM development. Finally, they explore the use of LLMs in cybersecurity, showcasing their potential and challenges.

Language: English

Citations: 1

Social Media Profiling for Political Affiliation Detection (Creative Commons)
Ihsan Ullah Khan, Muhammad Usman Shahid Khan

Human-Centric Intelligent Systems, Journal Year: 2024, Volume and Issue: 4(3), P. 437 - 446

Published: July 20, 2024

The notion of discerning political affiliations from users' social media behavior instills a sense of unease in many. Democracy necessitates that individuals' political affiliations remain private, and social media profiling challenges this foundational principle of democracy. This study uses BERT, a pre-trained language model, to analyze X's (formerly Twitter) users and their posts to understand how easy it now is to find the political affiliation of people. We collect posts in both the English and Urdu languages from different political leaders and their followers, which are used to fine-tune the BERT model. The model classifies profiles into Pro, Neutral, or Anti-government classes. To assess the performance of the proposed method, experiments are conducted to evaluate its accuracy, efficiency, and effectiveness. The findings confirm the hypothesis that it is possible to detect the political affiliation of individuals using social media with high accuracy (69% and 94% for the two languages), which can undermine

Language: English

Citations: 0