Clickbait Detection in Indonesia Headline News Using IndoBERT and RoBERTa

Muhammad Edo Syahputra, Ade Putera Kemala, Dimas Ramdhan, et al.

Jurnal Riset Informatika, Year: 2023, Volume: 5(3), pp. 425-430

Published: June 23, 2023

This paper explores clickbait detection using Transformer models, specifically IndoBERT and RoBERTa. The objective is to leverage the models for clickbait detection accuracy by employing balancing and augmentation techniques on the dataset. The research demonstrates the benefit of balancing techniques in improving model performance. Additionally, data augmentation also improved the performance of RoBERTa. However, it resulted differently with IndoBERT, with slightly decreased performance. These findings underline the importance of considering model selection and dataset characteristics when applying augmentation. Based on the result, IndoBERT, with a balanced distribution, outperformed the previous study and the other models used in this research. The research used three dataset distribution settings: unbalanced, balanced, and augmented, with 8513, 6632, and 15503 total counts, respectively. Furthermore, by incorporating balancing techniques, this research surpasses previous studies, contributing to the advancement of clickbait detection accuracy, with a 95% f1-score [...] unbalanced distribution. The augmentation method improved only the RoBERTa model. Moreover, performance might be boosted by gathering more varied datasets. This work highlights the value of leveraging pre-trained models and specific dataset-handling techniques. The implications include the necessity of accurate clickbait detection and the varying impact of these techniques on different models. These insights aid researchers and practitioners in making informed decisions for clickbait detection tasks, benefiting content moderation, online user experience, and information reliability. The research emphasizes the significance of utilizing state-of-the-art models and tailored approaches to improve clickbait detection.
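The balancing step mentioned in this abstract can be illustrated with a minimal sketch. The abstract does not state how the balanced set was built; random undersampling of the larger class is assumed here only because the balanced set (6632) is smaller than the unbalanced one (8513). The function name `undersample` and the toy headlines are hypothetical and are not taken from the paper.

```python
import random
from collections import Counter


def undersample(examples, seed=0):
    """Randomly drop examples from larger classes until every class
    has as many examples as the smallest class (assumed strategy)."""
    rng = random.Random(seed)

    # Group the (text, label) pairs by label.
    by_label = {}
    for text, label in examples:
        by_label.setdefault(label, []).append((text, label))

    # The smallest class sets the target size for every class.
    target = min(len(items) for items in by_label.values())

    balanced = []
    for items in by_label.values():
        balanced.extend(rng.sample(items, target))
    rng.shuffle(balanced)
    return balanced


# Toy, made-up data: 3 clickbait and 7 non-clickbait headlines.
data = [(f"headline {i}", "clickbait") for i in range(3)]
data += [(f"headline {i}", "non-clickbait") for i in range(3, 10)]

balanced = undersample(data)
print(Counter(label for _, label in balanced))
```

Undersampling discards data, which is one plausible reason to pair it with augmentation, as the abstract does; oversampling the smaller class would be an equally valid alternative that this sketch does not cover.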

Language: English

An Ensemble-Based Approach for Detecting Clickbait in Indonesian Online Media

Sandy Vitria Kurniawan, Adhe Setya Pramayoga, Yeva Fadhilah Ashari, et al.

JURNAL MASYARAKAT INFORMATIKA, Year: 2025, Volume: 16(1), pp. 104-118

Published: May 30, 2025

Clickbait headlines are widely used in online media to attract readers through exaggerated or misleading titles, potentially leading to user dissatisfaction and information overload. This study proposes a machine learning approach for detecting clickbait in Indonesian online news using classical classification models and ensemble learning. The dataset consists of labeled clickbait and non-clickbait headlines in Bahasa Indonesia, which were processed and represented using TF-IDF vectorization. Three base classifiers, Multinomial Naive Bayes, Logistic Regression, and Support Vector Machine (SVM), were integrated using soft voting and stacking methods. The experimental results indicate that one ensemble model achieved the highest accuracy of 0.7728, while the other recorded the best F1-score of 0.7080, outperforming the individual classifiers. Despite these gains, SVM demonstrated the most substantial decline after stopwords removal, dropping by 0.0410. These findings highlight the effectiveness of ensemble learning in enhancing clickbait detection performance and suggest potential for further optimization in model selection and integration strategies.
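Soft voting, one of the two integration methods named in this abstract, averages the class probabilities of the base classifiers instead of counting their votes. The sketch below shows only that averaging step; the probabilities are made-up numbers standing in for the outputs of three hypothetical base classifiers, and the function names `soft_vote` and `hard_vote` are illustrative, not the study's code.

```python
def soft_vote(probabilities, weights=None):
    """Average per-class probabilities across classifiers.

    probabilities: one list of class probabilities per classifier.
    Returns (averaged probabilities, index of the winning class).
    """
    n_models = len(probabilities)
    n_classes = len(probabilities[0])
    if weights is None:
        weights = [1.0] * n_models
    total = sum(weights)
    averaged = [
        sum(w * p[c] for w, p in zip(weights, probabilities)) / total
        for c in range(n_classes)
    ]
    label = max(range(n_classes), key=lambda c: averaged[c])
    return averaged, label


def hard_vote(probabilities):
    """Majority vote over each classifier's most probable class."""
    votes = [max(range(len(p)), key=lambda c: p[c]) for p in probabilities]
    return max(set(votes), key=votes.count)


# Made-up outputs for one headline; class 0 = non-clickbait, class 1 = clickbait.
probs = [
    [0.90, 0.10],  # hypothetical classifier A, very confident in class 0
    [0.40, 0.60],  # hypothetical classifier B, leans to class 1
    [0.45, 0.55],  # hypothetical classifier C, leans to class 1
]

averaged, soft_label = soft_vote(probs)
hard_label = hard_vote(probs)
print(averaged, soft_label, hard_label)
```

In this toy case the two schemes disagree: two of three classifiers pick class 1, but the confident classifier pulls the averaged probability toward class 0. That sensitivity to confidence is the design difference between soft and hard voting.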

Language: English

Cited by: 0

Mind the Label Shift of Augmentation-based Graph OOD Generalization

Junchi Yu, Jian Liang, Ran He, et al.

2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), Year: 2023, Volume: unknown, pp. 11620-11630

Published: June 1, 2023

Out-of-distribution (OOD) generalization is an important issue for Graph Neural Networks (GNNs). Recent works employ different graph editions to generate augmented environments and learn an invariant GNN for generalization. However, the label shift usually occurs in augmentation since graph structural edition inevitably alters the graph label. This brings inconsistent predictive relationships among augmented environments, which is harmful to generalization. To address this issue, we propose LiSA, which generates label-invariant augmentations to facilitate graph OOD generalization. Instead of resorting to graph editions, LiSA exploits Label-invariant Subgraphs of the training graphs to construct Augmented environments. Specifically, LiSA first designs variational subgraph generators to extract locally predictive patterns and construct multiple label-invariant subgraphs efficiently. Then, the subgraphs produced by different generators are collected to build different augmented environments. To promote diversity among the augmented environments, LiSA further introduces a tractable energy-based regularization to enlarge pair-wise distances between the distributions of environments. In this manner, LiSA generates diverse augmented environments with a consistent predictive relationship and facilitates learning an invariant GNN. Extensive experiments on node-level and graph-level OOD benchmarks show that LiSA achieves impressive generalization performance with different GNN backbones. Code is available at https://github.com/Samyu0304/LiSA.

Language: English

Cited by: 7

CA-CD: context-aware clickbait detection using new Chinese clickbait dataset with transfer learning method

Hei‐Chia Wang, Martinus Maslim, Hung‐Yu Liu, et al.

Data Technologies and Applications, Year: 2023, Volume: 58(2), pp. 243-266

Published: Aug. 29, 2023

Purpose: A clickbait is a deceptive headline designed to boost ad revenue without presenting closely relevant content. There are numerous negative repercussions of clickbait, such as causing viewers to feel tricked and unhappy, causing long-term confusion, and even attracting cyber criminals. Automatic detection algorithms for clickbait have been developed to address this issue. The existing technologies for detecting clickbait are limited by the fact that there is only one semantic representation for the same term and by the limited dataset in Chinese. This study aims to solve the limitations of automated clickbait detection in the Chinese dataset.

Design/methodology/approach: This study combines both sources to train a model to capture the probable relationship between clickbait news headlines and news content. In addition, part-of-speech elements are used to generate the most appropriate semantic representation for clickbait detection, improving clickbait detection performance.

Findings: This research successfully compiled a dataset containing up to 20,896 Chinese clickbait news articles. This collection contains news headlines, articles, categories, and supplementary metadata. The suggested context-aware clickbait detection (CA-CD) model outperforms existing approaches on many criteria, demonstrating the proposed strategy's efficacy.

Originality/value: The originality of this study resides in the newly compiled Chinese clickbait dataset and the contextual semantic representation-based clickbait detection approach employing transfer learning. This method can modify the semantic representation of each word based on context and assist the model in more precisely interpreting the original meaning.

Language: English

Cited by: 4

An Attention-Based Neural Network Using Human Semantic Knowledge and Its Application to Clickbait Detection

Wei Feng, Uyen Trang Nguyen

IEEE Open Journal of the Computer Society, Year: 2022, Volume: 3, pp. 217-232

Published: Jan. 1, 2022

Clickbait is a commonly used social engineering technique to carry out phishing attacks, illegitimate marketing, and dissemination of disinformation. As a result, clickbait detection has become a popular research topic in recent years due to the prevalence of clickbait on the web and social media. In this article, we propose a novel attention-based neural network for the task of clickbait detection. To the best of our knowledge, this work is the first that incorporates human semantic knowledge into an artificial neural network, and uses linguistic knowledge graphs to guide the attention mechanisms for the clickbait detection task. Extensive experimental results show that the proposed model outperforms existing state-of-the-art clickbait classifiers, even when training data is limited. The proposed model also performs better than or comparably to powerful pre-trained models, namely, BERT, RoBERTa, and XLNet, while being much more lightweight. Furthermore, we conducted experiments to demonstrate that the use of human semantic knowledge can significantly enhance the performance of pre-trained models in the semi-supervised domain, such as BERT, RoBERTa, and XLNet.

Language: English

Cited by: 4

Clickbait Detection in Indonesia Headline News Using IndoBERT and RoBERTa

Muhammad Edo Syahputra, Ade Putera Kemala, Dimas Ramdhan, et al.

Jurnal Riset Informatika, Year: 2023, Volume: 5(3), pp. 425-430

Published: June 10, 2023

This paper explores clickbait detection using Transformer models, specifically IndoBERT and RoBERTa. The objective is to leverage the models for clickbait detection accuracy by employing balancing and augmentation techniques on the dataset. The research demonstrates the benefit of balancing techniques in improving model performance. Additionally, data augmentation also improved the performance of RoBERTa. However, it resulted differently with IndoBERT, with slightly decreased performance. These findings underline the importance of considering model selection and dataset characteristics when applying augmentation. Based on the result, IndoBERT, with a balanced distribution, outperformed the previous study and the other models used in this research. Furthermore, by incorporating balancing techniques, this research surpasses previous studies, contributing to the advancement of clickbait detection accuracy. This work highlights the value of leveraging pre-trained models and specific dataset-handling techniques. The implications include the necessity of accurate clickbait detection and the varying impact of these techniques on different models. These insights aid researchers and practitioners in making informed decisions for clickbait detection tasks, benefiting content moderation, online user experience, and information reliability. The research emphasizes the significance of utilizing state-of-the-art models and tailored approaches to improve clickbait detection.

Language: English

Cited by: 2

A Pooled RNN-based Deep Learning Model based on Data Augmentation for Clickbait Detection

Jeong-Jae Kim, Sang-Min Park, Byung-Won On, et al.

The Journal of Korean Institute of Information Technology, Year: 2023, Volume: 21(4), pp. 45-56

Published: April 27, 2023

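Recently, fake news detection has become one of the most urgent problems in data engineering. In this paper, we propose two new approaches to solve the clickbait article problem, which lies at the heart of fake news detection. First, we propose a deep learning model composed of multiple RNN-based Bi-LSTM layers, max-pooling layers, and fully-connected layers. In addition, to improve the accuracy of the model, we propose a data augmentation algorithm that automatically generates large-scale, high-quality training data. We show that the data generated by the proposed algorithm is nearly identical to data created by human evaluators. The proposed model improves accuracy by 36% over the main existing approaches, and the data augmentation approach greatly increases the accuracy of the model. The proposed approach has not previously been attempted for clickbait detection.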

Cited by: 1

Local explainability-based model for clickbait spoiler generation

Itishree Panda, Jyoti Singh, Gayadhar Pradhan, et al.

Journal of Computational Social Science, Year: 2024, Volume: 8(1)

Published: Nov. 26, 2024

Language: English

Cited by: 0

Clickbait: Research, challenges and opportunities – A systematic literature review

Daniel Jácobo-Morales, Mauro Marino‐Jiménez

Online Journal of Communication and Media Technologies, Year: 2024, Volume: 14(4), Article e202458

Published: Oct. 3, 2024

Clickbait is a concept whose research has been increasing since 2018. Four main approaches are distinguished: (1) the development of algorithms and programs to detect it, (2) the semantic techniques used in headlines and texts, (3) the awakening of curiosity in the audience, and (4) the credibility of the headlines. Therefore, this research is proposed as a systematic literature review with the objective of analyzing the trends in studies on clickbait in the Scopus and Web of Science databases from January 1, 2015, to December 31, 2023. For this, it uses the PRISMA declaration as a reference, that is, a simple random sampling technique and bibliographic analysis, according to the RSL guidelines. After applying the inclusion criteria, the research obtained a final sample of 165 studies. Among the results, it stands out that Europe (n = 77) has the largest number of works. Something similar happens with the English language: with 90%, it is the one with the greatest dissemination. Finally, the research established the most significant themes, the most widespread theories, and 11 properties that deepen the four initial approaches and explain the use of the term. This helps to delimit the path for future research.

Language: English

Cited by: 0

Multi-modal soft prompt-tuning for Chinese Clickbait Detection

Ye Wang, Yi Zhu, Yun Li, et al.

Neurocomputing, Year: 2024, Volume: 614, Article 128829

Published: Nov. 8, 2024

Language: English

Cited by: 0

Research on Stock Index Prediction Based on Stock Correlation Network and Deep Learning

Xueyan Li

Academic Journal of Computing & Information Science, Year: 2023, Volume: 6(4)

Published: Jan. 1, 2023

Aiming at the problem of stock index prediction, this work constructs a time series correlation network based on the fundamentals and technical indicators of the index components, then uses a deep graph neural network to learn a hierarchical representation of the network, obtaining a candidate prediction signal in an end-to-end way. The architecture composed of this method and strategy is called the DIFFPOOL architecture. Taking the CSI 300 index as the research object, and combining the architecture with a softmax classifier, long short-term memory (LSTM), linear regression, and logistic regression, respectively, the study uses a sliding window to obtain the corresponding prediction accuracy for the index. The accuracy of the combined model under the optimal parameters fluctuates in the interval [0.56, 0.62]. Ultimately, the models are evaluated using mean absolute error (MAE) and root mean square error (RMSE). The regression models are compared with LSTM, recurrent neural network (RNN), and back propagation (BP) neural network models. Compared with the single model, the MAE and RMSE of the combined model are smaller, 0.0061 and 0.0081, respectively. Experiments show that by aggregating node attribute information and association information hierarchically, we can dynamically capture the impact of different industry sectors on stock price fluctuations and further improve the prediction accuracy.
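The two error measures named in this abstract have standard definitions, which a minimal sketch can make concrete. The numbers below are made up for illustration and are unrelated to the CSI 300 experiments; the function names `mae` and `rmse` are illustrative.

```python
import math


def mae(y_true, y_pred):
    """Mean absolute error: the average of |true - predicted|."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def rmse(y_true, y_pred):
    """Root mean square error: square root of the average squared error."""
    mse = sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)
    return math.sqrt(mse)


# Made-up values: three true targets and three predictions.
y_true = [1.0, 2.0, 3.0]
y_pred = [1.0, 2.0, 5.0]

print(mae(y_true, y_pred), rmse(y_true, y_pred))
```

RMSE is never smaller than MAE on the same data, and it grows faster when a few errors are large, which is why the two are usually reported together, as in the abstract.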

Language: English

Cited by: 0