Categorizing E-Cigarette-related Tweets using BERT Topic Modeling DOI Creative Commons

D. Murthy,

Shuvam Keshari,

Sonia Arora

et al.

Emerging Trends in Drugs Addictions and Health, Journal Year: 2024, Volume and Issue: 4, P. 100160 - 100160

Published: Oct. 5, 2024

Language: English

Generative artificial intelligence and machine learning methods to screen social media content DOI Creative Commons
Kellen Sharp, Rachel R. Ouellette, Ranjan K. Singh

et al.

PeerJ Computer Science, Journal Year: 2025, Volume and Issue: 11, P. e2710 - e2710

Published: March 14, 2025

Background: Social media research is confronted by the expansive and constantly evolving nature of social media data. Hashtags and keywords are frequently used to identify content related to a specific topic, but these search strategies often result in large numbers of irrelevant results. Therefore, methods are needed to quickly screen social media content based on a specific research question. The primary objective of this article is to present generative artificial intelligence (AI; e.g., ChatGPT) and machine learning methods to screen content from social media platforms. As a proof of concept, we apply these methods to identify TikTok content related to e-cigarette use during pregnancy. Methods: We searched TikTok for pregnancy and vaping content using 70 hashtag pairs related to “pregnancy” and “vaping” (e.g., #pregnancytok and #ecigarette) to obtain 11,673 distinct posts. We extracted post videos, descriptions, and metadata using Zeeschuimer and the PykTok library. To enhance textual analysis, we employed automatic speech recognition via the Whisper system to transcribe verbal content from each video. Next, we used the OpenCV library to extract frames from the videos, followed by object and text detection analysis using Oracle Cloud Vision. Finally, we merged all text data to create a consolidated dataset and entered this dataset into ChatGPT-4 to determine which posts are related to vaping and pregnancy. To refine the ChatGPT prompt used to identify content, a human coder cross-checked ChatGPT-4’s outputs for 10 out of every 100 entries, with errors used to inform the final prompt. The final prompt was evaluated through human review, confirming that posts contain “pregnancy” and “vape” content by comparing human determinations to those made by ChatGPT. Results: Our results indicated that ChatGPT-4 classified 44.86% of the videos as exclusively related to pregnancy, 36.91% to vaping, and 8.91% as containing both topics. A human reviewer confirmed 45.38% of the posts identified by ChatGPT-4 as containing relevant content. Human review of 10% of the posts screened out by ChatGPT-4 indicated a 99.06% agreement rate for excluded posts. Conclusions: ChatGPT has mixed capacity to screen social media content that has been converted into text data using machine learning techniques such as object detection. ChatGPT’s sensitivity was found to be lower than that of a human coder in the current case example, but it demonstrated power for screening out irrelevant content and can be used as an initial pass at screening content. Future studies should explore ways to enhance ChatGPT’s sensitivity.
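The human cross-check described in the abstract (reviewing 10 out of every 100 entries and reporting an agreement rate) can be illustrated with a minimal sketch. This is not the authors' code: the function names `sample_for_review` and `agreement_rate`, the random selection within each block, and the fixed seed are all assumptions made for illustration, since the abstract does not say how the 10 entries were chosen.

```python
import random


def sample_for_review(entries, block_size=100, per_block=10, seed=0):
    """Pick `per_block` entries from each consecutive block of `block_size`.

    Assumption: entries are drawn at random within each block; the abstract
    only states that 10 out of every 100 entries were cross-checked.
    """
    rng = random.Random(seed)  # fixed seed so the sample is reproducible
    sampled = []
    for start in range(0, len(entries), block_size):
        block = entries[start:start + block_size]
        # A short final block contributes at most its own length.
        sampled.extend(rng.sample(block, min(per_block, len(block))))
    return sampled


def agreement_rate(model_labels, human_labels):
    """Fraction of reviewed entries where model and human labels match."""
    if len(model_labels) != len(human_labels):
        raise ValueError("label lists must have the same length")
    if not model_labels:
        raise ValueError("no labels to compare")
    matches = sum(m == h for m, h in zip(model_labels, human_labels))
    return matches / len(model_labels)


if __name__ == "__main__":
    # Hypothetical data: 250 entry IDs give three blocks (100, 100, 50).
    review_ids = sample_for_review(list(range(250)))
    print(len(review_ids))  # 30 entries selected for human review

    # Hypothetical labels for four reviewed entries.
    model = ["relevant", "irrelevant", "relevant", "relevant"]
    human = ["relevant", "irrelevant", "irrelevant", "relevant"]
    print(agreement_rate(model, human))  # 0.75
```

Sampling per block rather than across the whole dataset keeps the review spread evenly over the entries, which matches the "10 out of every 100" wording; a different selection rule would work equally well with `agreement_rate`.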

Language: English

Citations

0

Categorizing E-Cigarette-related Tweets using BERT Topic Modeling DOI Creative Commons

D. Murthy,

Shuvam Keshari,

Sonia Arora

et al.

Emerging Trends in Drugs Addictions and Health, Journal Year: 2024, Volume and Issue: 4, P. 100160 - 100160

Published: Oct. 5, 2024

Language: English

Citations

1