A Machine Learning-Based Computational Methodology for Predicting Acute Respiratory Infections Using Social Media Data
José Manuel Ramos-Varela, Juan C. Cuevas‐Tello, Daniel E. Noyola, et al.

Computation, Journal Year: 2025, Volume and Issue: 13(4), P. 86 - 86

Published: March 25, 2025

We study the relationship between tweets referencing Acute Respiratory Infections (ARI) or COVID-19 symptoms and confirmed cases of these diseases. Additionally, we propose a computational methodology for selecting and applying Machine Learning (ML) algorithms to predict public health indicators using social media data. To achieve this, a novel pipeline was developed, integrating three distinct models for ARI and COVID-19. The dataset contains tweets related to respiratory diseases, published between 2020 and 2022 in the state of San Luis Potosí, Mexico, and obtained via the Twitter API (now X). The methodology is composed of three stages, and it involves tools such as Dataiku and Python with ML libraries. The first two stages focus on identifying the best-performing predictive models, while the third stage includes Natural Language Processing (NLP) for tweet selection. One of our key findings is that tweets contributed to improved predictions […] but did not enhance […] time series predictions. The NLP approach is a combination of the Word2Vec algorithm and the KMeans model. Furthermore, both predictions improved by 3% in the second half of […] when tweets were included as a feature, where the best prediction algorithm is DeepAR.
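The abstract names Word2Vec combined with KMeans as the NLP approach for tweet selection but gives no implementation details. The following is only a minimal sketch of that general idea, under stated assumptions: the 2-D vectors in `TOY_VECTORS` are hand-made stand-ins for trained Word2Vec embeddings, the `kmeans` function is a plain Lloyd's iteration rather than a library call, and the rule of keeping the cluster closest to a seed word (`select_tweets`) is a hypothetical selection criterion, not the authors' method.

```python
import math
import random

# Hypothetical 2-D word vectors standing in for trained Word2Vec embeddings.
TOY_VECTORS = {
    "fever": (0.9, 0.1),
    "cough": (0.8, 0.2),
    "flu": (0.85, 0.15),
    "concert": (0.1, 0.9),
    "tickets": (0.2, 0.8),
    "music": (0.15, 0.85),
}

# Illustrative tweets only; not taken from the paper's dataset.
TWEETS = [
    "fever and cough all night",
    "flu season cough",
    "good morning everyone",
    "concert tickets tonight",
    "music concert",
    "fever flu",
]


def embed(tweet):
    """Average the vectors of known words; return None if no word is known."""
    vecs = [TOY_VECTORS[w] for w in tweet.lower().split() if w in TOY_VECTORS]
    if not vecs:
        return None
    return tuple(sum(c) / len(vecs) for c in zip(*vecs))


def kmeans(points, k, iterations=20, seed=0):
    """Plain Lloyd's k-means; returns (centroids, labels)."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: each point goes to its nearest centroid.
        labels = [
            min(range(k), key=lambda j: math.dist(p, centroids[j]))
            for p in points
        ]
        # Update step: each centroid moves to the mean of its members.
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:
                centroids[j] = tuple(
                    sum(c) / len(members) for c in zip(*members)
                )
    return centroids, labels


def select_tweets(tweets, seed_word, k=2):
    """Keep tweets in the cluster whose centroid is nearest to seed_word.

    The seed-word rule is an assumption made for this sketch.
    """
    embedded = [(t, embed(t)) for t in tweets]
    embedded = [(t, v) for t, v in embedded if v is not None]
    points = [v for _, v in embedded]
    centroids, labels = kmeans(points, k)
    seed_vec = TOY_VECTORS[seed_word]
    target = min(range(k), key=lambda j: math.dist(seed_vec, centroids[j]))
    return [t for (t, _), lab in zip(embedded, labels) if lab == target]


if __name__ == "__main__":
    for tweet in select_tweets(TWEETS, "fever"):
        print(tweet)
```

In a real pipeline the toy dictionary would presumably be replaced by embeddings trained on the tweet corpus, and the selected tweets (for example, their daily counts) could then serve as an extra feature for the time series models.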

Language: English
