Predictive Model with Machine Learning for Environmental Variables and PM2.5 in Huachac, Junín, Perú
Emery Olarte, José Antonio Gutiérrez, Gerardo Roque et al.

Atmosphere, Journal Year: 2025, Volume and Issue: 16(3), P. 323

Published: March 12, 2025

PM2.5 pollution is increasing, causing health problems. The objective of this study was to model the behavior of PM2.5AQI (air quality index) using machine learning (ML) predictive models: linear regression, lasso, ridge, and elastic net. A total of 16,543 records from the Huachac, Junín area in Peru were used, with humidity (%) and temperature (°C) as regressors. The focus is on PM2.5AQI and environmental variables. Methods: Exploratory data analysis (EDA) was applied. Results: PM2.5AQI has high values in winter and spring, with averages of 52.6 and 36.9, respectively, and low values in summer, with a maximum value in September (spring) and a minimum in February (summer). The use of regression models produced precise metrics to choose the best model for the prediction of PM2.5AQI. Comparison with other research highlights the robustness of the chosen ML models, underlining the potential of ML for PM2.5AQI prediction. Conclusions: The best model found has α = 0.1111111 and Lambda λ = 0.150025, and is represented by PM2.5AQI = 83.0846522 − 10.302222000 (Humidity) − 0.1268124 (Temperature). With an adjusted R2 of 0.1483206 and an RMSE of 25.36203, it allows decision making in the care of the environment.
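As a minimal sketch (not the authors' code), the reported equation and the two quoted metrics can be written out in Python. The function names are hypothetical. The operator before the temperature term is read here as a subtraction, which is an assumption. The abstract also does not state how humidity and temperature were scaled before fitting, so inputs are assumed to be in whatever units the original fit used.

```python
import math

# Coefficients as quoted in the abstract.
INTERCEPT = 83.0846522
B_HUMIDITY = -10.302222000
# Assumption: the temperature coefficient is taken as negative.
B_TEMPERATURE = -0.1268124


def predict_pm25_aqi(humidity, temperature):
    """Evaluate the reported linear model for one observation.

    Assumption: both inputs use the same scaling as the original fit,
    which the abstract does not specify.
    """
    return INTERCEPT + B_HUMIDITY * humidity + B_TEMPERATURE * temperature


def rmse(observed, predicted):
    """Root mean squared error between two equal-length sequences."""
    squared = [(o - p) ** 2 for o, p in zip(observed, predicted)]
    return math.sqrt(sum(squared) / len(squared))


def adjusted_r2(r2, n_samples, n_predictors):
    """Adjusted R2, which penalizes R2 for the number of predictors."""
    return 1.0 - (1.0 - r2) * (n_samples - 1) / (n_samples - n_predictors - 1)
```

With 16,543 records and only two regressors, the adjustment factor is very close to 1, so the adjusted R2 and the plain R2 would be nearly identical for this model.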

Language: English

Citations

0

A Ensemble Model for PM2.5 Concentration Prediction Based on Feature Selection and Two-Layer Clustering Algorithm
Xiaoxuan Wu, Qiang Wen, Junxing Zhu et al.

Published: Aug. 18, 2023

Determining accurate PM2.5 pollution concentrations and understanding their dynamic patterns is crucial for scientifically informed air pollution control strategies. Traditional reliance on linear correlation coefficients for ascertaining PM2.5-related factors only uncovers superficial relationships. Moreover, the invariance of conventional prediction models restricts their accuracy. To enhance the precision of PM2.5 concentration prediction, this study introduces a novel integrated model that leverages feature selection and a clustering algorithm. Comprising three components - feature selection, clustering, and integrated prediction - the model first employs the non-dominated sorting Genetic Algorithm (NSGA-III) to identify the most impactful features affecting PM2.5 concentration within air pollutants and meteorological factors. This step offers more valuable feature data for the subsequent modules. The model then adopts a two-layer clustering method (SOM+K-means) to analyze the multifaceted irregularity within the dataset. Finally, the model establishes an Extreme Learning Machine (ELM) weak learner for each classification, integrating multiple weak learners using the Adaboost algorithm to obtain a comprehensive prediction model. Through feature enhancement, irregularity exploration, and adaptability improvement, the proposed model significantly enhances overall prediction performance. Data sourced from 12 Beijing-based monitoring sites in 2016 were utilized for an empirical study, and the model's results were compared with five other predictive models. The outcomes demonstrate that the proposed model heightens prediction accuracy, offering useful insights and potential for broadened application of multifactor concentration prediction methodologies to other pollutants.
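To illustrate the kind of weak learner the abstract refers to, the following is a minimal sketch of a single ELM regressor using NumPy. It is not the authors' implementation and omits the NSGA-III, SOM+K-means, and Adaboost stages entirely. The class name, the hidden-layer size, the tanh activation, the random seed, and the toy sine data are all assumptions chosen for illustration.

```python
import numpy as np


class ELMRegressor:
    """Single-hidden-layer ELM: hidden weights are random and fixed,
    only the output weights are solved for, by least squares."""

    def __init__(self, n_hidden=20, seed=0):
        self.n_hidden = n_hidden
        self.rng = np.random.default_rng(seed)

    def _hidden(self, X):
        # Random nonlinear feature map; tanh is an assumed activation.
        return np.tanh(X @ self.W + self.b)

    def fit(self, X, y):
        X = np.asarray(X, dtype=float)
        y = np.asarray(y, dtype=float)
        # Draw the input-to-hidden weights once; they are never trained.
        self.W = self.rng.normal(size=(X.shape[1], self.n_hidden))
        self.b = self.rng.normal(size=self.n_hidden)
        H = self._hidden(X)
        # Output weights via the Moore-Penrose pseudo-inverse of H.
        self.beta = np.linalg.pinv(H) @ y
        return self

    def predict(self, X):
        return self._hidden(np.asarray(X, dtype=float)) @ self.beta


# Toy usage on synthetic data (not the Beijing dataset).
X = np.linspace(-1.0, 1.0, 50).reshape(-1, 1)
y = np.sin(3.0 * X[:, 0])
model = ELMRegressor(n_hidden=20, seed=0).fit(X, y)
train_mse = float(np.mean((model.predict(X) - y) ** 2))
```

Because training reduces to one linear solve, an ELM is cheap to fit, which is one reason it suits a boosting scheme that trains many learners.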

Language: English

Citations

2