Prediction of Diabetes Disease Based on Stacking Ensemble Using Oversampling Method and Hyperparameters
Alfredo Daza Vergaray, Carlos Fidel Ponce Sánchez, Oscar Gonzalo Apaza Pérez et al.

Published: Jan. 1, 2023

Background: Diabetes is a very common disease today and has acquired a worrying focus in the field of public health globally; in fact, it is estimated that the number of people with diabetes worldwide reached 415 million. Objective: Propose a method of 4 combined models based on Stacking in order to predict diabetes. In addition, a web interface was developed with the best model proposed in this study. Methods: The dataset collected from Dataset composed of 768 patient records was used. The data was then pre-processed using the Python programming language. To balance the data, it was divided into values and an oversampling was applied to distribute it proportionally. Then, divisions were made balanced cross-validation for training, calibrated. Regarding the development of the base algorithms, 7 independent algorithms were used, proposed, finally obtain evaluation their respective metrics. Results: 1A (Logistic regression) Oversampling value Accuracy = 91.50%, Sensitivity = 91.60%, F1-Score = 91.49%, Precision = 91.50%, while respect metric ROC Curve, Oversampling, 2A (Random Forest) oversampling, Random Forest (Independent) percentage, being 97.00%. Conclusions: Implementing the stacking method helps to make an adequate diagnosis of diabetes. Therefore, by improvement prediction observed, surpassing performance

Language: English
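
The balancing step mentioned in the abstract above can be illustrated with a minimal sketch of random oversampling. The abstract does not say which oversampling variant was used, so the function `random_oversample`, the toy records, and the fixed seed are all assumptions made for illustration only.

```python
import random
from collections import Counter


def random_oversample(rows, labels, seed=0):
    """Duplicate minority-class rows at random until every class
    has as many rows as the largest class (hypothetical helper)."""
    rng = random.Random(seed)
    counts = Counter(labels)
    target = max(counts.values())
    out_rows, out_labels = list(rows), list(labels)
    for cls, n in counts.items():
        # Rows that belong to the current class.
        pool = [r for r, y in zip(rows, labels) if y == cls]
        # Add randomly chosen copies until the class reaches the target size.
        for _ in range(target - n):
            out_rows.append(rng.choice(pool))
            out_labels.append(cls)
    return out_rows, out_labels


# Toy data (assumed): 6 negative and 2 positive records.
rows = [[i] for i in range(8)]
labels = [0, 0, 0, 0, 0, 0, 1, 1]
balanced_rows, balanced_labels = random_oversample(rows, labels)
print(Counter(balanced_labels))
```

After the call, both classes contain the same number of records, which is the proportional distribution the abstract refers to.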

A Novel Approach for Predicting the Survival of Colorectal Cancer Patients Using Machine Learning Techniques and Advanced Parameter Optimization Methods (Open Access)
Andrzej Woźniacki, Wojciech Książek, Patrycja Mrowczyk et al.

Cancers, Year: 2024, No. 16(18), p. 3205

Published: Sep. 20, 2024

Background: Colorectal cancer is one of the most prevalent forms of cancer and is associated with a high mortality rate. Additionally, an increasing number of adults under 50 are being diagnosed with the disease. This underscores the importance of leveraging modern technologies, such as artificial intelligence, for early diagnosis and treatment support. Methods: Eight classifiers were utilized in this research: Random Forest, XGBoost, CatBoost, LightGBM, Gradient Boosting, Extra Trees, the k-nearest neighbor algorithm (KNN), and decision trees. These algorithms were optimized using the frameworks Optuna, RayTune, and HyperOpt. This study was conducted on a public dataset from Brazil, containing information on tens of thousands of patients. Results: The models developed demonstrated classification accuracy in predicting one-, three-, and five-year survival, as well as overall and cancer-specific mortality. Forest delivered best performance, achieving approximately 80% across all evaluated tasks. Conclusions: This research enabled the development of effective models that can be applied in clinical practice.

Language: English

Cited by: 9

Stacking ensemble approach to diagnosing the disease of diabetes (Creative Commons)
Alfredo Daza Vergaray, Carlos Fidel Ponce Sánchez, Gonzalo Apaza-Perez et al.

Informatics in Medicine Unlocked, Year: 2023, No. 44, p. 101427

Published: Dec. 12, 2023

Diabetes is a very common disease today and has acquired a worrying focus in the field of public health globally; in fact, it is estimated that the number of people with diabetes worldwide reached 415 million. Propose a method of 4 combined models based on Stacking ensemble to diagnose Diabetes. In addition, a web interface was developed with the best model proposed in this study. The dataset collected from Dataset composed of 768 patient records was used. The data was then pre-processed using the Python programming language. To balance the data, it was divided into values and an oversampling was applied to distribute it proportionally. Then, divisions were made balanced cross-validation for training, calibrated. Regarding the development of the base algorithms, 7 independent algorithms were used, proposed, finally obtain evaluation their respective metrics. 1A (Logistic regression) Oversampling value Accuracy = 91.5 %, Sensitivity = 91.6 %, F1-Score = 91.49 %, Precision = 91.5 %, while respect metric ROC Curve, Oversampling, 2A (Random Forest) oversampling, Random Forest (Independent) percentage, being 97 %. Implementing the stacking method helps to make an adequate diagnosis of diabetes. Therefore, by improvement prediction observed, surpassing performance

Language: English

Cited by: 13
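
The four metrics reported in the abstract above (Accuracy, Sensitivity, Precision, F1-Score) follow from the counts of a binary confusion matrix. The sketch below shows the standard definitions on made-up labels; the function name `binary_metrics` and the toy data are assumptions, and the numbers are unrelated to the paper's results.

```python
def binary_metrics(y_true, y_pred, positive=1):
    """Compute accuracy, sensitivity (recall), precision and F1
    from true and predicted binary labels (hypothetical helper)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    tn = sum(1 for t, p in pairs if t != positive and p != positive)
    accuracy = (tp + tn) / len(pairs)
    # Guard against division by zero when a class is never seen or predicted.
    sensitivity = tp / (tp + fn) if (tp + fn) else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    denom = precision + sensitivity
    f1 = 2 * precision * sensitivity / denom if denom else 0.0
    return {
        "accuracy": accuracy,
        "sensitivity": sensitivity,
        "precision": precision,
        "f1": f1,
    }


# Toy labels (assumed): 3 true positives, 3 true negatives,
# 1 false positive and 1 false negative.
y_true = [1, 1, 1, 0, 0, 0, 1, 0]
y_pred = [1, 1, 0, 0, 0, 1, 1, 0]
metrics = binary_metrics(y_true, y_pred)
print(metrics)
```

Sensitivity and precision differ only in the denominator: the first divides by all actual positives, the second by all predicted positives, and F1 is their harmonic mean.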

Sentiment Analysis on E-Commerce Product Reviews Using Machine Learning and Deep Learning Algorithms: A Bibliometric Analysis, Systematic Literature Review, Challenges and Future Works (Creative Commons)
Alfredo Daza Vergaray, Néstor Daniel González Rueda, Mirelly Sonia Aguilar Sánchez et al.

International Journal of Information Management Data Insights, Year: 2024, No. 4(2), p. 100267

Published: July 18, 2024

Language: English

Cited by: 4

Software Defect Prediction Based on a Multiclassifier with Hyperparameters: Future Work (Creative Commons)
Alfredo Daza Vergaray

Results in Engineering, Year: 2025, No. unknown, p. 104123

Published: Jan. 1, 2025

Language: English

Cited by: 0

Predicting the Repayment Decisions of Korean Vulnerable Debtors: Evidence from an Empirical Study Utilizing a Stacking Algorithm
Youngwoo Jeong

Computational Economics, Year: 2025, No. unknown

Published: Feb. 12, 2025

Language: English

Cited by: 0

Systematic review of machine learning techniques to predict anxiety and stress in college students (Creative Commons)
Alfredo Daza Vergaray, Nemías Saboya, Jorge Isaac Necochea-Chamorro et al.

Informatics in Medicine Unlocked, Year: 2023, No. 43, p. 101391

Published: Jan. 1, 2023

Anxiety is considered one of the most common pathologies that people go through frequently, this being the main cause of illness and disability in students, since it affects more women, with 7.7%, than men, with 3.6%. Moreover, stress also causes some health-related problems, such as cardiovascular diseases and mental disorders. The purpose of this study is to gain a deeper understanding of the methodologies, attributes, selection algorithms, as well as the techniques, tools or programming languages, and metrics of machine learning algorithms that have been applied in the prediction of anxiety and stress in college students. An exhaustive search of 29 articles was performed, using keywords from 7 databases: ScienceDirect, IEEE Xplore, ACM, Scopus, Springer Link, InderScience and Wiley, from 2019 to 2023. This article is based on the Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) methodology, taking into account inclusion and exclusion criteria. To then make synthesis findings studies about following aspects languages metrics. methodology used sequence steps, important attributes were age gender, do not use variable techniques; other hand, efficient techniques Support Vector Machine (SVM) Logistic regression (LR), language develop models Python finally essential determine effectiveness model Precision Accuracy. systematic review provides scientific evidence, results describing how help predict stress. For this, are compared perform broad analysis these Programming metrics, variables influential factors, which will medical fields detection

Language: English

Cited by: 6

Predictive Modeling of Anxiety Levels in Bangladeshi University Students: A Voting-Based Approach with LIME and SHAP Explanations
Mohammad Tanvirul Islam, Kahakashan Ashraf, Md.Hamid Hosen et al.

Published: March 8, 2024

In today's developing world, anxiety is a common mental disorder among university students. In this work, we predict anxiety in students using a voting classifier. We have applied explainable artificial intelligence (XAI) to gain a better understanding of the machine learning model. Using Google Form, the dataset was gathered from several public, private, and national universities in Bangladesh. We compared the algorithms with 20 selected features and without feature selection. By using the concept of voting, we created a new model. In order to create our final model, we selected the best three ML algorithms based on their accuracy. The voting classifier has the highest accuracy of 96%, while F1 Recall scores are Precision 97%. LIME SHAP models used explain model predictions instead black-box study determines levels by particular observations, allowing for transparency comprehension. The goal of anxiety prediction is to detect at-risk individuals and the root causes of anxiety in students, consequently reducing the detrimental consequences on academic performance and well-being.

Language: English

Cited by: 1
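
The voting idea described in the abstract above, where the predictions of the three best models are combined, can be sketched as a simple majority vote. The abstract does not state whether hard or soft voting was used, so hard (majority) voting, the function `hard_vote`, and the three toy prediction lists are assumptions.

```python
from collections import Counter


def hard_vote(predictions_per_model):
    """Combine per-model label lists by majority vote,
    one sample at a time (hypothetical helper)."""
    combined = []
    for sample_votes in zip(*predictions_per_model):
        # most_common(1) returns the label with the highest vote count.
        winner = Counter(sample_votes).most_common(1)[0][0]
        combined.append(winner)
    return combined


# Toy predictions (assumed) from three models on four samples.
model_a = [1, 0, 1, 1]
model_b = [1, 1, 0, 1]
model_c = [0, 0, 1, 1]
ensemble = hard_vote([model_a, model_b, model_c])
print(ensemble)
```

With an odd number of models and binary labels, a tie cannot occur, which is one practical reason to combine three classifiers rather than two or four.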

Fluoride contamination in African groundwater: Predictive modeling using stacking ensemble techniques
Usman Sunusi Usman, Yousif Hassan Mohamed Salh, Bing Yan et al.

The Science of The Total Environment, Year: 2024, No. 957, p. 177693

Published: Nov. 25, 2024

Language: English

Cited by: 1

Machine learning model for the prediction of landslides due to the “El Niño” phenomenon in Peruvian educational institutions
Ronald Edward Mansilla Musaja, Antonio Arroyo-Paz

Published: Dec. 13, 2023

This study develops a machine learning model to predict landslides induced by the "El Niño" phenomenon in educational institutions in Peru. We use a dataset of 55,335 records from the National Center for Estimation, Prevention and Reduction of Disaster Risk (CENEPRED), including geographic and vulnerability characteristics. The study focuses on assessing landslide risk, considering variables such as latitude, longitude, and the evacuation plans of the institutions, as well as their susceptibility to mass movements. Machine learning algorithms have been used to predict landslides that may affect infrastructure, obtaining as a result Random Forest with 86.23% accuracy, Decision Tree with 83.19%, KNN with 69.85%, and MultiLayer Perceptron with 41.49%, concluding that the Random Forest algorithm has the best accuracy.

Language: English

Cited by: 2

Machine Learning and Deep Learning Techniques to Predict Software Defects: A Bibliometric Analysis, Systematic Review, Challenges and Future Works
Alfredo Daza Vergaray, Oscar Gonzalo Apaza Pérez, Jhon Alexander Zagaceta Daza et al.

Published: Jan. 1, 2024

In Australia, approximately 66.00% of projects exceeded the programmed budget and 33% were out of time, all of them due to software failures. The purpose of this study is to gain a deeper understanding of the quartiles, countries, keywords, techniques, metrics, tools, platforms or languages, variables, data source and dataset that have been used in predicting software defects. A comprehensive search of 55 articles was conducted, using keywords from 5 databases: Scopus, ProQuest, ScienceDirect, Ebscohost, and Web of Science, from 2019 to 2023. This article is based on the PRISMA (Preferred Reporting Items for Systematic Reviews and Meta-Analysis) methodology, taking into account inclusion and exclusion criteria. To then make synthesis findings studies following aspects such as dataset. The most techniques Support Vector Machine (SVM) Random Forest (RF), along with Accuracy F1-Score programming language Python, prominent variables Kilo (thousands) lines code (KLOC) Cyclomatic complexity, finally NASA's Metrics Data Program Repository data source range minimum 759 instances 37 attributes maximum 3579 38 projects: CM1, MW1, PC1, PC3 PC4. systematic review provides scientific evidence, results describing how machine learning help predict

Language: English

Cited by: 0