Early Prediction of Diabetic Using Data Mining

Fayzeh Abdulkareem Jaber, Joy Winston James

SN Computer Science, Year: 2023, Vol. 4(2)

Published: Jan. 17, 2023

Language: English

Diabetes mellitus prediction and diagnosis from a data preprocessing and machine learning perspective (Creative Commons)
Chollette C. Olisah, Lyndon Smith, Melvyn Smith, et al.

Computer Methods and Programs in Biomedicine, Year: 2022, Vol. 220, p. 106773

Published: March 31, 2022

Diabetes mellitus is a metabolic disorder characterized by hyperglycemia, which results from the inadequacy of the body to secrete and respond to insulin. If not properly managed or diagnosed on time, diabetes can pose a risk to vital organs such as the eyes, kidneys, nerves, heart, and blood vessels, and so can be life-threatening. Many years of research in computational diagnosis of diabetes have pointed to machine learning as a viable solution for the prediction of diabetes. However, the accuracy rate to date suggests that there is still much room for improvement. In this paper, we propose a machine learning framework for diabetes prediction and diagnosis using the PIMA Indian dataset and the laboratory of the Medical City Hospital (LMCH) diabetes dataset. We hypothesize that adopting feature selection and missing value imputation methods can scale up the performance of classification models in diabetes prediction and diagnosis. A robust framework for building a diabetes prediction model to aid clinical diagnosis is proposed. The framework includes the adoption of Spearman correlation and polynomial regression for feature selection and missing value imputation, respectively, from a perspective that strengthens their performances. Further, different supervised machine learning models, the random forest (RF) model, the support vector machine (SVM) model, and our designed twice-growth deep neural network (2GDNN) model, are proposed for classification. The models are optimized by tuning their hyperparameters using grid search and repeated stratified k-fold cross-validation, and are evaluated for their ability to scale to the prediction problem. Through experiments on the PIMA Indian and LMCH datasets, precision, sensitivity, F1-score, train-accuracy, and test-accuracy scores of 97.34%, 97.24%, 97.26%, 99.01%, 97.25 and 97.28%, 97.33%, 97.27%, 99.57%, 97.33, respectively, are achieved with the proposed 2GDNN model. The data preprocessing approaches and the classifiers with hyperparameter optimization proposed within the framework yield a robust model that outperforms state-of-the-art results in diabetes mellitus prediction and diagnosis. The source code has been made publicly available.
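
The abstract above names Spearman correlation as the basis for feature selection. As a rough illustration of that idea only, and not the authors' published code, the sketch below ranks each feature against the class label and keeps features whose absolute rank correlation meets a cutoff. The function names, the toy data, and the 0.5 threshold are all assumptions made for this example.

```python
from typing import Dict, List, Sequence


def average_ranks(values: Sequence[float]) -> List[float]:
    """Return 1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        # Extend the run while the next sorted value ties with the current one.
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        shared = (start + end) / 2 + 1  # mean of the 1-based positions in the run
        for k in range(start, end + 1):
            ranks[order[k]] = shared
        start = end + 1
    return ranks


def pearson(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson correlation; returns 0.0 when either input is constant."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / (var_x * var_y) ** 0.5


def spearman(x: Sequence[float], y: Sequence[float]) -> float:
    """Spearman correlation is the Pearson correlation of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


def select_features(
    columns: Dict[str, Sequence[float]],
    target: Sequence[float],
    threshold: float,  # hypothetical cutoff, not taken from the paper
) -> List[str]:
    """Keep features whose |rho| against the target meets the threshold."""
    return [
        name
        for name, column in columns.items()
        if abs(spearman(column, target)) >= threshold
    ]


if __name__ == "__main__":
    # Toy data invented for illustration; not drawn from PIMA or LMCH.
    toy = {
        "glucose": [85, 168, 122, 99, 150, 90],
        "noise": [3, 1, 4, 1, 5, 9],
    }
    labels = [0, 1, 1, 0, 1, 0]
    print(select_features(toy, labels, threshold=0.5))
```

In practice a library routine such as SciPy's `spearmanr` would normally replace the hand-written ranking; the pure-Python version is used here only to keep the sketch self-contained.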

Language: English

Cited by: 131

Hybrid stacked ensemble combined with genetic algorithms for diabetes prediction (Open Access)
Jafar Abdollahi, Babak Nouri-Moghaddam

Iran Journal of Computer Science, Year: 2022, Vol. 5(3), pp. 205-220

Published: March 21, 2022

Language: English

Cited by: 76

Diabetes prediction model using data mining techniques (Creative Commons)

Rashi Rastogi, Mamta Bansal

Measurement Sensors, Year: 2022, Vol. 25, p. 100605

Published: Dec. 5, 2022

Diabetes is the leading cause of death in the world, and it also contributes to kidney disease, loss of vision, and heart disease. Data mining techniques contribute to health care decisions for accurate disease diagnosis and treatment, reducing the workload of experts. Diabetes prediction is a rapidly expanding field of research. Early prediction of diabetes will result in improved treatment. Diabetes causes a variety of health issues. Therefore, it is critical to prevent, monitor, and raise awareness about it. Type 1 and Type 2 diabetes can cause heart disease, renal problems, and eye difficulties. In this paper, we propose a diabetes prediction model using data mining techniques. We apply four data mining techniques: Random Forest, Support Vector Machine (SVM), Logistic Regression, and Naive Bayes. The proposed mechanism is trained using Python and analysed with a real dataset, which is collected from Kaggle. Furthermore, the performance of the proposed mechanism is analysed with the confusion matrix, sensitivity, and accuracy metrics. With logistic regression, the accuracy is high, i.e., 82.46%, in comparison with the other data mining techniques.
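
The abstract above evaluates its models with a confusion matrix, sensitivity, and accuracy. For readers unfamiliar with how those quantities relate, here is a minimal sketch for binary labels (1 = diabetic, 0 = non-diabetic). The class and function names and the toy labels are illustrative assumptions; nothing here reproduces the paper's data or its 82.46% figure.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass(frozen=True)
class ConfusionMatrix:
    tp: int  # true positives: predicted 1, actually 1
    fp: int  # false positives: predicted 1, actually 0
    tn: int  # true negatives: predicted 0, actually 0
    fn: int  # false negatives: predicted 0, actually 1


def confusion_matrix(y_true: Sequence[int], y_pred: Sequence[int]) -> ConfusionMatrix:
    """Count the four outcome types for binary 0/1 labels."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    tp = fp = tn = fn = 0
    for truth, pred in zip(y_true, y_pred):
        if truth == 1 and pred == 1:
            tp += 1
        elif truth == 0 and pred == 1:
            fp += 1
        elif truth == 0 and pred == 0:
            tn += 1
        elif truth == 1 and pred == 0:
            fn += 1
        else:
            raise ValueError("labels must be 0 or 1")
    return ConfusionMatrix(tp, fp, tn, fn)


def sensitivity(cm: ConfusionMatrix) -> float:
    """Share of actual positives that were detected: TP / (TP + FN)."""
    positives = cm.tp + cm.fn
    return cm.tp / positives if positives else 0.0


def accuracy(cm: ConfusionMatrix) -> float:
    """Share of all predictions that were correct: (TP + TN) / total."""
    total = cm.tp + cm.fp + cm.tn + cm.fn
    return (cm.tp + cm.tn) / total if total else 0.0


if __name__ == "__main__":
    # Toy labels invented for illustration only.
    truth = [1, 1, 1, 0, 0, 0, 0, 1]
    predicted = [1, 0, 1, 0, 0, 1, 0, 1]
    cm = confusion_matrix(truth, predicted)
    print(cm, sensitivity(cm), accuracy(cm))
```

Sensitivity matters in screening settings because a false negative (a missed diabetic patient) is usually costlier than a false positive, which accuracy alone does not reveal.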

Language: English

Cited by: 72

Effective Handling of Missing Values in Datasets for Classification Using Machine Learning Methods (Creative Commons)

Ashokkumar Palanivinayagam, Robertas Damaševičius

Information, Year: 2023, Vol. 14(2), p. 92

Published: Feb. 3, 2023

The existence of missing values reduces the amount of knowledge learned by machine learning models in the training stage, thus affecting classification accuracy negatively. To address this challenge, we introduce the use of Support Vector Machine (SVM) regression for imputing the missing values. Additionally, we propose a two-level classification process to reduce the number of false classifications. Our evaluation of the proposed method was conducted using the PIMA Indian dataset for diabetes classification. We compared the performance of five different machine learning models: Naive Bayes (NB), Support Vector Machine (SVM), k-Nearest Neighbours (KNN), Random Forest (RF), and Linear Regression (LR). The results of our experiments show that the SVM classifier achieved the highest accuracy of 94.89%. Precision reached 98.80% with the RF classifier, and recall reached 85.48%. The NB model reached an F1-Score of 95.59%. Our proposed method provides a promising solution for detecting diabetes at an early stage by addressing the issue of missing values in the dataset. The proposed approach can notably improve the performance of machine learning models for diabetes classification. This work is a valuable contribution to the field of diabetes research and highlights the importance of addressing missing values in machine learning applications.

Language: English

Cited by: 43

Prediction of Diabetes Complications Using Computational Intelligence Techniques (Creative Commons)
Turki Alghamdi

Applied Sciences, Year: 2023, Vol. 13(5), p. 3030

Published: Feb. 27, 2023

Diabetes is a complex disease that can lead to serious health complications if left unmanaged. Early detection and treatment of diabetes is crucial, and data analysis and predictive techniques can play a significant role. Data mining techniques, such as classification and prediction models, can be used to analyse various aspects of data related to diabetes and to extract useful information for early detection and prediction of the disease. The XGBoost classifier is a machine learning algorithm that effectively predicts diabetes with high accuracy. This algorithm uses a gradient-boosting framework and can handle large datasets with high-dimensional features. However, it is important to note that the choice of the best algorithm for predicting diabetes may depend on the specific characteristics of the data and the research question being addressed. In addition to predicting diabetes, data mining techniques can also be used to identify risk factors for diabetes and its complications, monitor disease progression, and evaluate the effectiveness of treatments. These techniques can provide valuable insights into the underlying mechanisms of the disease and help healthcare providers make informed decisions about patient care. Data mining techniques have the potential to significantly improve the early detection and management of diabetes, a fast-growing chronic disease with notable health hazards. The XGBoost classifier showed the most effectiveness, with an accuracy rate of 89%.

Language: English

Cited by: 24

A Proposed Technique Using Machine Learning for the Prediction of Diabetes Disease through a Mobile App (Creative Commons)
Hosam F. El-Sofany, Samir Abou El-Seoud, Omar H. Karam, et al.

International Journal of Intelligent Systems, Year: 2024, Vol. 2024, pp. 1-13

Published: Jan. 9, 2024

With the increasing prevalence of diabetes in Saudi Arabia, there is a critical need for early detection and prediction of the disease to prevent long-term health complications. This study addresses this need by using machine learning (ML) techniques applied to the Pima Indians dataset and private diabetes datasets through the implementation of a computerized system for predicting diabetes. In contrast to prior research, this study employs a semisupervised model combined with strong gradient boosting, effectively predicting the diabetes-related features of the dataset. Additionally, the researchers employ the SMOTE technique to deal with the problem of imbalanced classes. Ten ML classification techniques, including logistic regression, random forest, KNN, decision tree, bagging, AdaBoost, XGBoost, voting, SVM, and Naive Bayes, are evaluated to determine the algorithm that produces the most accurate diabetes prediction. The proposed approach has achieved impressive performance. For one dataset, XGBoost achieved an accuracy of 97.4%, an F1 coefficient of 0.95, and an AUC of 0.87. For the other datasets, it achieved 83.1%, 0.76, and 0.85, respectively. To understand how the model predicts the final results, explainable AI methods based on SHAP are implemented. Furthermore, the study demonstrates the adaptability of the proposed system by applying a domain adaptation method. To further enhance accessibility, a mobile app has been developed for instant diabetes prediction based on user-entered features. This study contributes novel insights to the field of ML-based diabetic prediction, potentially aiding the early detection and management of diabetes in Saudi Arabia.
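
The abstract above relies on SMOTE to balance the classes. The core idea is to create a synthetic minority sample by interpolating between a minority point and one of its nearest minority neighbours. The sketch below shows only that interpolation step under simplifying assumptions: the function name `smote_like`, the default `k=2`, the fixed seed, and the toy points are all hypothetical, and this is neither the reference SMOTE implementation nor the study's code.

```python
import random
from typing import List, Sequence

Point = Sequence[float]


def squared_distance(a: Point, b: Point) -> float:
    """Squared Euclidean distance between two points of equal dimension."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def smote_like(
    minority: List[Point],
    n_new: int,
    k: int = 2,  # number of nearest neighbours to sample from (assumed default)
    seed: int = 0,  # fixed seed so the sketch is reproducible
) -> List[List[float]]:
    """Generate n_new synthetic points by interpolating toward near neighbours."""
    if len(minority) < 2:
        raise ValueError("need at least two minority samples to interpolate")
    rng = random.Random(seed)
    k = min(k, len(minority) - 1)
    synthetic: List[List[float]] = []
    for _ in range(n_new):
        i = rng.randrange(len(minority))
        base = minority[i]
        # Indices of the k minority points closest to the chosen base point.
        neighbours = sorted(
            (j for j in range(len(minority)) if j != i),
            key=lambda j: squared_distance(base, minority[j]),
        )[:k]
        mate = minority[rng.choice(neighbours)]
        gap = rng.random()  # position along the segment from base to mate
        synthetic.append([b + gap * (m - b) for b, m in zip(base, mate)])
    return synthetic


if __name__ == "__main__":
    # Toy minority-class points invented for illustration.
    points = [[1.0, 1.0], [2.0, 1.5], [1.5, 2.0]]
    for sample in smote_like(points, n_new=4):
        print(sample)
```

Because every synthetic point lies on a segment between two real minority samples, oversampling should be applied to the training split only; applying it before the train/test split would leak information into the evaluation.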

Language: English

Cited by: 10

Electronic health records based reinforcement learning for treatment optimizing
Tianhao Li, Zhishun Wang, Wei Lü, et al.

Information Systems, Year: 2021, Vol. 104, p. 101878

Published: Sep. 10, 2021

Language: English

Cited by: 41

Diabetes prediction model using machine learning techniques
Sandip Kumar Singh Modak, Vijay Kumar Jha

Multimedia Tools and Applications, Year: 2023, Vol. 83(13), pp. 38523-38549

Published: Oct. 5, 2023

Language: English

Cited by: 14

An effective correlation-based data modeling framework for automatic diabetes prediction using machine and deep learning techniques (Creative Commons)

Kiran Kumar Patro, Allam Jaya Prakash, Umamaheswararao Sanapala, et al.

BMC Bioinformatics, Year: 2023, Vol. 24(1)

Published: Oct. 2, 2023

The rising risk of diabetes, particularly in emerging countries, highlights the importance of early detection. Manual prediction can be a challenging task, leading to the need for automatic approaches. The major challenge with biomedical datasets is data scarcity. Biomedical data is often difficult to obtain in large quantities, which can limit the ability to train deep learning models effectively. Biomedical data can also be noisy and inconsistent, which can make it difficult to train accurate models. To overcome the above-mentioned challenges, this work presents a new framework for data modeling that is based on correlation measures between features and can be used to process data effectively for predicting diabetes. The standard, publicly available Pima Indians Medical Diabetes (PIMA) dataset is utilized to verify the effectiveness of the proposed techniques. Experiments using the PIMA dataset showed that the proposed method improved the accuracy of machine learning models by an average of 9%, with a convolutional neural network achieving 96.13%. Overall, this study demonstrates the effectiveness of the proposed strategy for early and reliable diabetes prediction.

Language: English

Cited by: 13

Prediction of Diabetes Using Data Mining and Machine Learning Algorithms: A Cross-Sectional Study (Creative Commons)
Hassan Shojaee-Mend, Farnia Velayati, Batool Tayefi, et al.

Healthcare Informatics Research, Year: 2024, Vol. 30(1), pp. 73-82

Published: Jan. 31, 2024

This study aimed to develop a model to predict fasting blood glucose status using machine learning and data mining, since the early diagnosis and treatment of diabetes can improve outcomes and quality of life. This cross-sectional study analyzed data from 3376 adults over 30 years old at 16 comprehensive health service centers in Tehran, Iran who participated in a diabetes screening program. The dataset was balanced using random sampling and the synthetic minority over-sampling technique (SMOTE). The dataset was split into a training set (80%) and a test set (20%). Shapley values were calculated to select the most important features. Noise analysis was performed by adding Gaussian noise to the numerical features to evaluate the robustness of feature importance. Five different machine learning algorithms, including CatBoost, random forest, XGBoost, logistic regression, and an artificial neural network, were used to model the dataset. Accuracy, sensitivity, specificity, F1-score, and the area under the curve were used to evaluate the model. Age, waist-to-hip ratio, body mass index, and systolic blood pressure were the most important factors for predicting fasting blood glucose status. Though the models achieved similar predictive ability, the CatBoost model performed slightly better overall, with an area under the curve (AUC) of 0.737. A gradient boosted decision tree model accurately identified the most important risk factors related to diabetes. This model can support planning for diabetes management and prevention.

Language: English

Cited by: 5