CBReT: A Cluster-Based Resampling Technique for dealing with imbalanced data in code smell prediction

Praveen Singh Thakur, Mahipal Jadeja, Satyendra Singh Chouhan

et al.

Knowledge-Based Systems, Journal Year: 2024, Volume and Issue: 286, P. 111390 - 111390

Published: Jan. 21, 2024

Language: English

Prediction of soil salinity parameters using machine learning models in an arid region of northwest China
Chao Xiao, Qingyuan Ji, Junqing Chen

et al.

Computers and Electronics in Agriculture, Journal Year: 2022, Volume and Issue: 204, P. 107512 - 107512

Published: Nov. 26, 2022

Language: English

Citations

42

A survey of multimodal hybrid deep learning for computer vision: Architectures, applications, trends, and challenges
Khaled Bayoudh

Information Fusion, Journal Year: 2023, Volume and Issue: 105, P. 102217 - 102217

Published: Dec. 30, 2023

Language: English

Citations

35

Ensemble machine learning for modeling greenhouse gas emissions at different time scales from irrigated paddy fields
Zewei Jiang, Shihong Yang, Pete Smith

et al.

Field Crops Research, Journal Year: 2023, Volume and Issue: 292, P. 108821 - 108821

Published: Jan. 23, 2023

Language: English

Citations

30

ARIMA-AdaBoost hybrid approach for product quality prediction in advanced transformer manufacturing
Chun-Hua Chien, Amy J.C. Trappey, Chien-Chih Wang

et al.

Advanced Engineering Informatics, Journal Year: 2023, Volume and Issue: 57, P. 102055 - 102055

Published: June 24, 2023

Language: English

Citations

24

Channel based epilepsy seizure type detection from electroencephalography (EEG) signals with machine learning techniques
Erdem Tuncer, Emine Doğru Bolat

Journal of Applied Biomedicine, Journal Year: 2022, Volume and Issue: 42(2), P. 575 - 595

Published: April 1, 2022

Language: English

Citations

35

A novel approach for software defect prediction using CNN and GRU based on SMOTE Tomek method
Nasraldeen Alnor Adam Khleel, Károly Nehéz

Journal of Intelligent Information Systems, Journal Year: 2023, Volume and Issue: 60(3), P. 673 - 707

Published: May 16, 2023

Abstract: Software defect prediction (SDP) plays a vital role in enhancing the quality of software projects and reducing maintenance-based risks through the ability to detect defective software components. SDP refers to using historical defect data to construct a relationship between software metrics and defects via diverse methodologies. Several prediction models, such as machine learning (ML) and deep learning (DL), have been developed and adopted to recognize software module defects, and many methodologies and frameworks have been presented. Class imbalance is one of the most challenging problems these models face in binary classification. When the distribution of classes is imbalanced, the accuracy may be high, but the models cannot recognize instances of the minority class, leading to weak classifications. So far, little research has been done in previous studies to address the problem of class imbalance in SDP. In this study, a sampling method is introduced to improve the performance of ML models. The proposed approach is based on a convolutional neural network (CNN) and a gated recurrent unit (GRU) combined with a synthetic minority oversampling technique plus Tomek link (SMOTE Tomek) to predict software defects. To establish the efficiency of the proposed models, experiments were conducted on benchmark datasets obtained from the PROMISE repository. The experimental results were compared and evaluated in terms of accuracy, precision, recall, F-measure, Matthew’s correlation coefficient (MCC), the area under the ROC curve (AUC), the area under the precision-recall curve (AUCPR), and mean square error (MSE). The results showed that the proposed models predict software defects more effectively on the balanced datasets than on the original datasets, with an improvement of up to 19% for the CNN model and 24% for the GRU model in terms of AUC. We compared our proposed approach with existing SDP approaches based on several standard performance measures. The comparison demonstrated that the proposed approach significantly outperforms existing state-of-the-art SDP approaches on most datasets.

Language: English

Citations

19

Predicting and refining acid modifications of biochar based on machine learning and bibliometric analysis: Specific surface area, average pore size, and total pore volume
Fangzhou Zhao, Lingyi Tang, Wenjing Song

et al.

The Science of The Total Environment, Journal Year: 2024, Volume and Issue: 948, P. 174584 - 174584

Published: July 6, 2024

Language: English

Citations

8

Software defect prediction using a bidirectional LSTM network combined with oversampling techniques
Nasraldeen Alnor Adam Khleel, Károly Nehéz

Cluster Computing, Journal Year: 2023, Volume and Issue: 27(3), P. 3615 - 3638

Published: Oct. 28, 2023

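Abstract: Software defects are a critical issue in software development that can lead to system failures and cause significant financial losses. Predicting software defects is a vital aspect of ensuring software quality. This can significantly impact both saving time and reducing the overall cost of software testing. During the software defect prediction (SDP) process, automated tools attempt to predict defects in the source codes based on software metrics. Several SDP models have been proposed to identify and prevent defects before they occur. In recent years, recurrent neural network (RNN) techniques have gained attention for their ability to handle sequential data and learn complex patterns. Still, these techniques are not always suitable for predicting software defects due to the problem of imbalanced data. To deal with this problem, this study aims to combine a bidirectional long short-term memory (Bi-LSTM) network with oversampling techniques. To establish the effectiveness and efficiency of the proposed model, experiments were conducted on benchmark datasets obtained from the PROMISE repository. The experimental results were compared and evaluated in terms of accuracy, precision, recall, F-measure, Matthew’s correlation coefficient (MCC), the area under the ROC curve (AUC), the area under the precision-recall curve (AUCPR), and mean square error (MSE). The average accuracy of the proposed model on the original and balanced datasets (using random oversampling and SMOTE) was 88%, 94%, and 92%, respectively. The results showed that the proposed Bi-LSTM on the balanced datasets improves the average accuracy by 6 and 4% compared to the original datasets. The average F-measure of the proposed model on the original datasets was 51%, and the Bi-LSTM on the balanced datasets improves the average F-measure by 43 and 41%. The results demonstrated that combining the Bi-LSTM network with oversampling techniques positively affects defect prediction performance on datasets with imbalanced class distributions.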

Language: English

Citations

15

On the relative value of imbalanced learning for code smell detection
Fuyang Li, Kuan Zou, Jacky Keung

et al.

Software Practice and Experience, Journal Year: 2023, Volume and Issue: 53(10), P. 1902 - 1927

Published: June 26, 2023

Summary: Machine learning-based code smell detection (CSD) has been demonstrated to be a valuable approach for improving software quality and enabling developers to identify problematic patterns in code. However, previous research has shown that the datasets commonly used to train these models are heavily imbalanced. While some recent studies have explored the use of imbalanced learning techniques for CSD, they have only evaluated a limited number of techniques, and thus their conclusions about the most effective methods may be biased and inconclusive. To thoroughly evaluate the effect of imbalanced learning techniques on machine learning-based CSD, we examine 31 imbalanced learning techniques with seven classifiers to build CSD models on four code smell data sets. We employ evaluation metrics to assess the detection performance, together with the Wilcoxon signed-rank test and Cliff's δ. The results show that (1) not all imbalanced learning techniques significantly improve detection performance, but deep forest significantly outperforms the other techniques; (2) SMOTE (Synthetic Minority Over-sampling TEchnique) is not the most effective technique for resampling code smell data sets; and (3) the best-performing imbalanced learning techniques and the top-3 resampling techniques have little time cost for code smell detection. Therefore, we provide practical guidelines. First, researchers and practitioners should select appropriate imbalanced learning techniques (e.g., deep forest) to ameliorate the class imbalance problem. In contrast, the blind application of imbalanced learning techniques could be harmful. Then, resampling techniques better than SMOTE should be selected to preprocess the code smell data sets.

Language: English

Citations

14

The inversion of arid-coastal cultivated soil salinity using explainable machine learning and Sentinel-2
Pingping Jia, Junhua Zhang, Yanning Liang

et al.

Ecological Indicators, Journal Year: 2024, Volume and Issue: 166, P. 112364 - 112364

Published: July 29, 2024

Language: English

Citations

6