Knowledge-Based Systems, Journal Year: 2024, Volume and Issue: 286, P. 111390 - 111390
Published: Jan. 21, 2024
Language: English
Computers and Electronics in Agriculture, Journal Year: 2022, Volume and Issue: 204, P. 107512 - 107512
Published: Nov. 26, 2022
Language: English
Citations: 42
Information Fusion, Journal Year: 2023, Volume and Issue: 105, P. 102217 - 102217
Published: Dec. 30, 2023
Language: English
Citations: 35
Field Crops Research, Journal Year: 2023, Volume and Issue: 292, P. 108821 - 108821
Published: Jan. 23, 2023
Language: English
Citations: 30
Advanced Engineering Informatics, Journal Year: 2023, Volume and Issue: 57, P. 102055 - 102055
Published: June 24, 2023
Language: English
Citations: 24
Journal of Applied Biomedicine, Journal Year: 2022, Volume and Issue: 42(2), P. 575 - 595
Published: April 1, 2022
Language: English
Citations: 35
Journal of Intelligent Information Systems, Journal Year: 2023, Volume and Issue: 60(3), P. 673 - 707
Published: May 16, 2023
Abstract: Software defect prediction (SDP) plays a vital role in enhancing the quality of software projects and reducing maintenance-based risks through the ability to detect defective components. SDP refers to using historical data to construct a relationship between software metrics and defects via diverse methodologies. Several models, such as machine learning (ML) and deep learning (DL) models, have been developed and adopted to recognize module defects, and many methodologies and frameworks have been presented. Class imbalance is one of the most challenging problems these models face in binary classification. When the distribution of classes is imbalanced, accuracy may be high, but the models cannot recognize instances of the minority class, leading to weak classifications. So far, little research has been done in previous studies to address the problem of class imbalance in SDP. In this study, a sampling method is introduced to improve the performance of ML models. The proposed approach is based on a convolutional neural network (CNN) and a gated recurrent unit (GRU) combined with the synthetic minority oversampling technique plus Tomek link (SMOTE Tomek) to predict defects. To establish the efficiency of the approach, experiments were conducted on benchmark datasets obtained from the PROMISE repository. The experimental results were compared and evaluated in terms of accuracy, precision, recall, F-measure, Matthews correlation coefficient (MCC), the area under the ROC curve (AUC), the area under the precision-recall curve (AUCPR), and mean square error (MSE). The results showed that the models predict defects more effectively on the balanced datasets than on the original datasets, with an improvement of up to 19% for the CNN model and 24% for the GRU model in terms of AUC. We compared our approach with existing approaches on several standard measures. The comparison demonstrated that the proposed approach significantly outperforms state-of-the-art approaches on the datasets.
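The abstract's point that accuracy can stay high while the minority class goes undetected can be illustrated with a minimal sketch. The toy labels, the helper names (`confusion_counts`, `safe_div`, `binary_metrics`), and the convention of returning 0.0 when a denominator is zero are assumptions made for illustration; they are not taken from the cited paper.

```python
from math import sqrt

def confusion_counts(y_true, y_pred):
    """Return (tp, fp, tn, fn) for binary labels, where 1 marks the defective class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    return tp, fp, tn, fn

def safe_div(num, den):
    """Divide, returning 0.0 when the denominator is zero (an assumed convention)."""
    return num / den if den else 0.0

def binary_metrics(y_true, y_pred):
    """Compute accuracy, precision, recall, F-measure and MCC from the confusion counts."""
    tp, fp, tn, fn = confusion_counts(y_true, y_pred)
    precision = safe_div(tp, tp + fp)
    recall = safe_div(tp, tp + fn)
    mcc_den = sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return {
        "accuracy": safe_div(tp + tn, len(y_true)),
        "precision": precision,
        "recall": recall,
        "f_measure": safe_div(2 * precision * recall, precision + recall),
        "mcc": safe_div(tp * tn - fp * fn, mcc_den),
    }

# Hypothetical toy data: 95 clean modules (0) and 5 defective modules (1).
y_true = [0] * 95 + [1] * 5
always_clean = [0] * 100          # a degenerate model that never flags a defect
metrics = binary_metrics(y_true, always_clean)
perfect = binary_metrics(y_true, y_true)
```

On this toy data the degenerate predictor scores well on accuracy while recall, F-measure, and MCC all collapse to zero, which is why the abstract reports the imbalance-aware measures alongside accuracy.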
Language: English
Citations: 19
The Science of The Total Environment, Journal Year: 2024, Volume and Issue: 948, P. 174584 - 174584
Published: July 6, 2024
Language: English
Citations: 8
Cluster Computing, Journal Year: 2023, Volume and Issue: 27(3), P. 3615 - 3638
Published: Oct. 28, 2023
Abstract: Software defects are a critical issue in software development that can lead to system failures and cause significant financial losses. Predicting defects is a vital aspect of ensuring software quality. This can significantly impact both saving time and reducing the overall cost of testing. During the software defect prediction (SDP) process, automated tools attempt to predict defects in source codes based on metrics. Several SDP models have been proposed to identify and prevent defects before they occur. In recent years, recurrent neural network (RNN) techniques have gained attention for their ability to handle sequential data and learn complex patterns. Still, these techniques are not always suitable for predicting defects due to the problem of imbalanced data. To deal with this problem, this study aims to combine bidirectional long short-term memory (Bi-LSTM) with oversampling techniques. To establish the effectiveness and efficiency of the model, experiments were conducted on benchmark datasets obtained from the PROMISE repository. The experimental results were compared and evaluated in terms of accuracy, precision, recall, F-measure, Matthews correlation coefficient (MCC), the area under the ROC curve (AUC), the area under the precision-recall curve (AUCPR), and mean square error (MSE). The average accuracy of the model on the original and balanced datasets (using random oversampling and SMOTE) was 88%, 94%, and 92%, respectively. The results showed that Bi-LSTM improves accuracy by 6 and 4% on the balanced datasets. The reported F-measure figures were 51%, 43 and 41%. The results demonstrated that combining oversampling with Bi-LSTM positively affects performance on datasets with imbalanced class distributions.
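The two balancing strategies named in the abstract, random oversampling and SMOTE, can be sketched in plain Python as follows. This is a simplified illustration rather than the cited study's pipeline: the function names, the toy feature vectors, the neighbour count `k`, and the fixed seeds are all assumptions, and `smote_like` only imitates the core SMOTE idea of interpolating between a minority sample and one of its nearest minority neighbours.

```python
import random

def sq_dist(a, b):
    """Squared Euclidean distance between two feature tuples."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def random_oversample(X, y, minority=1, seed=0):
    """Duplicate randomly chosen minority samples until both classes are the same size."""
    rng = random.Random(seed)
    minority_idx = [i for i, label in enumerate(y) if label == minority]
    deficit = (len(y) - len(minority_idx)) - len(minority_idx)
    X_out, y_out = list(X), list(y)
    for _ in range(deficit):
        i = rng.choice(minority_idx)
        X_out.append(X[i])
        y_out.append(minority)
    return X_out, y_out

def smote_like(X, y, minority=1, k=2, seed=0):
    """Add synthetic minority points on segments between a sample and a near neighbour."""
    rng = random.Random(seed)
    pts = [x for x, label in zip(X, y) if label == minority]
    deficit = (len(y) - len(pts)) - len(pts)
    X_out, y_out = list(X), list(y)
    for _ in range(deficit):
        base = rng.choice(pts)
        others = [p for p in pts if p is not base]
        neighbours = sorted(others, key=lambda p: sq_dist(base, p))[:k]
        other = rng.choice(neighbours)
        gap = rng.random()  # position along the segment from base to other
        X_out.append(tuple(b + gap * (o - b) for b, o in zip(base, other)))
        y_out.append(minority)
    return X_out, y_out

# Hypothetical toy data: 8 majority samples (label 0) and 3 minority samples (label 1).
X = [(float(i), 0.0) for i in range(8)] + [(1.0, 1.0), (2.0, 1.5), (1.5, 2.0)]
y = [0] * 8 + [1] * 3
X_ros, y_ros = random_oversample(X, y)
X_smote, y_smote = smote_like(X, y)
```

Random oversampling only repeats existing minority rows, whereas the SMOTE-style variant creates new points inside the region spanned by the minority samples. In either case the resampling would normally be applied to the training split only, so that evaluation data keeps its original class distribution.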
Language: English
Citations: 15
Software Practice and Experience, Journal Year: 2023, Volume and Issue: 53(10), P. 1902 - 1927
Published: June 26, 2023
Summary: Machine learning-based code smell detection (CSD) has been demonstrated to be a valuable approach for improving software quality and enabling developers to identify problematic patterns in code. However, previous research has shown that the datasets commonly used to train these models are heavily imbalanced. While some recent studies have explored the use of imbalanced learning techniques for CSD, they have only evaluated a limited number of techniques, and thus their conclusions about the most effective methods may be biased and inconclusive. To thoroughly evaluate the effect of imbalanced learning techniques on machine learning-based CSD, we examine 31 imbalanced learning techniques with seven classifiers to build CSD models on four data sets. We employ evaluation metrics to assess performance, together with the Wilcoxon signed-rank test and Cliff's δ. The results show that (1) not all imbalanced learning techniques significantly improve performance, but deep forest outperforms the other techniques; (2) SMOTE (Synthetic Minority Over-sampling TEchnique) is not the most effective technique for resampling; and (3) the best-performing top-3 techniques incur little time cost for detection. Therefore, we provide practical guidelines. First, researchers and practitioners should select appropriate imbalanced learning techniques (e.g., deep forest) to ameliorate the class imbalance problem. In contrast, the blind application of imbalanced learning techniques could be harmful. Then, resampling techniques that perform better than SMOTE should be selected to preprocess the data.
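Cliff's δ, the effect size mentioned in the summary, compares two groups of scores by counting how often a value from one group exceeds a value from the other. The sketch below shows the standard definition on made-up numbers: the function name `cliffs_delta` and the two score lists are hypothetical and do not come from the cited study.

```python
def cliffs_delta(xs, ys):
    """Cliff's delta: (#pairs with x > y minus #pairs with x < y) / (len(xs) * len(ys))."""
    greater = sum(1 for x in xs for y in ys if x > y)
    less = sum(1 for x in xs for y in ys if x < y)
    return (greater - less) / (len(xs) * len(ys))

# Hypothetical F-measure scores for a classifier with and without a resampling step.
with_resampling = [0.62, 0.70, 0.66, 0.71]
baseline = [0.60, 0.64, 0.59, 0.65]

delta = cliffs_delta(with_resampling, baseline)
no_effect = cliffs_delta([1, 2, 3], [1, 2, 3])
full_dominance = cliffs_delta([5, 6], [1, 2])
```

The value ranges from -1 to 1: 0 means neither group tends to dominate, and values near the extremes mean one group almost always scores higher. A significance test such as the Wilcoxon signed-rank test answers whether a difference is likely real, while δ indicates how large it is.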
Language: English
Citations: 14
Ecological Indicators, Journal Year: 2024, Volume and Issue: 166, P. 112364 - 112364
Published: July 29, 2024
Language: English
Citations: 6