Cited by CBReT: A Cluster-Based Resampling Technique for dealing with imbalanced data in code smell prediction

Prediction of soil salinity parameters using machine learning models in an arid region of northwest China

Chao Xiao, Qingyuan Ji, Junqing Chen et al.

Computers and Electronics in Agriculture, Journal Year: 2022, Volume and Issue: 204, P. 107512 - 107512

Published: Nov. 26, 2022

Language: English

A survey of multimodal hybrid deep learning for computer vision: Architectures, applications, trends, and challenges

Khaled Bayoudh

Information Fusion, Journal Year: 2023, Volume and Issue: 105, P. 102217 - 102217

Published: Dec. 30, 2023

Language: English

Ensemble machine learning for modeling greenhouse gas emissions at different time scales from irrigated paddy fields

Zewei Jiang, Shihong Yang, Pete Smith et al.

Field Crops Research, Journal Year: 2023, Volume and Issue: 292, P. 108821 - 108821

Published: Jan. 23, 2023

Language: English

ARIMA-AdaBoost hybrid approach for product quality prediction in advanced transformer manufacturing

Chun-Hua Chien, Amy J.C. Trappey, Chien-Chih Wang et al.

Advanced Engineering Informatics, Journal Year: 2023, Volume and Issue: 57, P. 102055 - 102055

Published: June 24, 2023

Language: English

Channel based epilepsy seizure type detection from electroencephalography (EEG) signals with machine learning techniques

Erdem Tuncer, Emine Doğru Bolat

Journal of Applied Biomedicine, Journal Year: 2022, Volume and Issue: 42(2), P. 575 - 595

Published: April 1, 2022

Language: English

A novel approach for software defect prediction using CNN and GRU based on SMOTE Tomek method

Nasraldeen Alnor Adam Khleel, Károly Nehéz

Journal of Intelligent Information Systems, Journal Year: 2023, Volume and Issue: 60(3), P. 673 - 707

Published: May 16, 2023

Abstract: Software defect prediction (SDP) plays a vital role in enhancing the quality of software projects and reducing maintenance-based risks through the ability to detect defective software components. SDP refers to using historical defect data to construct a relationship between software metrics and defects via diverse methodologies. Several prediction models, such as machine learning (ML) and deep learning (DL), have been developed and adopted to recognize software module defects, and many methodologies and frameworks have been presented. Class imbalance is one of the most challenging problems these models face in binary classification. When the distribution of classes is imbalanced, accuracy may be high, but the models cannot recognize instances of the minority class, leading to weak classifications. So far, little research has been done in previous studies to address the problem of class imbalance in SDP. In this study, a data sampling method is introduced to improve the performance of ML models. The proposed approach is based on a convolutional neural network (CNN) and a gated recurrent unit (GRU) combined with the synthetic minority oversampling technique plus Tomek link (SMOTE Tomek) to predict software defects. To establish the efficiency of the proposed models, experiments were conducted on benchmark datasets obtained from the PROMISE repository. The experimental results were compared and evaluated in terms of accuracy, precision, recall, F-measure, Matthew's correlation coefficient (MCC), area under the ROC curve (AUC), area under the precision-recall curve (AUCPR), and mean square error (MSE). The results showed that the proposed models predict software defects more effectively on the balanced datasets than on the original datasets, with an improvement of up to 19% for the CNN model and 24% for the GRU model in terms of AUC. We compared our approach with existing SDP approaches on several standard performance measures. The comparison demonstrated that the proposed approach significantly outperforms existing state-of-the-art SDP approaches on most datasets.

Language: English

Predicting and refining acid modifications of biochar based on machine learning and bibliometric analysis: Specific surface area, average pore size, and total pore volume

Fangzhou Zhao, Lingyi Tang, Wenjing Song et al.

The Science of The Total Environment, Journal Year: 2024, Volume and Issue: 948, P. 174584 - 174584

Published: July 6, 2024

Language: English

Software defect prediction using a bidirectional LSTM network combined with oversampling techniques

Nasraldeen Alnor Adam Khleel, Károly Nehéz

Cluster Computing, Journal Year: 2023, Volume and Issue: 27(3), P. 3615 - 3638

Published: Oct. 28, 2023

Abstract: Software defects are a critical issue in software development that can lead to system failures and cause significant financial losses. Predicting software defects is a vital aspect of ensuring software quality. This can significantly impact both saving time and reducing the overall cost of software testing. During the software defect prediction (SDP) process, automated tools attempt to predict defects in the source codes based on software metrics. Several SDP models have been proposed to identify and prevent defects before they occur. In recent years, recurrent neural network (RNN) techniques have gained attention for their ability to handle sequential data and learn complex patterns. Still, these techniques are not always suitable for predicting software defects due to the problem of imbalanced data. To deal with this problem, this study aims to combine a bidirectional long short-term memory (Bi-LSTM) network with oversampling techniques. To establish the effectiveness and efficiency of the proposed model, experiments were conducted on benchmark datasets obtained from the PROMISE repository. The experimental results were compared and evaluated in terms of accuracy, precision, recall, F-measure, Matthew's correlation coefficient (MCC), area under the ROC curve (AUC), area under the precision-recall curve (AUCPR), and mean square error (MSE). The average accuracy of the proposed model on the original and balanced datasets (using random oversampling and SMOTE) was 88%, 94%, and 92%, respectively. The results showed that the proposed Bi-LSTM on the balanced datasets improves the average accuracy by 6 and 4% compared to the original datasets. The average F-measure on the original datasets was 51%, and it improves by 43 and 41% on the balanced datasets. The results demonstrated that combining the Bi-LSTM network with oversampling techniques positively affects defect prediction performance on datasets with imbalanced class distributions.
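
As a rough illustration of what balancing a dataset by random oversampling involves, the sketch below duplicates randomly chosen minority-class samples until every class matches the size of the largest one. It is a minimal standard-library sketch, not the cited authors' pipeline; the function name `random_oversample`, the toy data, and the fixed seed are all hypothetical.

```python
import random
from collections import Counter


def random_oversample(samples, labels, seed=0):
    """Duplicate randomly chosen samples of each smaller class until all
    classes have as many samples as the largest class."""
    rng = random.Random(seed)
    counts = Counter(labels)
    target = max(counts.values())

    out_samples, out_labels = list(samples), list(labels)
    for label, count in counts.items():
        # Draw with replacement from the original samples of this class only.
        pool = [s for s, l in zip(samples, labels) if l == label]
        for _ in range(target - count):
            out_samples.append(rng.choice(pool))
            out_labels.append(label)
    return out_samples, out_labels


# Toy data: 8 non-defective modules (label 0) and 2 defective ones (label 1).
X = [[i] for i in range(10)]
y = [0] * 8 + [1] * 2

X_bal, y_bal = random_oversample(X, y)
print(Counter(y_bal))
```

Random oversampling only repeats existing points, whereas SMOTE synthesizes new minority samples by interpolating between neighbours; in either case the resampling would normally be applied to the training split only, so that the test data keeps its original class distribution.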

Language: English

On the relative value of imbalanced learning for code smell detection

Fuyang Li, Kuan Zou, Jacky Keung et al.

Software Practice and Experience, Journal Year: 2023, Volume and Issue: 53(10), P. 1902 - 1927

Published: June 26, 2023

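Summary: Machine learning-based code smell detection (CSD) has been demonstrated to be a valuable approach for improving software quality and enabling developers to identify problematic patterns in code. However, previous research has shown that the code smell datasets commonly used to train these models are heavily imbalanced. While some recent studies have explored the use of imbalanced learning techniques for CSD, they have only evaluated a limited number of techniques, and thus their conclusions about the most effective methods may be biased and inconclusive. To thoroughly evaluate the effect of imbalanced learning techniques on machine learning-based CSD, we examine 31 imbalanced learning techniques with seven classifiers to build CSD models on four code smell data sets. We employ evaluation metrics to assess detection performance, together with the Wilcoxon signed-rank test and Cliff's delta. The results show that (1) not all imbalanced learning techniques significantly improve detection performance, but deep forest outperforms the other techniques; (2) SMOTE (Synthetic Minority Over-sampling TEchnique) is not the most effective technique for resampling code smell data sets; (3) the best-performing imbalanced learning techniques and the top-3 data resampling techniques have little time cost for code smell detection. Therefore, we provide some practical guidelines. First, researchers and practitioners should select appropriate imbalanced learning techniques (e.g., deep forest) to ameliorate the class imbalance problem. In contrast, the blind application of imbalanced learning techniques could be harmful. Then, data resampling techniques better than SMOTE should be selected to preprocess the code smell data sets.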
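
The summary pairs a significance test with an effect size. As a hedged illustration of the latter, Cliff's delta for two groups of scores is the proportion of cross-group pairs in which the first group is larger minus the proportion in which it is smaller, giving a value in [-1, 1]. The sketch below is a direct pairwise implementation; the function name and the example scores are hypothetical and not taken from the cited study.

```python
def cliffs_delta(xs, ys):
    """Cliff's delta effect size between two samples.

    delta = (#pairs with x > y - #pairs with x < y) / (len(xs) * len(ys))
    """
    if not xs or not ys:
        raise ValueError("both samples must be non-empty")
    greater = sum(1 for x in xs for y in ys if x > y)
    less = sum(1 for x in xs for y in ys if x < y)
    return (greater - less) / (len(xs) * len(ys))


# Hypothetical per-dataset scores of two techniques (illustrative values only).
technique_a = [0.9, 0.8, 0.7]
technique_b = [0.6, 0.5, 0.4]

print(cliffs_delta(technique_a, technique_b))
```

A value near 0 indicates heavily overlapping score distributions, while values near 1 or -1 indicate that one technique's scores almost always exceed the other's. The pairwise loop is quadratic in the sample sizes, which is acceptable for the small numbers of runs typical in such comparisons.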

Language: English

The inversion of arid-coastal cultivated soil salinity using explainable machine learning and Sentinel-2

Pingping Jia, Junhua Zhang, Yanning Liang et al.

Ecological Indicators, Journal Year: 2024, Volume and Issue: 166, P. 112364 - 112364

Published: July 29, 2024

Language: English
