Evaluating Deep Learning Embedding Techniques for Code Smell Detection
Praveen Singh Thakur, Mahipal Jadeja, Satyendra Singh Chouhan et al.

Lecture Notes in Computer Science, Year: 2025, Number: unknown, pp. 339 - 350

Published: Jan. 1, 2025

Language: English

Prediction of soil salinity parameters using machine learning models in an arid region of northwest China
Chao Xiao, Qingyuan Ji, Junqing Chen et al.

Computers and Electronics in Agriculture, Year: 2022, Number: 204, pp. 107512 - 107512

Published: Nov. 26, 2022

Language: English

Cited by

43

A survey of multimodal hybrid deep learning for computer vision: Architectures, applications, trends, and challenges
Khaled Bayoudh

Information Fusion, Year: 2023, Number: 105, pp. 102217 - 102217

Published: Dec. 30, 2023

Language: English

Cited by

36

Ensemble machine learning for modeling greenhouse gas emissions at different time scales from irrigated paddy fields
Zewei Jiang, Shihong Yang, Pete Smith et al.

Field Crops Research, Year: 2023, Number: 292, pp. 108821 - 108821

Published: Jan. 23, 2023

Language: English

Cited by

32

ARIMA-AdaBoost hybrid approach for product quality prediction in advanced transformer manufacturing
Chun-Hua Chien, Amy J.C. Trappey, Chien-Chih Wang et al.

Advanced Engineering Informatics, Year: 2023, Number: 57, pp. 102055 - 102055

Published: June 24, 2023

Language: English

Cited by

25

Channel based epilepsy seizure type detection from electroencephalography (EEG) signals with machine learning techniques
Erdem Tuncer, Emine Doğru Bolat

Journal of Applied Biomedicine, Year: 2022, Number: 42(2), pp. 575 - 595

Published: April 1, 2022

Language: English

Cited by

35

A novel approach for software defect prediction using CNN and GRU based on SMOTE Tomek method (Creative Commons)
Nasraldeen Alnor Adam Khleel, Károly Nehéz

Journal of Intelligent Information Systems, Year: 2023, Number: 60(3), pp. 673 - 707

Published: May 16, 2023

Abstract: Software defect prediction (SDP) plays a vital role in enhancing the quality of software projects and reducing maintenance-based risks through the ability to detect defective components. SDP refers to using historical data to construct a relationship between software metrics and defects via diverse methodologies. Several models, such as machine learning (ML) and deep learning (DL), have been developed and adopted to recognize software module defects, and many methodologies and frameworks have been presented. Class imbalance is one of the most challenging problems these models face in binary classification. When the distribution of classes is imbalanced, accuracy may be high, but the models cannot recognize instances of the minority class, leading to weak classifications. So far, little research has been done in previous studies to address the problem of class imbalance in SDP. In this study, a data sampling method is introduced to improve the performance of the models. The proposed approach is based on a convolutional neural network (CNN) and a gated recurrent unit (GRU) combined with the synthetic minority oversampling technique plus Tomek link (SMOTE Tomek) to predict software defects. To establish the efficiency of the proposed models, experiments were conducted on benchmark datasets obtained from the PROMISE repository. The experimental results were compared and evaluated in terms of accuracy, precision, recall, F-measure, Matthews correlation coefficient (MCC), area under the ROC curve (AUC), area under the precision-recall curve (AUCPR), and mean square error (MSE). The results showed that the proposed models predict defects more effectively on the balanced datasets than on the original datasets, with an improvement of up to 19% for the CNN model and 24% for the GRU model in terms of AUC. We compared our approach with existing approaches on several standard measures. The comparison demonstrated that the proposed approach significantly outperforms state-of-the-art approaches on these datasets.

Language: English

Cited by

19
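The abstract above notes that accuracy can stay high on imbalanced data even when the minority (defective) class is never detected, which is one reason MCC is reported alongside it. A minimal sketch of that effect follows, assuming binary labels with 1 for defective and 0 for clean. The data and function names are hypothetical illustrations and are not taken from the paper.

```python
import math


def confusion_counts(y_true, y_pred):
    """Count TP, TN, FP, FN for binary labels (1 = defective, 0 = clean)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    return tp, tn, fp, fn


def accuracy(tp, tn, fp, fn):
    """Fraction of all predictions that are correct."""
    return (tp + tn) / (tp + tn + fp + fn)


def mcc(tp, tn, fp, fn):
    """Matthews correlation coefficient; defined as 0.0 when the denominator is zero."""
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return 0.0 if denom == 0 else (tp * tn - fp * fn) / denom


# Hypothetical imbalanced dataset: 5 defective modules out of 100.
y_true = [1] * 5 + [0] * 95
# A degenerate classifier that always predicts the majority class.
y_pred_majority = [0] * 100

counts = confusion_counts(y_true, y_pred_majority)
print("accuracy:", accuracy(*counts))
print("MCC:", mcc(*counts))
```

In this toy case the always-clean classifier scores high accuracy while its MCC indicates no predictive skill, which matches the motivation for resampling described in the abstract.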

Predicting and refining acid modifications of biochar based on machine learning and bibliometric analysis: Specific surface area, average pore size, and total pore volume
Fangzhou Zhao, Lingyi Tang, Wenjing Song et al.

The Science of The Total Environment, Year: 2024, Number: 948, pp. 174584 - 174584

Published: July 6, 2024

Language: English

Cited by

8

Software defect prediction using a bidirectional LSTM network combined with oversampling techniques (Creative Commons)
Nasraldeen Alnor Adam Khleel, Károly Nehéz

Cluster Computing, Year: 2023, Number: 27(3), pp. 3615 - 3638

Published: Oct. 28, 2023

Abstract: Software defects are a critical issue in software development that can lead to system failures and cause significant financial losses. Predicting software defects is a vital aspect of ensuring software quality. This can significantly impact both saving time and reducing the overall cost of software testing. During the software defect prediction (SDP) process, automated tools attempt to predict defects in source code based on software metrics. Several SDP models have been proposed to identify and prevent defects before they occur. In recent years, recurrent neural network (RNN) techniques have gained attention for their ability to handle sequential data and learn complex patterns. Still, these techniques are not always suitable for predicting software defects due to the problem of imbalanced data. To deal with this problem, this study aims to combine a bidirectional long short-term memory (Bi-LSTM) network with oversampling techniques. To establish the effectiveness and efficiency of the proposed model, experiments were conducted on benchmark datasets obtained from the PROMISE repository. The experimental results were compared and evaluated in terms of accuracy, precision, recall, F-measure, Matthews correlation coefficient (MCC), area under the ROC curve (AUC), area under the precision-recall curve (AUCPR), and mean square error (MSE). The average accuracy of the proposed model on the original and balanced datasets (using random oversampling and SMOTE) was 88%, 94%, and 92%, respectively. The results showed that the proposed Bi-LSTM on the balanced datasets improves the average accuracy by 6 and 4% compared to the original datasets. The average F-measure on the original datasets was 51%, and the balanced datasets improved it by 43 and 41%. The results demonstrated that combining the Bi-LSTM network with oversampling techniques positively affects defect prediction performance on datasets with imbalanced class distributions.

Language: English

Cited by

17
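The abstract above balances the training data with random oversampling or SMOTE before fitting the model. The simpler of the two, random oversampling, can be sketched as duplicating minority-class samples until every class matches the majority count. This is a plain-Python illustration under that assumption, with hypothetical function and variable names, and it is not the authors' implementation.

```python
import random
from collections import Counter


def random_oversample(X, y, seed=0):
    """Duplicate randomly chosen samples of each smaller class until all
    classes have as many samples as the largest class."""
    rng = random.Random(seed)
    counts = Counter(y)
    target = max(counts.values())
    X_out, y_out = list(X), list(y)
    for label, n in counts.items():
        # Pool of original samples belonging to this class.
        pool = [x for x, lab in zip(X, y) if lab == label]
        for _ in range(target - n):
            X_out.append(rng.choice(pool))
            y_out.append(label)
    return X_out, y_out


# Hypothetical toy data: 8 clean modules (0) and 2 defective modules (1).
X = [[i] for i in range(10)]
y = [0] * 8 + [1] * 2

X_bal, y_bal = random_oversample(X, y)
print(Counter(y_bal))
```

Resampling of this kind is normally applied to the training split only, so that duplicated samples do not leak into the evaluation data.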

On the relative value of imbalanced learning for code smell detection
Fuyang Li, Kuan Zou, Jacky Keung et al.

Software: Practice and Experience, Year: 2023, Number: 53(10), pp. 1902 - 1927

Published: June 26, 2023

Summary: Machine learning-based code smell detection (CSD) has been demonstrated to be a valuable approach for improving software quality and enabling developers to identify problematic patterns in code. However, previous research has shown that the datasets commonly used to train these models are heavily imbalanced. While some recent studies have explored the use of imbalanced learning techniques for CSD, they have only evaluated a limited number of techniques, and thus their conclusions about the most effective methods may be biased and inconclusive. To thoroughly evaluate the effect of imbalanced learning techniques on machine learning-based CSD, we examine 31 imbalanced learning techniques with seven classifiers to build CSD models on four data sets. We employ evaluation metrics to assess the detection performance, together with the Wilcoxon signed-rank test and Cliff's δ. The results show that (1) not all imbalanced learning techniques significantly improve detection performance, but deep forest outperforms the other techniques; (2) SMOTE (Synthetic Minority Over-sampling TEchnique) is not the most effective technique for resampling code smell data sets; and (3) the top-3 best-performing imbalanced learning techniques incur little time cost for code smell detection. Therefore, we provide practical guidelines. First, researchers and practitioners should select appropriate imbalanced learning techniques (e.g., deep forest) to ameliorate the class imbalance problem. In contrast, the blind application of imbalanced learning techniques could be harmful. Then, techniques better than SMOTE should be selected to preprocess the code smell data sets.

Language: English

Cited by

14
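The summary above reports effect sizes with Cliff's δ alongside the Wilcoxon signed-rank test. As a rough illustration, Cliff's δ for two groups of scores is the share of cross-group pairs where the first group is larger minus the share where it is smaller, giving a value between -1 and 1. The scores below are hypothetical and not taken from the study.

```python
def cliffs_delta(xs, ys):
    """Cliff's delta: (#pairs with x > y minus #pairs with x < y) / (len(xs) * len(ys))."""
    greater = sum(1 for x in xs for y in ys if x > y)
    less = sum(1 for x in xs for y in ys if x < y)
    return (greater - less) / (len(xs) * len(ys))


# Hypothetical detection scores for a detector trained with and without resampling.
with_resampling = [0.71, 0.74, 0.69, 0.77]
baseline = [0.65, 0.70, 0.66, 0.72]

print("Cliff's delta:", cliffs_delta(with_resampling, baseline))
```

A value near 0 suggests the two groups overlap heavily, while values near 1 or -1 suggest one group dominates the other. The significance test and the effect size answer different questions, so the two are usually read together.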

A survey on machine learning techniques applied to source code (Creative Commons)
Tushar Sharma, Maria Kechagia, Stefanos Georgiou et al.

Journal of Systems and Software, Year: 2023, Number: 209, pp. 111934 - 111934

Published: Dec. 19, 2023

The advancements in machine learning techniques have encouraged researchers to apply these techniques to a myriad of software engineering tasks that use source code analysis, such as testing and vulnerability detection. Such a large number of studies hinders the community from understanding the current research landscape. This paper aims to summarize the current knowledge in applied machine learning for source code analysis. We review studies belonging to twelve categories of software engineering tasks and the corresponding machine learning techniques, tools, and datasets that have been applied to solve them. To do so, we conducted an extensive literature search and identified 494 studies. We summarize our observations and findings with the help of the identified studies. Our findings suggest that the use of machine learning techniques for source code analysis tasks is consistently increasing. We synthesize commonly used steps and the overall workflow for each task and summarize the machine learning techniques employed. We identify a comprehensive list of available datasets and tools useable in this context. Finally, the paper discusses perceived challenges in this area, including the availability of standard datasets, reproducibility and replicability, and hardware resources. Editor's note: Open Science material was validated by the Journal of Systems and Software Open Science Board.

Language: English

Cited by

14