Code smell severity classification at class and method level with a single manually labeled imbalanced dataset
Fábio do Rosario Santos, Júlio César Duarte, Ricardo Choren Noya, et al.

Published: Sep. 30, 2024

Detecting code smells through machine learning (ML) poses challenges because the data are imbalanced and labeling is subject to interpretation bias. While previous studies on severity tended to categorize specific smell types separately, this research aims to detect and classify code smell severity in a single dataset containing instances of four distinct types: God Class, Data Class, Feature Envy, and Long Method. The study also explores the impact of applying data scaling, feature selection techniques, and ensemble methods to enhance ML models for this purpose. The evaluation of the combined techniques reveals that standardization together with Chi-square feature selection outperforms the other combinations, achieving 81.04% and 81.41% accuracy with the XGBoost and CatBoost models. Additionally, the best-performing algorithm attains 80.67% accuracy even without preprocessing. Compared with the state of the art, the results obtained by the proposed approach (85% in detection) are promising and suggest improvements in the effectiveness and reliability of these techniques in real-world scenarios.

Language: English

A study of dealing class imbalance problem with machine learning methods for code smell severity detection using PCA-based feature selection technique
Rajwant Singh Rao, Seema Dewangan, Alok Mishra, et al.

Scientific Reports, Year: 2023, Volume: 13(1)

Published: Sep. 27, 2023

Detecting code smells may be highly helpful for reducing maintenance costs and raising source code quality. Code smells help developers and researchers understand several types of design flaws. Code smells with high severity can cause significant problems for the software and challenge the system's maintainability. It is essential to assess the severity of the code smells detected in software, because severity prioritizes refactoring efforts. The class imbalance problem further increases the difficulty of code smell severity detection. In this study, four code smell severity datasets (Data class, God class, Feature envy, and Long method) are selected to detect code smell severity. To address class imbalance, the Synthetic Minority Oversampling Technique (SMOTE) class balancing technique is applied. Each dataset's relevant features are chosen using a feature selection technique based on principal component analysis. Severity is determined using five machine learning techniques: K-nearest neighbor, Random forest, Decision tree, Multi-layer Perceptron, and Logistic Regression. The study obtained a 0.99 severity accuracy score with the Random forest and Decision tree approaches on the Long method code smell. Model performance is compared on accuracy and three other measurements (Precision, Recall, and F-measure) to assess the severity classification models. The impact on performance is presented with and without applying SMOTE. The results are promising and pave the way for further studies in this area.

Language: English

Cited by: 20

An Evaluation of Multi-Label Classification Approaches for Method-Level Code Smells Detection
Pravin Singh Yadav, Rajwant Singh Rao, Alok Mishra, et al.

IEEE Access, Year: 2024, Volume: 12, pp. 53664-53676

Published: Jan. 1, 2024

(1) Background: Code smell is the most popular and reliable method for detecting potential errors in code. In real-world circumstances, a single piece of source code may have multiple code smells. Multi-label code smell detection is a research topic; however, limited studies are available on it, and there is a need for a standardized classifier that reliably identifies the various multi-label code smells belonging to the method-level category. The primary goal of this study is to develop a rule-based method for detecting multi-label code smells. (2) Methods: Binary Relevance, Label Powerset, and Classifier Chain methods are used with tree-based single-label algorithms, including some ensemble algorithms. The chi-square feature selection technique is applied to select the relevant features. The proposed model is trained using 10-fold cross-validation, with Random Search cross-validation for parameter tuning, and different performance measures are used to evaluate the model. (3) Results: The proposed model achieves its best Jaccard accuracy, 99.54%, with the Decision Tree. The model incorporating the Decision Tree outperforms the alternative classification approaches. Single-label classifiers produced better results after considering the correlation factor. (4) Conclusion: This study will help scientists and programmers by providing a systematic method for detecting various code smells in software projects, saving time and effort during code reviews by detecting multiple problems simultaneously. After detecting the code smells, programmers can create more organized, easier-to-understand, and trustworthy programs.

Language: English

Cited by: 6

Ensemble methods with feature selection and data balancing for improved code smells classification performance
Pravin Singh Yadav, Rajwant Singh Rao, Alok Mishra, et al.

Engineering Applications of Artificial Intelligence, Year: 2024, Volume: 139, p. 109527

Published: Oct. 28, 2024

Language: English

Cited by: 4

Alleviating class imbalance in Feature Envy prediction: An oversampling technique based on code entity attributes
Jiamin Guo, Yangyang Zhao, Tao Zheng, et al.

Information and Software Technology, Year: 2025, Volume: 180, p. 107673

Published: Jan. 15, 2025

Language: English

Cited by: 0

Adaptive Ensemble Learning Model-Based Binary White Shark Optimizer for Software Defect Classification
Jameel Saraireh, Mary Agoyi, Sofian Kassaymeh, et al.

International Journal of Computational Intelligence Systems, Year: 2025, Volume: 18(1)

Published: Jan. 23, 2025

Language: English

Cited by: 0

DeepCSS: severity classification for code smell based on deep learning
Yang Zhang, Chunhui Zhang, Kun Zheng, et al.

Empirical Software Engineering, Year: 2025, Volume: 30(3)

Published: March 25, 2025

Language: English

Cited by: 0

Data preprocessing for machine learning based code smell detection: A systematic literature review
Fábio do Rosario Santos, Ricardo Choren Noya

Information and Software Technology, Year: 2025, Volume: unknown, p. 107752

Published: April 1, 2025

Language: English

Cited by: 0

The impact of feature selection and feature reduction techniques for code smell detection: A comprehensive empirical study
Zexian Zhang, Lin Zhu, Shuang Yin, et al.

Automated Software Engineering, Year: 2025, Volume: 32(2)

Published: May 16, 2025

Language: English

Cited by: 0

CBReT: A Cluster-Based Resampling Technique for dealing with imbalanced data in code smell prediction
Praveen Singh Thakur, Mahipal Jadeja, Satyendra Singh Chouhan, et al.

Knowledge-Based Systems, Year: 2024, Volume: 286, p. 111390

Published: Jan. 21, 2024

Language: English

Cited by: 3

Revisiting Code Smell Severity Prioritization using learning to rank techniques
Lei Liu, Guancheng Lin, Lin Zhu, et al.

Expert Systems with Applications, Year: 2024, Volume: 249, p. 123483

Published: Feb. 14, 2024

Language: English

Cited by: 2