Cited by DeepLog: Deep-Learning-Based Log Recommendation

Large language models (LLMs): survey, technical frameworks, and future challenges

Pranjal Kumar

Artificial Intelligence Review, Journal Year: 2024, Volume and Issue: 57(10)

Published: Aug. 18, 2024

Artificial intelligence (AI) has significantly impacted various fields. Large language models (LLMs) such as GPT-4, BARD, PaLM, Megatron-Turing NLG, and Jurassic-1 Jumbo have contributed to our understanding and application of AI in these domains, along with natural language processing (NLP) techniques. This work provides a comprehensive overview of LLMs in the context of language modeling, word embeddings, and deep learning. It examines the application of LLMs in diverse fields including text generation, vision-language models, personalized learning, biomedicine, and code generation. The paper offers a detailed introduction and background on LLMs, facilitating a clear understanding of their fundamental ideas and concepts. Key language modeling architectures are also discussed, alongside a survey of recent works employing LLM methods for downstream tasks across different domains. Additionally, it assesses the limitations of current approaches and highlights the need for new methodologies and potential directions for significant advancements in this field.

Language: English

A study of dealing class imbalance problem with machine learning methods for code smell severity detection using PCA-based feature selection technique

Rajwant Singh Rao, Seema Dewangan, Alok Mishra et al.

Scientific Reports, Journal Year: 2023, Volume and Issue: 13(1)

Published: Sept. 27, 2023

Detecting code smells may be highly helpful for reducing maintenance costs and raising source code quality. Code smells help developers and researchers understand several types of design flaws. Code smells with high severity can cause significant problems for the software and challenges for the system's maintainability. It is essential to assess the severity of the code smells detected in software, as it prioritizes refactoring efforts. The class imbalance problem further increases the difficulty of code smell severity detection. In this study, four code smell severity datasets (Data class, God class, Feature envy, and Long method) are selected to detect code smell severity. In this work, an effort is made to address the issue of class imbalance, for which the Synthetic Minority Oversampling Technique (SMOTE) class balancing technique is applied. Each dataset's relevant features are chosen using a feature selection technique based on principal component analysis. The severity of code smells is determined using five machine learning techniques: K-nearest neighbor, Random forest, Decision tree, Multi-layer Perceptron, and Logistic Regression. This study obtained a severity accuracy score of 0.99 with the Random forest and Decision tree approaches on the Long method code smell. The model's performance is also compared using three other measurements (Precision, Recall, and F-measure) to evaluate the classification models. Results are presented both with and without applying SMOTE to show the impact of class balancing. The results are promising and beneficial for paving the way for further studies in this area.
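The oversampling step named in this abstract can be illustrated with a minimal sketch of SMOTE-style interpolation. It is not the authors' implementation: the function name `smote_oversample`, the parameters `k` and `seed`, and the toy feature vectors are all assumptions made for illustration, and the sketch uses only the Python standard library.

```python
import math
import random


def smote_oversample(minority, n_new, k=2, seed=0):
    """Create n_new synthetic minority samples by SMOTE-style interpolation.

    minority: list of equal-length numeric feature vectors (hypothetical data).
    """
    if len(minority) < 2:
        raise ValueError("need at least two minority samples")
    rng = random.Random(seed)
    k = min(k, len(minority) - 1)
    synthetic = []
    for _ in range(n_new):
        # Pick a random minority sample as the base point.
        i = rng.randrange(len(minority))
        base = minority[i]
        # Find its k nearest minority neighbours (Euclidean distance).
        others = [p for j, p in enumerate(minority) if j != i]
        neighbours = sorted(others, key=lambda p: math.dist(base, p))[:k]
        neighbour = rng.choice(neighbours)
        # Place the new point at a random position on the segment
        # between the base point and the chosen neighbour.
        gap = rng.random()
        synthetic.append([b + gap * (n - b) for b, n in zip(base, neighbour)])
    return synthetic


# Toy minority-class feature vectors (illustrative values only).
minority_samples = [[1.0, 2.0], [2.0, 3.0], [3.0, 1.0]]
new_samples = smote_oversample(minority_samples, n_new=5)
```

Each synthetic point lies on a line segment between two existing minority samples, so the minority class grows without duplicating rows. In practice a maintained implementation such as `imblearn.over_sampling.SMOTE` from the imbalanced-learn package would normally be preferred over a hand-written version.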

Language: English

Deep learning-based solution for smart contract vulnerabilities detection

Xueyan Tang, Yuying Du, Alan Lai et al.

Scientific Reports, Journal Year: 2023, Volume and Issue: 13(1)

Published: Nov. 16, 2023

This paper aims to explore the application of deep learning in smart contract vulnerabilities detection. Smart contracts are an essential part of blockchain technology and are crucial for developing decentralized applications. However, smart contract vulnerabilities can cause financial losses and system crashes. Static analysis tools are frequently used to detect vulnerabilities in smart contracts, but they often result in false positives and false negatives because of their high reliance on predefined rules and lack of semantic analysis capabilities. Furthermore, these predefined rules quickly become obsolete and fail to adapt or generalize to new data. In contrast, deep learning methods do not require predefined detection rules and can learn the features of vulnerabilities during the training process. In this paper, we introduce a solution called Lightning Cat, which is based on deep learning techniques. We train three deep learning models for detecting vulnerabilities in smart contracts: Optimized-CodeBERT, Optimized-LSTM, and Optimized-CNN. Experimental results show that, in the Lightning Cat solution we propose, the Optimized-CodeBERT model surpasses the other methods, achieving an f1-score of 93.53%. To precisely extract vulnerability features, we acquire segments of vulnerable code functions to retain critical vulnerability features. Using the CodeBERT pre-training model for data preprocessing, we could capture the syntax and semantics of the code more accurately. To demonstrate the feasibility of our proposed solution, we evaluate its performance using the SolidiFI-benchmark dataset, which consists of 9369 vulnerable contracts injected with vulnerabilities from seven different types.

Language: English

A Systematic Literature Review on the Code Smells Datasets and Validation Mechanisms

Morteza Zakeri‐Nasrabadi, Saeed Parsa, Ehsan Esmaili et al.

ACM Computing Surveys, Journal Year: 2023, Volume and Issue: 55(13s), P. 1 - 48

Published: May 13, 2023

The accuracy reported for code smell-detecting tools varies depending on the dataset used to evaluate the tools. Our survey of 45 existing datasets reveals that the adequacy of a dataset for detecting smells highly depends on relevant properties such as the size, severity level, project types, the number of each type of smell, the number of smells, and the ratio of smelly to non-smelly samples in the dataset. Most existing datasets support God Class, Long Method, and Feature Envy, while six smells in Fowler and Beck's catalog are not supported by any datasets. We conclude that existing datasets suffer from imbalanced samples, lack of support for severity levels, and restriction to the Java language.

Language: English

On the relative value of imbalanced learning for code smell detection

Fuyang Li, Kuan Zou, Jacky Keung et al.

Software Practice and Experience, Journal Year: 2023, Volume and Issue: 53(10), P. 1902 - 1927

Published: June 26, 2023

Machine learning-based code smell detection (CSD) has been demonstrated to be a valuable approach for improving software quality and enabling developers to identify problematic patterns in code. However, previous research has shown that the datasets commonly used to train these models are heavily imbalanced. While some recent studies have explored the use of imbalanced learning techniques for CSD, they have only evaluated a limited number of techniques, and thus their conclusions about the most effective methods may be biased and inconclusive. To thoroughly evaluate the effect of imbalanced learning techniques on machine learning-based CSD, we examine 31 imbalanced learning techniques with seven classifiers to build CSD models on four code smell data sets. We employ evaluation metrics to assess the detection performance, together with the Wilcoxon signed-rank test and Cliff's delta. The results show that (1) not all imbalanced learning techniques significantly improve detection performance, but deep forest outperforms the other techniques; (2) SMOTE (Synthetic Minority Over-sampling TEchnique) is not the most effective technique for resampling code smell data sets; (3) the best-performing top-3 techniques incur little time cost for detection. Therefore, we provide practical guidelines. First, researchers and practitioners should select appropriate imbalanced learning techniques (e.g., deep forest) to ameliorate the class imbalance problem. In contrast, the blind application of imbalanced learning techniques could be harmful. Then, techniques better than SMOTE should be selected to preprocess the data sets.
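The effect-size statistic mentioned in this abstract, Cliff's delta, can be sketched in a few lines. This is an illustration of the standard definition (the probability that a value from one sample exceeds a value from the other, minus the reverse probability), not the authors' analysis code; the function name `cliffs_delta` and the score lists are hypothetical.

```python
def cliffs_delta(xs, ys):
    """Return Cliff's delta for two samples: P(x > y) - P(x < y).

    The value ranges from -1 (every x is below every y) to +1
    (every x is above every y); 0 means the samples overlap evenly.
    """
    if not xs or not ys:
        raise ValueError("both samples must be non-empty")
    # Compare every cross pair (x, y); ties count for neither side.
    greater = sum(1 for x in xs for y in ys if x > y)
    less = sum(1 for x in xs for y in ys if x < y)
    return (greater - less) / (len(xs) * len(ys))


# Made-up detection scores for two techniques (illustrative values only).
scores_technique_a = [0.81, 0.84, 0.79, 0.88]
scores_technique_b = [0.75, 0.80, 0.77, 0.78]
effect = cliffs_delta(scores_technique_a, scores_technique_b)
```

Cliff's delta is non-parametric, so it pairs naturally with a rank-based significance test: the test indicates whether a difference is likely real, and delta indicates how large it is. For the paired test itself, SciPy provides `scipy.stats.wilcoxon`.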

Language: English

Alleviating class imbalance in Feature Envy prediction: An oversampling technique based on code entity attributes

Jiamin Guo, Yangyang Zhao, Tao Zheng et al.

Information and Software Technology, Journal Year: 2025, Volume and Issue: 180, P. 107673 - 107673

Published: Jan. 15, 2025

Language: English

A Semisupervised Learning Approach for Code Smell Detection

Ishita Kheria, Dhruv Gada, Ruhina Karani et al.

SN Computer Science, Journal Year: 2025, Volume and Issue: 6(2)

Published: Feb. 6, 2025

Language: English

Predicting Software Reliability Through Machine Learning Analysis of Code Smells

Aakanshi Gupta, Nidhi Mishra, Ashok Kumar Yadav et al.

Lecture Notes in Electrical Engineering, Journal Year: 2025, Volume and Issue: unknown, P. 71 - 81

Published: Jan. 1, 2025

Language: English

EnseSmells: Deep ensemble and programming language models for automated code smells detection

Anh Ho, Anh M. T. Bui, Phuong T. Nguyen et al.

Journal of Systems and Software, Journal Year: 2025, Volume and Issue: unknown, P. 112375 - 112375

Published: Feb. 1, 2025

Graph Neural Network-based Long Method and Blob Code Smell Detection

Minnan Zhang, Jingdong Jia, Luiz Fernando Capretz et al.

Science of Computer Programming, Journal Year: 2025, Volume and Issue: unknown, P. 103284 - 103284

Published: Feb. 1, 2025

Language: English
