The
recognition
of
Application
Programming
Interface
(API)
mentions
in software-related texts
is
a
prerequisite
task
for
extracting
API-related
knowledge.
Previous
studies
have
demonstrated
the superiority of deep learning-based methods in accomplishing this task.
However,
such
techniques
still
hit bottlenecks due to their inability to effectively handle the following three challenges: (1) differentiating APIs from common words; (2) identifying morphological variants of standard APIs; and (3) the lack of high-quality labeled data for training.
To
overcome
these
challenges,
this paper proposes a context-aware API recognition method named CAREER.
This
approach
utilizes
two
key
components,
namely
Bidirectional Encoder Representations from Transformers (BERT) and Bi-directional Long Short-Term Memory (BiLSTM), to extract context information at both the word level and the sequence level. This strategic combination empowers the model to dynamically capture syntactic and semantic information, addressing the first challenge.
To tackle the second challenge, CAREER introduces a character-level BiLSTM component, enriched with an attention mechanism. This enables the model to grasp the global context, thereby enhancing the recognition of morphological attributes within API mentions.
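As a rough illustration of this architecture (a sketch with assumed layer sizes, tag set, and wiring, not the authors' code), the word-level BERT features, the character-level BiLSTM with attention, and the sequence-level BiLSTM could be combined as follows:

# Sketch of a CAREER-style tagger; sizes, tag set, and wiring are assumptions.
import torch
import torch.nn as nn
from transformers import AutoModel

class CareerStyleTagger(nn.Module):
    def __init__(self, char_vocab=128, char_dim=32, hidden=256, n_tags=3):
        super().__init__()
        self.bert = AutoModel.from_pretrained("bert-base-uncased")  # word-level context
        self.char_emb = nn.Embedding(char_vocab, char_dim)
        self.char_lstm = nn.LSTM(char_dim, char_dim, bidirectional=True, batch_first=True)
        self.char_attn = nn.Linear(2 * char_dim, 1)      # attention over characters
        self.seq_lstm = nn.LSTM(768 + 2 * char_dim, hidden, bidirectional=True,
                                batch_first=True)        # sequence-level context
        self.classifier = nn.Linear(2 * hidden, n_tags)  # e.g., BIO tags for API mentions

    def forward(self, input_ids, attention_mask, char_ids):
        word = self.bert(input_ids, attention_mask=attention_mask).last_hidden_state
        b, s, c = char_ids.shape
        ch, _ = self.char_lstm(self.char_emb(char_ids.view(b * s, c)))
        weights = torch.softmax(self.char_attn(ch), dim=1)  # weigh each character
        chars = (weights * ch).sum(dim=1).view(b, s, -1)    # per-word character feature
        out, _ = self.seq_lstm(torch.cat([word, chars], dim=-1))
        return self.classifier(out)                         # per-token tag logits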
Furthermore, to address the third challenge, a data augmentation technique aimed at generating new samples is introduced. Accompanying it, a novel sample selection algorithm is designed to screen out low-quality instances. This dual-pronged strategy mitigates the requirement for manual labeling.
Experiments demonstrate that CAREER significantly improves the F1-score by 11.0% compared with state-of-the-art methods. We also construct specific datasets to assess CAREER's capacity to handle the aforementioned challenges. Results confirm that CAREER outperforms the baselines with the aid of the proposed algorithms, that new samples can be generated to improve performance, and that the labeling burden can be alleviated.
ACM Transactions on Software Engineering and Methodology, Journal Year: 2023, Volume and Issue: 33(2), P. 1 - 69. Published: Nov. 6, 2023
Automated
program
repair
(APR)
aims
to
fix
software
bugs
automatically
and
plays
a
crucial
role
in
software development and maintenance.
With the recent advances in deep learning (DL), an increasing number of APR techniques have been proposed to leverage neural networks to learn bug-fixing patterns from massive open-source code repositories.
Such learning-based techniques usually treat APR as a neural machine translation (NMT) task, where buggy code snippets (i.e., the source language) are translated into fixed code snippets (i.e., the target language) automatically.
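As a minimal sketch of this NMT formulation (the CodeT5 checkpoint, snippet, and decoding settings are illustrative assumptions, not any single surveyed technique), a sequence-to-sequence model maps a buggy snippet to ranked candidate patches:

# Minimal sketch of APR as sequence-to-sequence translation (illustrative checkpoint).
from transformers import AutoTokenizer, T5ForConditionalGeneration

tokenizer = AutoTokenizer.from_pretrained("Salesforce/codet5-base")
model = T5ForConditionalGeneration.from_pretrained("Salesforce/codet5-base")

buggy = "if (a = b) { return true; }"   # buggy snippet: the source "language"
inputs = tokenizer(buggy, return_tensors="pt")
# Beam search yields several ranked candidate patches (the target "language").
outputs = model.generate(**inputs, max_length=64, num_beams=5, num_return_sequences=5)
for patch in tokenizer.batch_decode(outputs, skip_special_tokens=True):
    print(patch)  # each candidate is later validated, e.g., against the test suite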
Benefiting from the powerful capability of DL to learn hidden relationships from previous bug-fixing datasets, learning-based APR techniques have achieved remarkable performance.
In
this
article,
we
provide
a systematic survey to summarize the current state-of-the-art research in the learning-based APR community.
We illustrate the general workflow of learning-based APR techniques and detail the crucial components, including the fault localization, patch generation, patch ranking, patch validation, and patch correctness phases.
We then discuss the widely adopted datasets and evaluation metrics, and outline existing empirical studies.
We also discuss several critical aspects of learning-based APR techniques, such as repair domains, industrial deployment, and the open science issue.
We highlight practical guidelines on applying learning-based APR for future studies, such as exploring explainable patch generation and utilizing code features.
Overall, our article can help researchers gain a comprehensive understanding of existing achievements and promote the practical application of these techniques.
Our artifacts are publicly available at the repository: https://github.com/iSEngLab/AwesomeLearningAPR.
Software
is
constantly
changing,
requiring
developers
to
perform
several
derived
tasks
in
a
timely
manner,
such
as
writing a description for the intention of a code change, or identifying defect-prone changes.
Considering that the cost of dealing with these tasks can account for a large proportion (typically around 70 percent) of the total development expenditure, automating such processes will significantly lighten the burdens of developers.
To achieve this target, existing approaches mainly rely on training deep learning models from scratch or fine-tuning pre-trained models on these tasks, both of which have weaknesses.
Specifically, the former uses comparatively small-scale labelled data for training, making it difficult to learn and exploit the domain knowledge of programming languages hidden in the large amount of unlabelled code in the wild; the latter is hard-pressed to fully leverage the knowledge learned by the pre-trained model, since such models are designed to encode a single code snippet rather than a code change (the difference between two snippets).
We propose to pre-train a model specially designed for code changes to better support software maintenance.
To this end, we first collect a large-scale dataset containing 1.5M+ pairwise data of code changes and commit messages.
Based on these data, we curate five different tasks for pre-training, which equip the model with diverse domain knowledge about code changes. We then fine-tune the pre-trained model, CCT5, on three widely-studied tasks incurred by code changes and two tasks specific to the code review process.
Results show that CCT5 outperforms both conventional deep learning approaches and existing pre-trained models on these tasks.
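To make the pre-training setup concrete, here is a hedged sketch of how one (code change, commit message) pair might be linearized for a seq2seq model; the diff tags and task format are assumptions for illustration, not CCT5's exact scheme:

# Illustrative construction of one (code change, commit message) pre-training pair.
import difflib

before = ["def area(r):", "    return 3.14 * r * r"]
after = ["def area(r):", "    import math", "    return math.pi * r * r"]
message = "Use math.pi instead of a hard-coded constant"

# Linearize the change as a tagged diff the encoder can consume.
diff_tokens = []
for line in difflib.unified_diff(before, after, lineterm=""):
    if line.startswith("+") and not line.startswith("+++"):
        diff_tokens.append("<add> " + line[1:])
    elif line.startswith("-") and not line.startswith("---"):
        diff_tokens.append("<del> " + line[1:])

source = " ".join(diff_tokens)
# One possible pre-training task: generate the commit message from the change.
example = {"input": source, "target": message}
print(example)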
Representing
code
changes
as
numeric
feature
vectors,
i.e.,
code change representations, is usually an essential step to automate many software engineering tasks related to code changes,
e.g.,
commit
message
generation
and
just-in-time
defect
prediction.
Intuitively, the quality of code change representations is crucial for the effectiveness of automated approaches.
Prior work usually designs and evaluates code change representation approaches for a specific task, and little work has investigated code change encoders that can be used for and jointly trained on various tasks.
To fill this gap, this work proposes a novel Code Change Representation learning approach named CCRep, which can learn to encode code changes as feature vectors for diverse downstream tasks.
Specifically, CCRep regards a code change as the combination of its before-change and after-change code, leverages a pre-trained code model to obtain high-quality contextual embeddings, and uses a query-back mechanism to extract the changed code fragments and make them explicitly interact with the whole code change.
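A minimal sketch of this query-back idea (dimensions and wiring are assumptions, not CCRep's exact design): embeddings of the changed fragments attend over the whole change to produce a pooled change vector.

# Sketch of a query-back-style interaction (assumed dimensions and pooling).
import torch
import torch.nn as nn

d_model = 768
attn = nn.MultiheadAttention(embed_dim=d_model, num_heads=8, batch_first=True)

whole_change = torch.randn(1, 120, d_model)  # contextual embeddings of before+after code
changed_frag = torch.randn(1, 12, d_model)   # embeddings of just the changed fragments

# The changed fragments "query back" into the full change representation.
fused, _ = attn(query=changed_frag, key=whole_change, value=whole_change)
change_vector = fused.mean(dim=1)            # pooled feature vector for downstream tasks
print(change_vector.shape)                   # torch.Size([1, 768])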
To evaluate CCRep and demonstrate its applicability to diverse code-change-related tasks, we apply it to three tasks: commit message generation, patch correctness assessment, and just-in-time defect prediction. Experimental results show that CCRep outperforms the state-of-the-art techniques on each task.
IEEE Transactions on Software Engineering, Journal Year: 2024, Volume and Issue: 50(3), P. 474 - 494. Published: Jan. 17, 2024
Automated
program
repair
(APR)
aims
to
fix
software
bugs
automatically
without
human
debugging
efforts
and
plays
a
crucial
role
in
software development and maintenance.
Despite the recent significant progress in the number of fixed bugs, APR is still challenged by the long-standing overfitting problem (i.e., a generated patch can be plausible but overfitting).
Various
techniques
have
thus
been
proposed to address this problem.
Recently, researchers have employed BERT to extract code features, which are then used to train a classifier for patch correctness prediction, indicating the potential of such pre-trained models in reasoning about patch correctness.
However, BERT is restricted to feature extraction for classifier training without benefiting from the training process, potentially generating sub-optimal vector representations for patched code snippets.
In this paper, we propose APPT, a pre-trained model-based automated patch correctness assessment technique that leverages both pre-training and fine-tuning.
APPT adopts a pre-trained model as the encoder stack, followed by an LSTM stack and a deep learning classifier. More importantly, the pre-trained model is fine-tuned in conjunction with the other components as a whole pipeline to fully adapt it specifically to reasoning about patch correctness.
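A minimal sketch of such a jointly fine-tuned pipeline (the checkpoint and layer sizes are illustrative assumptions, not APPT's exact configuration):

# Sketch of an APPT-style pipeline: pre-trained encoder + LSTM stack + classifier,
# all trained together end to end (no component is frozen).
import torch.nn as nn
from transformers import AutoModel

class PatchClassifier(nn.Module):
    def __init__(self, hidden=256):
        super().__init__()
        self.encoder = AutoModel.from_pretrained("bert-base-uncased")  # illustrative choice
        self.lstm = nn.LSTM(768, hidden, num_layers=2, bidirectional=True, batch_first=True)
        self.head = nn.Linear(2 * hidden, 2)  # correct vs. overfitting

    def forward(self, input_ids, attention_mask):
        states = self.encoder(input_ids, attention_mask=attention_mask).last_hidden_state
        out, _ = self.lstm(states)
        return self.head(out[:, 0])  # classify from the first position's state

# Because no parameter is frozen, optimizing the classification loss also
# fine-tunes the encoder, adapting its representations to patch correctness.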
Although our idea is general and can be built on various existing pre-trained models, we have implemented APPT based on the BERT model.
We conduct an extensive experiment on 1,183 Defects4J patches, and the experimental results show that APPT achieves prediction accuracy of 79.7% and recall of 83.2%, outperforming the state-of-the-art technique CACHE by 4.3% and 6.7%, respectively.
Our additional investigation on 49,694 real-world patches shows that APPT achieves optimum performance (exceeding 99% in five common metrics for assessing patch classification techniques) compared with existing representation learning techniques.
We further investigate the impact of each component and find that they all positively contribute to APPT; e.g., the fine-tuning process and the LSTM stack increase the F1-score by 10.22% and 4.11%, respectively.
We also show that adopting advanced pre-trained models can provide substantial advancement (e.g., GraphCodeBERT-based APPT improves BERT-based APPT by 2.8% and 3.3% in precision and AUC, respectively), highlighting the generalizability of APPT.
Overall, our study highlights the promising future of adopting pre-trained models to assess patch correctness and reduce the manual inspection effort of experts when deploying APR tools in practice.
Journal of Systems and Software, Journal Year: 2023, Volume and Issue: 209, P. 111934. Published: Dec. 19, 2023
The
advancements
in
machine
learning
techniques
have
encouraged
researchers
to
apply these techniques to a myriad
of
software
engineering
tasks
that
use
source
code
analysis,
such
as
testing
and
vulnerability
detection.
Such a large number of studies hinders the community from understanding the current research landscape.
This paper aims to summarize the current knowledge of applied machine learning for source code analysis.
We review studies belonging to twelve categories of software engineering tasks and the corresponding machine learning techniques, tools, and datasets that have been used to solve them.
To
do
so,
we
conducted
an
extensive
literature
search and identified 494 studies.
We summarize our observations and findings with the help of the identified studies. Our findings suggest that the use of machine learning for source code analysis is consistently increasing.
We synthesize the commonly used steps and the overall workflow for each task and summarize the machine learning techniques employed.
We identify a comprehensive list of available datasets and tools usable in this context.
Finally, the paper discusses perceived challenges in this area, including the availability of standard datasets, reproducibility and replicability, and hardware resources.
Editor's
note:
Open
Science
material
was
validated
by the Journal of Systems and Software Open Science Board.
ACM Transactions on Software Engineering and Methodology, Journal Year: 2025, Volume and Issue: unknown. Published: Jan. 23, 2025
WebAssembly
(abbreviated
as
Wasm)
was
initially
introduced
for
the
Web
and
quickly
extended
its
reach
into
various
domains
beyond the Web.
To create Wasm applications, developers can compile high-level programming languages into Wasm binaries or manually write the textual format of Wasm and translate it into binaries by the toolchain.
Regardless of whether Wasm is utilized within or outside the Web, the execution of Wasm binaries is supported by the Wasm runtime.
Such
a
runtime
provides a secure, memory-efficient, and sandboxed environment to execute Wasm binaries.
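For concreteness, a hedged example of executing a module inside one such runtime, here via the wasmtime Python bindings (API as in recent wasmtime-py releases; the module itself is illustrative):

# Executing a Wasm module in a sandboxed runtime via wasmtime's Python bindings.
from wasmtime import Engine, Store, Module, Instance

wat = """
(module
  (func (export "add") (param i32 i32) (result i32)
    local.get 0
    local.get 1
    i32.add))
"""

engine = Engine()
store = Store(engine)
module = Module(engine, wat)           # the textual format is compiled by the runtime
instance = Instance(store, module, [])
add = instance.exports(store)["add"]
print(add(store, 2, 3))                # 5, computed inside the sandbox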
This paper provides a comprehensive survey of research on Wasm runtimes, with 103 collected papers related to Wasm runtimes, following the traditional systematic literature review process.
It characterizes existing studies from two different angles, including internal research (Wasm runtime design, testing, and analysis) and external research (applying Wasm runtimes to various domains).
It also proposes future research directions for Wasm runtimes.
ACM Transactions on Software Engineering and Methodology, Journal Year: 2025, Volume and Issue: unknown. Published: May 1, 2025
Software
systems
have
been
evolving
rapidly
and
inevitably
introducing
bugs
at
an
increasing
rate,
leading
to
significant
maintenance
costs.
While
large
language
models
(LLMs) have demonstrated remarkable potential in enhancing software development practices, particularly in automated program repair (APR),
they
rely
heavily
on
high-quality
code
repositories.
Most repositories are proprietary assets that capture the diversity and nuances of real-world industry code, which public datasets cannot fully represent.
However,
obtaining
such
data
from
various
industries
is
hindered
by
privacy
concerns,
as
companies are reluctant to share their codebases.
There has also been no in-depth investigation into collaborative learning on private and decentralized data while preserving privacy for program repair.
To address this gap, we investigate federated learning as a privacy-preserving method for fine-tuning LLMs to boost software maintenance.
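As a minimal sketch of the federated setup (FedAvg-style weighted averaging is an assumption for illustration, not necessarily one of the algorithms studied): each company fine-tunes locally on its private code, and only model weights are shared.

# Minimal FedAvg-style sketch: raw code never leaves a client.
import torch

def federated_average(client_state_dicts, client_sizes):
    """Weighted average of model parameters across clients."""
    total = sum(client_sizes)
    avg = {}
    for name in client_state_dicts[0]:
        avg[name] = sum(
            sd[name] * (n / total)
            for sd, n in zip(client_state_dicts, client_sizes)
        )
    return avg

# Each round: clients fine-tune a copy of the LLM on private bug-fix data,
# send updated weights, and the server aggregates and redistributes them:
# global_model.load_state_dict(federated_average(updates, sizes))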
We use the industrial dataset TutorCode and the EvalRepair-Java benchmark for evaluation, to assess whether federated learning enhances APR. We then further explore how data heterogeneity (i.e., variations in coding style, complexity, and embedding) and different federated learning algorithms affect bug fixing, and provide practical implications for collaboration.
Our evaluation reveals that federated learning can significantly enhance program repair, achieving increases of up to 16.67% in Top@10 and 18.44% in Pass@10, with bug-fixing capabilities even comparable to those of centralized learning.
Moreover, the negligible impact of data heterogeneity implies that companies can effectively collaborate despite diverse data distributions.
Different federated learning algorithms demonstrate unique strengths across LLMs, suggesting that tailoring the optimization process to specific LLM characteristics can improve bug fixing.
Journal of Software Evolution and Process, Journal Year: 2025, Volume and Issue: 37(2). Published: Feb. 1, 2025
ABSTRACT
Patches
can
help
fix
security
vulnerabilities
and
optimize
software
performance,
thereby
enhancing
the
quality
of
software.
Unfortunately,
patches
generated
by
automated
program
repair
tools
are
not
always
correct,
as
they
may
introduce
new
bugs
or
fail
to
fully rectify the original issue.
Various
methods
for
evaluating
patch
correctness
have
been
proposed.
However, most face the challenge of capturing long‐distance dependencies in patch evaluation, which leads to a decline in the predictive performance of the models.
To address this challenge, this paper presents a method named Qamhaen to evaluate patch correctness in APR.
Specifically, the text embedding component captures dependencies across functions in patch evaluation by using bug reports and patch descriptions as inputs instead of code snippets.
BERT is employed in pretraining to capture these dependencies, followed by an additional multihead self‐attention mechanism for further feature extraction.
The similarity evaluator devises a similarity calculation to assess the effectiveness of patches in resolving the issues outlined in bug reports.
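A hedged sketch of this evaluator (the checkpoint, pooling, and wiring are assumptions, not Qamhaen's exact design): embed the bug report and the patch description, refine with multihead self‐attention, and score the patch by similarity.

# Sketch of the report/description similarity idea (untrained layers, illustrative only).
import torch
import torch.nn.functional as F
from transformers import AutoTokenizer, AutoModel

tok = AutoTokenizer.from_pretrained("bert-base-uncased")  # illustrative checkpoint
bert = AutoModel.from_pretrained("bert-base-uncased")
attn = torch.nn.MultiheadAttention(embed_dim=768, num_heads=8, batch_first=True)

def embed(text):
    states = bert(**tok(text, return_tensors="pt")).last_hidden_state
    refined, _ = attn(states, states, states)  # extra multihead self-attention
    return refined.mean(dim=1)                 # pooled text embedding

report = "NullPointerException when parsing an empty config file"
desc = "Add a null check before reading the configuration entries"
score = F.cosine_similarity(embed(report), embed(desc))  # higher = better match
print(float(score))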
Comprehensive experiments are conducted on a dataset containing 9135 patches, with AUC as the primary assessment metric, and extensive results demonstrate that Qamhaen outperforms the baseline methods in terms of overall AUC, F1, +Recall, ‐Recall, and Precision.
For example, compared with the baselines, Qamhaen achieves an AUC of 0.691, representing improvements of 24.2%, 22.1%, and 6.3% over the baseline methods, respectively.
Proceedings of the ACM on Programming Languages, Journal Year: 2025, Volume and Issue: 9(OOPSLA1), P. 1831 - 1857. Published: April 9, 2025
Automated
Program
Repair
(APR)
holds
the
promise
of
alleviating the burden of debugging
and
fixing
software
bugs.
Despite
this,
developers
still
need
to
manually
inspect each patch to confirm its correctness, which is tedious and time-consuming.
This challenge is exacerbated in the presence of plausible patches, which accidentally pass test cases but may not correctly fix the bug.
To
address
this
challenge,
we
propose
an
interactive
approach
called
iFix to facilitate patch understanding and comparison based on their runtime difference.
iFix performs static analysis to identify variables related to the buggy statement and captures their values during the execution of each patch.
These values are then aligned across different patch candidates, allowing users to compare and contrast their runtime behavior.
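A minimal sketch of this alignment step (the trace layout is an assumption): values of the same variable are placed side by side across candidates so that divergences stand out.

# Aligning captured runtime values across patch candidates (illustrative data).
traces = {
    "patch_1": {"balance": 100, "fee": 0, "result": 100},
    "patch_2": {"balance": 100, "fee": 5, "result": 95},
    "patch_3": {"balance": 100, "fee": 5, "result": 105},
}

variables = sorted({v for trace in traces.values() for v in trace})
for var in variables:
    row = {patch: trace.get(var) for patch, trace in traces.items()}
    flag = " <- differs" if len(set(row.values())) > 1 else ""
    print(var, row, flag)  # divergent values point users to behavioral differences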
To evaluate iFix, we conducted a within-subjects user study with 28 participants.
Compared with manual inspection and a state-of-the-art patch filtering technique, iFix reduced participants’ task completion time by 36% and 33%, while also improving their confidence by 50% and 20%, respectively.
Besides, quantitative experiments demonstrate that iFix improves the ranking of correct patches by at least 39% compared with other patch ranking methods and is generalizable to different APR tools.
Journal of Software Evolution and Process, Journal Year: 2025, Volume and Issue: 37(4). Published: April 1, 2025
ABSTRACT
The
recognition
of
Application
Programming
Interface
(API)
mentions
in
software‐related
texts
is
vital
for
extracting
API‐related
knowledge,
providing
deep
insights
into
API
usage
and
enhancing
productivity and efficiency.
Previous research identifies two primary technical challenges in this task: (1) differentiating APIs from common words and (2) identifying morphological variants of standard APIs.
While deep learning‐based methods have demonstrated advancements in addressing these challenges, they rely heavily on high‐quality labeled data, leading to another significant data‐related challenge: (3) the lack of such data due to the substantial effort required for labeling.
To overcome these challenges, this paper proposes a context‐aware API recognition method named CARLDA.
This approach utilizes two key components, namely, Bidirectional Encoder Representations from Transformers (BERT) and Bi‐directional Long Short‐Term Memory (BiLSTM), to extract context information at both the word and sequence levels, capturing syntactic and semantic information to address the first challenge.
For the second challenge, it incorporates a character‐level BiLSTM with an attention mechanism to grasp the global context, enhancing the features of morphological variants within API mentions.
For the third challenge, we developed specialized data augmentation techniques using large language models (LLMs) to tackle in‐library and cross‐library data shortages.
These techniques generate a variety of samples through targeted transformations (e.g., replacing tokens and restructuring sentences) and hybrid strategies (e.g., combining real‐world and generated samples while applying style rules to replicate authentic programming contexts).
Given the uncertainty about the quality of LLM‐generated samples, we also design sample selection algorithms to filter out low‐quality samples (i.e., incomplete or incorrectly labeled samples).
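A hedged sketch of this augmentation-and-filtering idea (the transformation rule and quality checks are illustrative assumptions, not CARLDA's actual algorithms):

# Illustrative token-replacement augmentation plus a simple quality filter.
import re

def augment_by_token_replacement(sentence, api, replacement_api):
    """Targeted transformation: swap one API mention for another known API."""
    return sentence.replace(api, replacement_api)

def keep_sample(sentence, expected_api):
    """Reject incomplete or incorrectly labeled generated samples."""
    complete = sentence.strip().endswith((".", "?", "!"))
    has_label = re.search(re.escape(expected_api), sentence) is not None
    return complete and has_label

seed = "You can call json.loads to parse the response body."
candidate = augment_by_token_replacement(seed, "json.loads", "json.dumps")
print(candidate, keep_sample(candidate, "json.dumps"))  # kept only if well-formed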
Moreover, specific datasets have been constructed to evaluate CARLDA's ability to handle the aforementioned challenges.
Experimental results demonstrate that CARLDA significantly enhances the F1 score by 11.0% and the Matthews correlation coefficient (MCC) by 10.0% compared with state‐of‐the‐art methods, showing superior overall performance in effectively tackling the aforementioned challenges, and that the LLM‐based augmentation techniques can successfully yield new samples that alleviate the labeling burden.