Software Quality Journal, Journal Year: 2025, Volume and Issue: 33(1)
Published: Jan. 25, 2025
Language: English
Science China Information Sciences, Journal Year: 2024, Volume and Issue: 68(1)
Published: Dec. 24, 2024
Researchers have recently achieved significant advances in deep learning techniques, which in turn have substantially advanced other research disciplines, such as natural language processing, image and speech recognition, and software engineering. Various deep learning techniques have been successfully employed to facilitate software engineering tasks, including code generation, code refactoring, and fault localization. Many papers have also been presented at top conferences and journals, demonstrating the applications of deep learning techniques in resolving various software engineering tasks. However, although several surveys have provided an overall picture of the application of deep learning techniques in software engineering, they focus more on the learning techniques, that is, what kinds of deep learning techniques are employed and how deep models are trained or fine-tuned for software engineering tasks. We still lack a survey explaining the advances of subareas in software engineering driven by deep learning techniques, as well as the challenges and opportunities in each subarea. To this end, in this paper we present the first task-oriented survey on deep learning-based software engineering. It covers twelve major software engineering subareas significantly impacted by deep learning techniques. Such subareas spread out through the whole lifecycle of software development and maintenance, including requirements engineering, software development, testing, maintenance, and developer collaboration. As we believe that deep learning may provide an opportunity to revolutionize the whole discipline of software engineering, providing one survey covering as many subareas as possible can help future research push forward the frontier of deep learning-based software engineering more systematically.
Language: English
Citations: 11
Published: Feb. 6, 2024
As a dynamic programming language, Python has become increasingly popular in recent years. Although the dynamic type system of Python facilitates developers in writing programs, it also brings type errors at run-time, which are prevalent yet not easy to fix. There exist rule-based approaches for automatically repairing Python type errors. These approaches can generate accurate patches for type errors covered by manually defined templates, but they require domain experts to design the patch synthesis rules and suffer from low template coverage of real-world type errors. Learning-based approaches alleviate the manual effort of designing patch synthesis rules and have become prevalent due to recent advances in deep learning. Among learning-based approaches, the prompt-based approach, which leverages the knowledge base of code pre-trained models via pre-defined prompts, obtains state-of-the-art performance in general program repair tasks. However, such prompts do not involve any specific clues about the type errors, resulting in limited effectiveness. How to improve prompts with type error clues is challenging and under-explored.
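The gap this abstract identifies, prompts that carry no error-specific clues, can be illustrated with a minimal sketch. The cloze-style format, the `<mask>` placeholder, and both helper functions below are illustrative assumptions, not the paper's actual templates:

```python
def generic_prompt(buggy_code: str) -> str:
    # Generic cloze prompt: tells the model to fix "a bug", nothing more.
    return f"# Fix the bug:\n{buggy_code}\n# Fixed version:\n<mask>"

def error_aware_prompt(buggy_code: str, error_message: str) -> str:
    # Error-aware prompt: embeds the concrete type error message,
    # so the model knows *what* is wrong, not merely that something is.
    return (
        f"# Type error: {error_message}\n"
        f"# Fix the type error:\n{buggy_code}\n"
        f"# Fixed version:\n<mask>"
    )

prompt = error_aware_prompt(
    "total = '0' + sum(xs)",
    'TypeError: can only concatenate str (not "int") to str',
)
```

The error message narrows the model's search space: instead of guessing which of many possible defects to fix, the model can condition its completion on the reported type mismatch.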
Language: English
Citations: 10
Published: Feb. 6, 2024
Automated program repair (APR) has achieved promising results, especially when using neural networks. Yet, the overwhelming majority of patches produced by APR tools are confined to one single location. When looking at patches that attempt multi-location repair, most of them fail to compile, while a few uncompilable ones go in the right direction. In both cases, the fundamental problem is to ignore the potential of partial patches. In this paper, we propose an iterative program repair paradigm called ITER, founded on the concept of improving partial patches until they become plausible and correct. First, ITER iteratively improves single-location partial patches by fixing compilation errors and further refining the previously generated code. Second, ITER iteratively improves partial patches to construct multi-location patches, driven by fault localization re-execution. ITER is implemented for Java, based on battle-proven deep neural networks and code representation. It is evaluated on 476 bugs from 10 open-source projects in Defects4J 2.0 and succeeds in repairing 15.5% of them, including 9 uniquely repaired bugs.
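The iterative paradigm described here can be sketched as a loop that keeps refining a partial patch until it is plausible, i.e. it compiles and passes the tests. This is a schematic reconstruction, not ITER's implementation; `compiles`, `passes_tests`, and `refine` are stand-ins for the compiler, the test suite, and the neural model:

```python
def iterative_repair(patch, compiles, passes_tests, refine, max_rounds=5):
    """Refine a partial patch until it is plausible: compiles and passes tests."""
    for _ in range(max_rounds):
        if not compiles(patch):
            patch = refine(patch, feedback="compilation error")
        elif not passes_tests(patch):
            patch = refine(patch, feedback="failing test")
        else:
            return patch  # plausible patch found
    return None  # no plausible patch within the budget

# Toy run: the stand-in "model" turns a wrong operator into the right one.
result = iterative_repair(
    "return a + b",
    compiles=lambda p: True,               # stand-in: everything compiles
    passes_tests=lambda p: p == "return a - b",
    refine=lambda p, feedback: p.replace("+", "-"),
)
# result == "return a - b"
```

The key design point the abstract makes is visible in the loop: a patch that fails to compile is not discarded but fed back with feedback, so partial progress is preserved across rounds.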
Language: English
Citations: 9
ACM Transactions on Software Engineering and Methodology, Journal Year: 2025, Volume and Issue: unknown
Published: Jan. 24, 2025
Automated Program Repair (APR) has garnered significant attention due to its potential to streamline the bug repair process for human developers. Recently, LLM-based APR methods have shown promise in repairing real-world bugs. However, existing methods often utilize patches generated by LLMs without further optimization, resulting in reduced effectiveness due to the lack of program-specific knowledge. Furthermore, the evaluations of these methods have typically been conducted under the assumption of perfect fault localization, which may not accurately reflect their real-world effectiveness. To address these limitations, this paper introduces an innovative APR approach called GiantRepair. Our approach leverages the insight that LLM-generated patches, although not necessarily correct, offer valuable guidance for the patch generation process. Based on this insight, GiantRepair first constructs patch skeletons from LLM-generated patches to confine the patch space, and then generates high-quality patches tailored to specific programs through context-aware patch generation by instantiating the skeletons. To evaluate the performance of our approach, we conduct two large-scale experiments. The results demonstrate that GiantRepair not only effectively repairs more bugs (an average of 27.78% on Defects4J v1.2 and 23.40% on Defects4J v2.0) than using LLM-generated patches directly, but also outperforms state-of-the-art APR methods by repairing at least 42 and 7 more bugs under perfect and automated fault localization scenarios, respectively.
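The skeleton-then-instantiate idea can be sketched in miniature: abstract an LLM patch into a shape with holes, then fill the holes with program-specific names. The masking regex, the keyword set, and the `<hole>` marker are illustrative assumptions, not GiantRepair's actual skeleton grammar:

```python
import re

KEYWORDS = {"if", "return", "while"}  # tokens kept literal in the skeleton

def to_skeleton(patch_line: str) -> str:
    """Abstract identifiers and literals away, keeping the edit's shape."""
    return re.sub(
        r"[A-Za-z_]\w*|\d+",
        lambda m: m.group() if m.group() in KEYWORDS else "<hole>",
        patch_line,
    )

def instantiate(skeleton: str, fillers):
    """Fill holes left-to-right with names drawn from the buggy program."""
    out = skeleton
    for f in fillers:
        out = out.replace("<hole>", f, 1)
    return out

skeleton = to_skeleton("if (count > 0)")        # "if (<hole> > <hole>)"
candidate = instantiate(skeleton, ["size", "1"])  # "if (size > 1)"
```

The separation matters because the skeleton confines the search space to edits of one shape, while instantiation injects the program-specific knowledge the raw LLM patch lacked.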
Language: English
Citations: 1
Published: Oct. 10, 2022
Learning-based program repair has achieved good results in a recent series of papers. Yet, we observe that related work fails to repair some bugs because of a lack of knowledge about 1) the application domain of the program being repaired and 2) the fault type being repaired. In this paper, we solve both problems by changing the learning paradigm from supervised training to self-supervised training, in an approach called SelfAPR. First, SelfAPR generates training samples on disk by perturbing a previous version of the program being repaired, enforcing the neural model to capture project-specific knowledge. This is different from previous work based on mined past commits. Second, SelfAPR executes all training samples and extracts and encodes the test execution diagnostics into the input representation, steering the neural model to fix the right kind of fault. This is different from existing studies, which only consider static source code as input. We implement SelfAPR and evaluate it in a systematic manner. We generate 1 039 873 training samples obtained by perturbing 17 open-source projects. Evaluated on 818 bugs from Defects4J, SelfAPR correctly repairs 110 of them, outperforming the supervised learning repair approaches.
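Self-supervised sample generation of this kind can be illustrated by perturbing correct code to synthesize (buggy, fixed) training pairs. A toy sketch, assuming simple operator-swap perturbations (SelfAPR's actual perturbation rules are richer and execution-aware):

```python
import random

# Illustrative operator swaps used to inject synthetic bugs.
PERTURBATIONS = [("==", "!="), ("+", "-"), ("<", "<=")]

def perturb(correct_line: str, rng: random.Random):
    """Inject a synthetic bug, yielding a (buggy, fixed) training pair."""
    applicable = [(a, b) for a, b in PERTURBATIONS if a in correct_line]
    if not applicable:
        return None  # nothing to perturb on this line
    a, b = rng.choice(applicable)
    buggy = correct_line.replace(a, b, 1)
    return buggy, correct_line  # the model learns to map buggy -> fixed

pair = perturb("if x == y: total = x + y", random.Random(0))
```

Because the pairs are derived from the very project under repair, the model sees project-specific identifiers and idioms at training time, which is the point the abstract contrasts with mining past commits from other projects.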
Language: English
Citations: 33
Published: July 15, 2022
Automatic Program Repair (APR) aims at fixing buggy source code with less manual debugging effort, and thus plays a vital role in improving software reliability and development productivity. Recent APR works have achieved remarkable progress by applying deep learning (DL), particularly neural machine translation (NMT) techniques. However, we observe that existing DL-based APR models suffer from at least two severe drawbacks: (1) most of them can only generate patches for a single programming language, so that, to repair bugs in multiple languages, one has to build and train many repairing models; (2) most of them are developed offline, and therefore cannot function when there are new-coming repair requirements.
Language: English
Citations: 29
IEEE Transactions on Software Engineering, Journal Year: 2024, Volume and Issue: 50(3), P. 474 - 494
Published: Jan. 17, 2024
Automated program repair (APR) aims to fix software bugs automatically, without human debugging effort, and plays a crucial role in software development and maintenance. Despite recent significant progress in the number of fixed bugs, APR is still challenged by the long-standing overfitting problem (i.e., a generated patch can be plausible but overfitting). Various techniques have thus been proposed to address the overfitting problem. Recently, researchers have employed BERT to extract code features, which are then used to train a classifier for patch correctness prediction, indicating the potential of such pre-trained models for reasoning about patch correctness. However, BERT is restricted to feature extraction for classifier training without benefiting from the training process itself, potentially generating sub-optimal vector representations for patched code snippets. In this paper, we propose APPT, a pre-trained model-based automated patch correctness assessment technique, by both pre-training and fine-tuning. APPT adopts a pre-trained model as the encoder stack, followed by an LSTM stack and a deep learning classifier. More importantly, the pre-trained model is fine-tuned in conjunction with the other components as a whole pipeline to fully adapt it specifically for reasoning about patch correctness. Although our idea is general and can be built on various existing pre-trained models, we have implemented APPT based on the BERT model. We conduct an extensive experiment on 1,183 Defects4J patches, and the experimental results show that APPT achieves a prediction accuracy of 79.7% and a recall of 83.2%, outperforming the state-of-the-art technique CACHE by 4.3% and 6.7%. Our additional investigation on 49,694 real-world patches shows that APPT achieves optimum performance (exceeding 99% in five common metrics for assessing patch classification techniques) compared with existing representation techniques. We further investigate the impact of each component and find that they all positively contribute to APPT; e.g., the fine-tuning process and the LSTM stack increase the F1-score by 10.22% and 4.11%, respectively. We also prove that adopting advanced pre-trained models can provide substantial advancement (e.g., GraphCodeBERT-based APPT improves BERT-based APPT by 2.8% and 3.3% in precision and AUC, respectively), highlighting the generalizability of APPT. Overall, our study highlights the promising future of fine-tuning pre-trained models to assess patch correctness and reduce the manual inspection effort of debugging experts when deploying APR tools in practice.
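The encoder-LSTM-classifier pipeline can be sketched structurally. Each stage below is a trivial stand-in chosen only to show the data flow (hash-based pseudo-embeddings, a one-number recurrence, a sigmoid head); it is not the APPT model and learns nothing:

```python
import math

def encode(tokens):
    # Stand-in for the pre-trained encoder: one pseudo-embedding per token.
    return [[(hash(t) % 100) / 100.0] for t in tokens]

def lstm_pool(embeddings):
    # Stand-in for the LSTM stack: fold the token sequence into one state.
    state = 0.0
    for (x,) in embeddings:
        state = 0.5 * state + 0.5 * x  # simplistic recurrence
    return state

def classify(state, weight=1.0, bias=0.0):
    # Classifier head: sigmoid probability that the patch is correct.
    return 1.0 / (1.0 + math.exp(-(weight * state + bias)))

prob = classify(lstm_pool(encode(["return", "a", "-", "b", ";"])))
```

The architectural point the abstract makes is that all three stages sit in one differentiable pipeline, so fine-tuning updates the encoder itself rather than freezing it as a fixed feature extractor.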
Language: English
Citations: 8
Published: April 12, 2024
The advances of deep learning (DL) have paved the way for automatic software vulnerability repair approaches, which effectively learn the mapping from vulnerable code to fixed code. Nevertheless, existing DL-based vulnerability repair methods face notable limitations: 1) they struggle to handle lengthy vulnerable code, 2) they treat code as natural language text, neglecting its inherent structure, and 3) they do not tap into the valuable expert knowledge present in the expert system. To address these limitations, we propose VulMaster, a Transformer-based neural network model that excels at generating vulnerability repairs by comprehensively understanding the entire vulnerable code, irrespective of its length. The model also integrates diverse and valuable information, encompassing the vulnerable code structures and the expert knowledge from the CWE system. We evaluated VulMaster on a real-world C/C++ vulnerability repair dataset comprising 1,754 projects with 5,800 vulnerable functions. The experimental results demonstrate that VulMaster exhibits substantial improvements compared with the learning-based state-of-the-art vulnerability repair approach. Specifically, VulMaster improves the EM, BLEU, and CodeBLEU scores from 10.2% to 20.0%, from 21.3% to 29.3%, and from 32.5% to 40.9%, respectively.
Language: English
Citations: 7
Proceedings of the ACM on Software Engineering, Journal Year: 2024, Volume and Issue: 1(FSE), P. 1471 - 1493
Published: July 12, 2024
Large language models (LLMs) have shown impressive effectiveness in various software engineering tasks, including automated program repair (APR). In this study, we take a deep dive into automated bug localization and repair utilizing LLMs. In contrast to many deep learning-based APR methods that assume known bug locations, rely on line-level localization tools, or address bug prediction and fixing in one step, our approach uniquely employs LLMs to predict the bug location at the token level and subsequently utilizes them for bug fixing. This methodological separation of bug localization and fixing using different LLMs enables effective integration of diverse contextual information and improved incorporation of inductive biases. We introduce Toggle: Token-Granulated Bug Localization and Repair, a comprehensive framework that integrates a bug localization model, an adjustment model to address tokenizer inconsistencies, and a bug-fixing model. Toggle takes a buggy function as input and generates a complete corrected function. We investigate various styles of prompting the bug-fixing model to identify the most effective prompts, which better utilize the inductive bias and significantly outperform the others. Toggle achieves new state-of-the-art (SOTA) performance on the CodeXGLUE code refinement benchmark and exhibits better or comparable performance on several other widely-used APR datasets, including Defects4J. On Defects4J, Toggle consistently ranks above other methods, achieving superior results in the Top-10, Top-30, Top-50, and Top-100 metrics. Besides examining Toggle's generalizability to unseen data and evaluating different prompts, we also investigate the impact of additional contextual information, such as buggy lines and code comments, on bug localization, and explore the importance of the adjustment model. Our extensive experiments offer valuable insights and answers to critical research questions.
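Separating token-level localization from fixing, as the framework above does, can be sketched as a two-stage pipeline. The heuristics below are placeholders for the two LLMs, and the suspicious-token list is purely illustrative:

```python
def locate_token(tokens, suspicious=("==", "!=", "+", "-")):
    """Stage 1 stand-in: index of the first suspicious token, else None."""
    for i, tok in enumerate(tokens):
        if tok in suspicious:
            return i
    return None

def fix_token(tokens, index, replacement):
    """Stage 2 stand-in: rewrite only the localized token."""
    fixed = list(tokens)
    fixed[index] = replacement
    return fixed

tokens = ["if", "a", "==", "b", ":"]
i = locate_token(tokens)            # i == 2
fixed = fix_token(tokens, i, "!=")  # ["if", "a", "!=", "b", ":"]
```

Keeping the two stages separate lets each model receive different context (the localizer sees the whole function, the fixer sees the pinpointed token), which is the inductive bias the abstract emphasizes.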
Language: English
Citations: 7
Published: April 12, 2024
Gradual typing enables developers to annotate types of their own choosing, offering a flexible middle ground between no type annotations and a fully statically typed language. As more and more code bases get type-annotated, static type checkers detect an increasingly large number of type errors. Unfortunately, fixing these errors requires manual effort, hampering the adoption of gradual typing in practice. This paper presents PyTy, an automated program repair approach targeted at statically detectable type errors in Python. The problem of repairing type errors deserves specific attention because it exposes particular repair patterns, offers a warning message with hints about where and how to apply a fix, and because gradual type checking serves as an automatic way to validate fixes. We address this problem through three contributions: (i) an empirical study that investigates how developers fix Python type errors, showing a diverse set of fixing strategies with some recurring patterns; (ii) an approach to automatically extract type error fixes, which enables us to create a dataset of 2,766 error-fix pairs from 176 GitHub repositories, named PyTyDefects; (iii) the first learning-based repair technique for Python type errors. Motivated by the relative data scarcity of the problem, the neural model at the core of PyTy is trained via cross-lingual transfer learning. Our evaluation shows that PyTy offers fixes for ten frequent categories of type errors, successfully addressing 85.4% of 281 real-world errors. This effectiveness outperforms state-of-the-art large language models asked to repair the same errors (by 2.1x) and complements a previous technique aimed at type errors that manifest at runtime. Finally, 20 out of 30 pull requests with PyTy-suggested fixes have been merged by developers, showing their usefulness in practice.
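Extracting error-fix pairs from version history, as in contribution (ii), can be illustrated with a line diff between a revision that triggers a type error and the revision that fixes it. A minimal sketch using `difflib`; the real extraction additionally runs the type checker to confirm the warning disappears:

```python
import difflib

def extract_fix_pairs(before: str, after: str):
    """Pair each removed (erroneous) line with its replacement line."""
    diff = list(difflib.ndiff(before.splitlines(), after.splitlines()))
    pairs, removed = [], None
    for line in diff:
        if line.startswith("- "):
            removed = line[2:]
        elif line.startswith("+ ") and removed is not None:
            pairs.append((removed, line[2:]))  # (buggy line, fixed line)
            removed = None
        # lines starting with "  " (unchanged) or "? " (hints) are skipped
    return pairs

pairs = extract_fix_pairs(
    "def f(x: int) -> int:\n    return x + '1'",
    "def f(x: int) -> int:\n    return x + 1",
)
```

Applied at scale over commit pairs, this kind of diff pairing is what turns repository history into supervised training data such as the 2,766-pair PyTyDefects dataset.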
Language: English
Citations: 5