Symbolic Execution with Test Cases Generated by Large Language Models DOI

Jiahe Xu, Jingwei Xu, Taolue Chen et al.

Published: July 1, 2024

Language: English

Citations: 0

Clover: Closed-Loop Verifiable Code Generation DOI
Chuyue Sun, Ying Sheng, Oded Padon et al.

Lecture notes in computer science, Journal Year: 2024, Volume and Issue: unknown, P. 134 - 155

Published: Jan. 1, 2024

Language: English

Citations: 7

Test Oracle Automation in the Era of LLMs DOI Open Access
Facundo Molina, Alessandra Gorla, Marcelo d'Amorim et al.

ACM Transactions on Software Engineering and Methodology, Journal Year: 2025, Volume and Issue: unknown

Published: Jan. 27, 2025

The effectiveness of a test suite in detecting faults highly depends on the quality of its oracles. Large Language Models (LLMs) have demonstrated remarkable proficiency in tackling diverse software testing tasks. This paper aims to present a roadmap for future research on the use of LLMs for test oracle automation. We discuss the progress made in the field of test oracle automation before the introduction of LLMs, identifying the main limitations and weaknesses of existing techniques. Additionally, we review recent studies on the use of LLMs for this task, highlighting the challenges that arise from their use, e.g., how to assess the usefulness of the generated oracles. We conclude with a discussion about future directions and opportunities for LLM-based test oracle automation.

Language: English

Citations: 0

RAGVA: Engineering retrieval augmented generation-based virtual assistants in practice DOI Creative Commons
Rui Yang, Michael C. Fu, Chakkrit Tantithamthavorn et al.

Journal of Systems and Software, Journal Year: 2025, Volume and Issue: unknown, P. 112436 - 112436

Published: March 1, 2025

Language: English

Citations: 0

Challenges and Paths Towards AI for Software Engineering DOI
Alex Gu, Naman Jain, Weizhong Li et al.

Published: April 4, 2025

AI for software engineering has made remarkable progress recently, becoming a notable success within generative AI. Despite this, there are still many challenges that need to be addressed before automated software engineering reaches its full potential. It should be possible to reach high levels of automation where humans can focus on the critical decisions of what to build and how to balance difficult tradeoffs, while most routine development effort is automated away. Reaching this level of automation will require substantial research efforts across academia and industry. In this paper, we aim to discuss the path towards this in a threefold manner. First, we provide a structured taxonomy of concrete tasks in AI for software engineering, emphasizing the many tasks beyond code generation and completion. Second, we outline several key bottlenecks that limit current approaches. Finally, we provide an opinionated list of promising research directions toward overcoming these bottlenecks, hoping to inspire future work in this rapidly maturing field.

Language: English

Citations: 0

UTFix: Change Aware Unit Test Repairing using LLM DOI Open Access
Shanto Rahman, Sachit Kuhar, Berk Çirişci et al.

Proceedings of the ACM on Programming Languages, Journal Year: 2025, Volume and Issue: 9(OOPSLA1), P. 143 - 168

Published: April 9, 2025

Software updates, including bug repairs and feature additions, are frequent in modern applications, but they often leave test suites outdated, resulting in undetected bugs and increased chances of system failures. A recent study by Meta revealed that 14%-22% of software failures stem from outdated tests that fail to reflect changes in the codebase. This highlights the need to keep tests in sync with code changes to ensure software reliability. In this paper, we present UTFix, a novel approach for repairing unit tests when their corresponding focal methods undergo changes. UTFix addresses two critical issues: assertion failure and reduced code coverage caused by changes in the focal method. Our approach leverages language models to repair unit tests by providing contextual information such as static code slices, dynamic code slices, and failure messages. We evaluate UTFix on our generated synthetic benchmarks (Tool-Bench) and real-world benchmarks. Tool-Bench includes diverse changes from popular open-source Python GitHub projects, where UTFix successfully repaired 89.2% of assertion failures and achieved 100% code coverage for 96 out of 369 tests. On real-world benchmarks, UTFix repairs 60% of assertion failures while achieving 100% coverage for 19 out of 30 tests. To the best of our knowledge, this is the first comprehensive study focused on unit test repair in evolving Python projects. Our contributions include the development of UTFix, the creation of Tool-Bench, and the demonstration of the effectiveness of LLM-based methods in addressing test failures due to software evolution.

Language: English

Citations: 0

RAG-Driven multiple assertions generation with large language models DOI
Zhuang Liu, Hailong Wang, Tongtong Xu et al.

Empirical Software Engineering, Journal Year: 2025, Volume and Issue: 30(3)

Published: April 26, 2025

Language: English

Citations: 0
