
An Empirical Evaluation of Using Large Language Models for Automated Unit Test Generation DOI
Max Schäfer, Sarah Nadi, Aryaz Eghbali et al.

IEEE Transactions on Software Engineering, Journal Year: 2023, Volume and Issue: 50(1), P. 85 - 105

Published: Nov. 28, 2023

Unit tests play a key role in ensuring the correctness of software. However, manually creating unit tests is a laborious task, motivating the need for automation. Large Language Models (LLMs) have recently been applied to various aspects of software development, including their suggested use for automated generation of tests, but while requiring additional training or few-shot learning on examples of existing tests. This paper presents a large-scale empirical evaluation of the effectiveness of LLMs for automated unit test generation without additional training or manual effort. Concretely, we consider an approach where the LLM is provided with prompts that include the signature and implementation of the function under test, along with usage examples extracted from documentation. Furthermore, if a generated test fails, our approach attempts to generate a new test that fixes the problem by re-prompting the model with the failing test and its error message. We implement this approach in TestPilot, an adaptive LLM-based test generation tool for JavaScript that automatically generates unit tests for the methods in a given project's API. We evaluate TestPilot using OpenAI's gpt3.5-turbo LLM on 25 npm packages with a total of 1,684 API functions. The generated tests achieve a median statement coverage of 70.2% and branch coverage of 52.8%. In contrast, the state-of-the-art feedback-directed test generation technique, Nessie, achieves only 51.3% statement coverage and 25.6% branch coverage. Experiments with excluding parts of the information included in the prompts show that all components contribute towards the generation of effective test suites. We also find that 92.8% of TestPilot's generated tests have ≤ 50% similarity to existing tests (as measured by normalized edit distance), with none of them being exact copies. Finally, we run TestPilot with two additional LLMs, OpenAI's older code-cushman-002 model and StarCoder, whose training process is publicly documented. Overall, we observed similar results with the former (68.2% median statement coverage) and somewhat worse results with the latter (54.0%), suggesting that the effectiveness of the approach is influenced by the size and training set of the LLM, but does not fundamentally depend on the specific model.
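The adaptive workflow described in this abstract (prompt with the function's signature, implementation, and documentation examples; on failure, re-prompt with the failing test and its error message) can be summarized in a short TypeScript sketch. This is a minimal illustration only, assuming hypothetical complete and runTest helpers and an invented prompt layout; it is not TestPilot's actual implementation.

```typescript
// Illustrative sketch only: the prompt layout, helper names, and the stand-in
// LLM/test-runner interfaces below are assumptions, not TestPilot's real code.

interface FunctionInfo {
  name: string;          // e.g. "zipObject"
  signature: string;     // e.g. "zipObject(keys, values)"
  body: string;          // source of the function under test
  docExamples: string[]; // usage snippets mined from the documentation
}

// Stand-in for an LLM call (e.g. gpt3.5-turbo); returns generated test source.
type Complete = (prompt: string) => Promise<string>;

// Stand-in for executing a candidate test; reports failure with an error message.
type RunTest = (testSource: string) => Promise<{ passed: boolean; error?: string }>;

function buildPrompt(fn: FunctionInfo): string {
  return [
    `// Unit test for ${fn.signature}`,
    "// Implementation of the function under test:",
    fn.body,
    fn.docExamples.length > 0
      ? "// Usage examples from the documentation:\n" + fn.docExamples.join("\n")
      : "",
    `// Write a unit test that exercises ${fn.name}:`,
  ].join("\n");
}

// Adaptive loop: generate a test; if it fails, re-prompt with the failing test
// and its error message, up to a fixed number of repair attempts.
async function generateTest(
  fn: FunctionInfo,
  complete: Complete,
  runTest: RunTest,
  maxRepairs = 2
): Promise<string | undefined> {
  let prompt = buildPrompt(fn);
  for (let attempt = 0; attempt <= maxRepairs; attempt++) {
    const candidate = await complete(prompt);
    const result = await runTest(candidate);
    if (result.passed) {
      return candidate;
    }
    prompt = [
      `// This test for ${fn.name} fails with: ${result.error ?? "unknown error"}`,
      candidate,
      "// Fix the test so that it passes:",
    ].join("\n");
  }
  return undefined; // no passing test found within the repair budget
}
```

The ablation mentioned in the abstract corresponds to toggling which of these prompt components (signature, implementation body, documentation examples) are included.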

Language: English

Citations

69

Predicting TypeScript type annotations and definitions with machine learning DOI Open Access
Ming‐Ho Yee

Published: Jan. 1, 2024

Type information is useful for developing large-scale software systems. Types help prevent bugs, but may be inflexible and hamper quick iteration on early prototypes. TypeScript, a syntactic superset of JavaScript, brings the best of both worlds, allowing programmers to freely mix statically and dynamically typed code and to choose the level of type safety they wish to opt into. However, type migration, the process of migrating an untyped program to a typed version, has remained a labour-intensive manual effort in practice. As a first step towards effective automated type migration, there has been interest in applying machine learning to the narrower problem of type prediction. In this dissertation, I propose using machine learning to partially migrate JavaScript programs by predicting type annotations and generating type definitions. To support this thesis, I make three contributions. First, I evaluate type prediction by type checking the generated code instead of computing accuracy. Second, I fine-tune a large language model with fill-in-the-middle capability to predict type annotations. Finally, I use a similar approach to generate missing type definitions. --Author's abstract
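As a rough illustration of the fill-in-the-middle idea mentioned above, the sketch below assembles a prompt for predicting a single parameter's type annotation. The sentinel tokens follow the StarCoder-style FIM convention and the helper function is hypothetical; the dissertation's actual pipeline and prompt format may differ.

```typescript
// Illustrative sketch: building a fill-in-the-middle (FIM) prompt for one
// type annotation. Sentinel tokens are assumed to be StarCoder-style.

// Untyped function to migrate; the goal is to predict the annotation for `items`.
const source = `function total(items) {
  return items.reduce((sum, x) => sum + x.price, 0);
}`;

// Split the source at the position where the annotation should be inserted;
// the model is asked to generate the "middle", e.g. ": { price: number }[]".
function buildFimPrompt(code: string, insertAt: number): string {
  const prefix = code.slice(0, insertAt); // "function total(items"
  const suffix = code.slice(insertAt);    // ") { ... }"
  return `<fim_prefix>${prefix}<fim_suffix>${suffix}<fim_middle>`;
}

const insertAt = source.indexOf("items") + "items".length;
console.log(buildFimPrompt(source, insertAt));
```

The generated middle would then be spliced back into the source and validated with the TypeScript compiler, in line with the abstract's point about type checking the generated code rather than only computing accuracy.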

Language: English

Citations

0

CrashJS: A NodeJS Benchmark for Automated Crash Reproduction DOI
Philip Oliver, Jens Dietrich, Craig Anslow et al.

Published: April 15, 2024

Language: English

Citations

0

Optimizing Search-Based Unit Test Generation with Large Language Models: An Empirical Study DOI

Danni Xiao, Yimeng Guo, Yanhui Li et al.

Published: July 18, 2024

Language: English

Citations

0