Cited by Investigating Developers' Contributions to Test Smell Survivability: A Study of Open-Source Projects

JUGE: An infrastructure for benchmarking Java unit test generators DOI

Xavier Devroey, Alessio Gambi, Juan Pablo Galeotti

et al.

Software Testing Verification and Reliability, Journal Year: 2022, Volume and Issue: 33(3)

Published: Dec. 20, 2022

Summary Researchers and practitioners have designed implemented various automated test case generators to support effective software testing. Such exist for languages (e.g., Java, C#, or Python) platforms desktop, web, mobile applications). The exhibit varying effectiveness efficiency, depending on the testing goals they aim satisfy unit‐testing of libraries versus system‐testing entire applications) underlying techniques implement. In this context, need be able compare different identify most suited one their requirements, while researchers seek future research directions. This can achieved by systematically executing large‐scale evaluations generators. However, such empirical is not trivial requires substantial effort select appropriate benchmarks, setup evaluation infrastructure, collect analyse results. Software Note, we present our JUnit Generation Benchmarking Infrastructure ( JUGE ) supporting (search‐based, random‐based, symbolic execution, etc.) seeking automate production unit tests purposes (validation, regression testing, fault localization, etc.). primary goal reduce overall benchmarking effort, ease comparison several generators, enhance knowledge transfer between academia industry standardizing process. Since 2013, editions a tool competition, co‐located with Search‐Based Testing Workshop, taken place where was used evolved. As result, an increasing amount tools (over 10) from been evaluated , matured over years, allowed identification Based experience gained competitions, discuss expected impact in improving approaches generation industry. Indeed, infrastructure demonstrated implementation design that flexible enough enable integration additional tools, which practical developers allows experiment new advanced approaches.

Language: Английский

Citations

ROME: Testing Image Captioning Systems via Recursive Object Melting DOI

Boxi Yu, Zhiqing Zhong, Jiaqi Li

et al.

Published: July 12, 2023

Image captioning (IC) systems aim to generate a text description of the salient objects in an image. In recent years, IC have been increasingly integrated into our daily lives, such as assistance for visually-impaired people and generation Microsoft Powerpoint. However, even cutting-edge (e.g., Azure Cognitive Services) algorithms OFA) could produce erroneous captions, leading incorrect important objects, misunderstanding, threats personal safety. The existing testing approaches either fail handle complex form system output (i.e., sentences natural language) or unnatural images test cases. To address these problems, we introduce Recursive Object MElting (Rome), novel metamorphic approach validating systems. Different from that cases by inserting which easily make generated unnatural, Rome melts remove inpaint) objects. assumes object set caption image includes after melting. Given image, can recursively its different pairs images. We use one widely-adopted API four state-of-the-art (SOTA) algorithms. results show look much more than SOTA they achieve comparable naturalness original Meanwhile, generating using 226 seed images, reports total 9,121 issues with high precision (86.47%-92.17%). addition, further utilize retrain Oscar, improves performance across multiple evaluation metrics.

Language: Английский

Citations

An Exploratory Study on the Usage and Readability of Messages Within Assertion Methods of Test Cases DOI

Taryn Takebayashi,

Anthony Peruma, Mohamed Wiem Mkaouer

et al.

Published: May 1, 2023

Unit testing is a vital part of the software development process and involves developers writing code to verify or assert production code. Furthermore, help comprehend test case troubleshoot issues, have option provide message that explains reason for assertion failure. In this exploratory empirical study, we examine characteristics messages contained in methods 20 open-source Java systems. Our findings show while rarely utilize supplying message, those who do, either compose it only string literals, identifiers, combination both types. Using standard English readability measuring techniques, observe beginner's knowledge required understand containing 4th -grade education level composed literals. We also discuss shortcomings with using such techniques common anti-patterns construction. envision our results incorporated into quality tools appraise understandability messages.

Language: Английский

Citations

Guess What: Test Case Generation for Javascript with Unsupervised Probabilistic Type Inference DOI

Dimitri Stallenberg, Mitchell Olsthoorn, Annibale Panichella

et al.

Lecture notes in computer science, Journal Year: 2022, Volume and Issue: unknown, P. 67 - 82

Published: Jan. 1, 2022

Language: Английский

Citations

Higher Fault Detection Through Novel Density Estimators in Unit Test Generation DOI

Annibale Panichella, Mitchell Olsthoorn

Lecture notes in computer science, Journal Year: 2024, Volume and Issue: unknown, P. 18 - 32

Published: Jan. 1, 2024

Language: Английский

Citations

Test Smells Learning by a Gamification Approach DOI

Anna Rita Fasolino, Porfirio Tramontana

Published: Sept. 13, 2024

Language: Английский

Citations

Toward a Language-Agnostic Approach to Detect Test Smells DOI

Públio Silva,

Carla Bezerra,

Ivan Machado

et al.

Published: Sept. 30, 2024

Tests play a crucial role in software development by ensuring code quality. However, test can suffer from “smells” — poor implementation choices that hinder maintainability and evolution. Numerous studies have addressed smells various programming languages, proposing tools for detecting them Java, C++, Scala, others. These employ techniques such as information retrieval, metrics analysis, abstract syntax tree (AST) parsing. their focus on specific languages limits generalizability applicability to other frameworks. This challenge is similar issues found smell detection static analysis. Therefore, this work proposes language-agnostic approach detect smells. Our leverages AST parsing extract relevant the code, followed based extracted data. method aims facilitate of across frameworks, enhancing tool’s usability. To check viability our approach, we created proof concept using two different languages.

Language: Английский

Citations

How Aware Are We of Test Smells in Quantum Software Systems? A Preliminary Empirical Evaluation DOI

Tássio Virgínio, Larissa Rocha Soares, Carla Bezerra

et al.

Published: Nov. 5, 2024

Language: Английский

Citations

A manual categorization of new quality issues on automatically-generated tests DOI

Geraldine Galindo-Gutierrez,

Maximiliano Narea Carvajal,

Alison Fernandez Blanco

et al.

Published: Oct. 1, 2023

Diverse studies have analyzed the quality of automatically generated test cases by using smells as main attribute. But recent work reported that tests might suffer from a number issues not considered previously, thus suggesting all been identified yet. Little is known about these and their frequency within tests. In this paper, we report on manual analysis an external dataset consisting 2,340 This aimed at detecting new issues, covered past recognized smells. We use thematic to group categorize found. As result, propose taxonomy 13 grouped in four categories. also present eight recommendations generators may consider improve usefulness additional contribution, our results suggest (i) should be evaluated only themselves, but considering tested code; (ii) flaws are unlikely found manually created require specific checking tools.

Language: Английский

Citations

Automatic Generation of Smell-free Unit Tests DOI

João Afonso, José Campos

Published: May 1, 2023

Automated test generation tools, such as EvoSuite, typically aim to generate tests that maximize code coverage and do not adequately consider non-coverage aspects may be relevant for developers, e.g., test's quality. Hence, automatically generated are often affected by test-specific bad programming practices, i.e., smells, hinder the quality of source and, ultimately, under test. Although EvoSuite uses secondary criteria a post-processing procedure optimize improve readability tests, it does explicitly usage good practices. Thus, in this paper, we propose novel approach assist EvoSuite's search algorithm generating smell-free out box. To aim, first compile set 54 smell metrics from several sources. Secondly, systematically identify 30 smells affect eight cannot computed. Thirdly, incorporate remaining 16 into empirically only 14 tool (e.g., Indirect Testing). Fourthly, describe integrate an EvoSuite. Finally, conduct empirical study (i) understand what extend default mechanisms leads fewer smelly tests. (ii) assess whether our And (iii) how affects fault detection effectiveness Our results report can 8.58% without significantly compromising their or effectiveness.

Language: Английский

Citations