Investigating Developers' Contributions to Test Smell Survivability: A Study of Open-Source Projects
Denivan Campos, Luana Martins, Carla Bezerra

et al.

Published: Sept. 25, 2023

Open-source software (OSS) projects rely on core and peripheral developers to develop, release, and maintain software. The former group plays a crucial role in initiating the project and making key decisions, while the latter contributes less frequently and has little decision-making power. Prior studies have explored the relationship between developer experience and test code quality. However, there is limited empirical evidence regarding the survivability of test smells during software evolution and maintenance. In this study, we investigate developers' case refactorings OSS projects. We empirically studied four Java projects, in which we identified test smells using manual and automated approaches and analyzed the authorship of the insertion and removal of test smells. Our findings reveal that test smells are commonly inserted during test class creation, and 10.39% of them are removed between 366 and 2,911 days. While remove more smells, different types

Language: English

A proposal and assessment of an improved heuristic for the Eager Test smell detection
Huynh Khanh Vi Tran, Nauman bin Ali, Michael Unterkalmsteiner

et al.

Journal of Systems and Software, Journal Year: 2025, Volume and Issue: unknown, P. 112438 - 112438

Published: March 1, 2025

Language: English

Citations

0

On the diffusion of test smells and their relationship with test code quality of Java projects
Luana Martins, Heitor Costa, Ivan Machado

et al.

Journal of Software Evolution and Process, Journal Year: 2023, Volume and Issue: 36(4)

Published: Jan. 18, 2023

Test smells are considered bad practices that can reduce the test code quality, thus harming software testing goals and maintenance activities. Prior studies have investigated the diffusion of test smells and their impact on test code maintainability. However, we cannot directly compare the outcomes, as most of them use customized datasets. In response, we introduced the TSSM (Test Smells and Structural Metrics) dataset, containing test smells detected using the JNose tool and structural metrics (of test code and production code) calculated with the CK tool on 13,703 open-source Java systems from GitHub. In addition, we perform an empirical study to investigate the relationship between test smells and structural metrics on a large-scale dataset. We split the projects into three clusters to analyze the distribution of test smells, the co-occurrences among test smells, and the correlation between test smells and structural metrics of the code. The ratio of smelly test classes for a specific test smell is similar among the clusters, but we could observe a significant difference in the number of test smells among them. Sleepy Test, Mystery Guest, and Resource Optimism rarely occur, and the last two are strongly correlated, indicating that those test smells are more severe than others. Our results point out that test smells have a moderate to high correlation with high complexity, large size, and coupling of the test code, and they also negatively affect its quality. To support further studies, we made our dataset publicly available.

Language: English

Citations

7

Do the Test Smells Assertion Roulette and Eager Test Impact Students’ Troubleshooting and Debugging Capabilities?
Wajdi Aljedaani, Mohamed Wiem Mkaouer, Anthony Peruma

et al.

Published: May 1, 2023

To ensure the quality of a software system, developers perform an activity known as unit testing, where they write code (known as test cases) that verifies the individual software units that make up the system. Like production code, test cases are subject to bad programming practices, known as test smells, that hurt maintenance activities. An essential part of most maintenance activities is program comprehension, which involves reading the code to understand its behavior in order to fix issues or update features. In this study, we conduct a controlled experiment with 96 undergraduate computer science students to investigate the impact of two common types of test smells, namely Assertion Roulette and Eager Test, on a student's ability to debug and troubleshoot test case failures. Our findings show that students take longer to correct errors in production code when test smells are present in their associated test cases, especially Assertion Roulette. We envision our findings supporting academia in better equipping students with the knowledge and resources for writing and maintaining high-quality test cases. Our experimental materials are available online: https://wajdialjedaani.github.io/testsmellstd/

Language: English

Citations

6

Manual Tests Do Smell! Cataloging and Identifying Natural Language Test Smells
Elvys Soares, Manoel Aranda, Naelson Oliveira

et al.

Published: Oct. 26, 2023

Background: Test smells indicate potential problems in the design and implementation of automated software tests that may negatively impact test code maintainability, coverage, and reliability. When poorly described, manual tests written in natural language may suffer from related problems, which enable their analysis from the point of view of test smells. Despite the possible prejudice to manually tested software products, little is known about test smells in manual tests, which results in many open questions regarding their types, frequency, and harm to tests written in natural language. Aims: Therefore, this study aims to contribute a catalog of test smells for manual tests. Method: We perform a two-fold empirical strategy. First, an exploratory study of manual tests in three systems: the Ubuntu Operational System, the Brazilian Electronic Voting Machine, and the User Interface of a large smartphone manufacturer. We use our findings to propose eight identification rules based on syntactical and morphological text analysis, validating them with 24 in-company test engineers. Second, using our proposals, we create a tool based on Natural Language Processing (NLP) to analyze the subject systems' tests, validating the results. Results: We observed the occurrence of test smells in the manual tests. A survey of test professionals showed that 80.7% agreed with the catalog definitions and examples. Our NLP-based tool achieved a precision of 92%, a recall of 95%, and an f-measure of 93.5%, and its execution evidenced 13,169 occurrences of our cataloged test smells in the analyzed systems. Conclusion: We contribute a catalog of natural language test smells and novel detection strategies that better explore the capabilities of current NLP mechanisms, with promising results and reduced effort to analyze tests written in different idioms.

Language: English

Citations

4

The Lost World: Characterizing and Detecting Undiscovered Test Smells (Open Access)
Yanming Yang, Xing Hu, Xin Xia

et al.

ACM Transactions on Software Engineering and Methodology, Journal Year: 2023, Volume and Issue: 33(3), P. 1 - 32

Published: Nov. 20, 2023

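Test smell refers to poor programming and design practices in testing and widely spreads throughout software projects. Considering that test smells have negative impacts on the comprehension and maintenance of test code and even make code-under-test more defect-prone, it is thus of great importance to mine, detect, and refactor them. Since Deursen et al. introduced the definition of “test smell”, several studies have worked on discovering new test smells from test specifications and software practitioners’ experience. Indeed, many bad testing practices are “observed” by software developers while creating test scripts rather than through academic research, and are widely discussed in the software engineering community (e.g., Stack Overflow) [70, 94]. However, no prior studies have explored new bad testing practices from software practitioners’ discussions, formally defined them as new test smell types, and analyzed their characteristics, which plays a role in developers knowing these bad practices and avoiding them during test code development. Therefore, we pick up those challenges and act by working on systematic methods to explore new test smell types from one of the most mainstream developers’ Q&A platforms, i.e., Stack Overflow. We further investigate the harmfulness of the new test smells and analyze possible solutions for eliminating them. We find that some test smells make it hard for developers to fix failed test cases and trace their failing reasons. To exacerbate matters, we have identified two new test smells that pose a risk to the accuracy of test cases. Next, we develop a detector to detect test smells from software. The detector is composed of six detection methods for different smell types. These detection methods are both wrapped with a set of syntactic rules based on the code patterns extracted from different test smells and developers’ code styles. We manually construct a test smell dataset from seven popular Java projects and evaluate the effectiveness of our detector on it. The experimental results show that our detector achieves high performance in precision, recall, and F1 score. Then, we utilize our detector to detect smells from 919 real-world Java projects to explore whether the six test smells are prevalent in practice. We observe that these test smells are widely spread in 722 out of 919 Java projects, which demonstrates that they are prevalent in real-world projects. Finally, to validate the usefulness of test smells in practice, we submit 56 issue reports to 53 real-world projects with different smells. Our issue reports achieve 76.4% acceptance by conducting sentiment analysis on developers’ replies. These evaluations confirm the effectiveness of our detector and the prevalence and practicality of the new test smells on real-world projects.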

Language: English

Citations

4

Investigating the readability of test code (Creative Commons)
Dietmar Winkler, Pirmin Urbanke, Rudolf Ramler

et al.

Empirical Software Engineering, Journal Year: 2024, Volume and Issue: 29(2)

Published: Feb. 26, 2024

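Context: The readability of source code is key for understanding and maintaining software systems and tests. Although several studies investigate the readability of source code, there is limited research specifically on the readability of test code and related influence factors. Objective: In this paper, we aim at investigating the factors that influence the readability of test code from an academic perspective based on scientific literature sources, complemented by practical views, as discussed in grey literature. Methods: First, we perform a Systematic Mapping Study (SMS) with a focus on scientific literature. Second, we extend this study by reviewing grey literature sources for practical aspects of test code readability and understandability. Finally, we conduct a controlled experiment on the readability of a selected set of test cases to collect additional knowledge on influence factors discussed in practice. Results: The result set of the SMS includes 19 primary studies from the scientific literature for further analysis. The grey literature search reveals 62 sources for information on test code readability. Based on an analysis of these sources, we identified a combined set of 14 factors that influence the readability of test code. 7 of these factors were found in both scientific and grey literature, while some factors were mainly discussed in academia (2) or industry (5) with only limited overlap. The controlled experiment on practically relevant influence factors showed that the investigated factors have a significant impact on readability for half of the selected test cases. Conclusion: Our review of scientific and grey literature showed that test code readability is of interest for academia and industry, with a consensus on key influence factors. However, we also found factors only discussed by practitioners. For some of these factors, we were able to confirm an impact on readability in a first experiment. Therefore, we see the need to bring together academic and industry viewpoints to achieve a common view on test code readability.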

Language: English

Citations

1

A comprehensive catalog of refactoring strategies to handle test smells in Java-based systems
Luana Martins, Taher Ahmed Ghaleb, Heitor Costa

et al.

Software Quality Journal, Journal Year: 2024, Volume and Issue: 32(2), P. 641 - 679

Published: March 8, 2024

Language: English

Citations

1

Shaken, Not Stirred: How Developers Like Their Amplified Tests
Carolin Brandt, Ali Khatami, Mairieli Wessel

et al.

IEEE Transactions on Software Engineering, Journal Year: 2024, Volume and Issue: 50(5), P. 1264 - 1280

Published: March 22, 2024

Test amplification makes systematic changes to existing, manually written tests to provide tests complementary to an automated test suite. We consider developer-centric test amplification, where the developer explores, judges and edits the amplified tests before adding them to their maintained test suite. However, it is as yet unclear which kind of selection and editing steps developers take before including an amplified test into the test suite. In this paper we conduct an open source contribution study, amplifying tests of open source Java projects from GitHub. We report which deficiencies we observe in the amplified tests while manually filtering and editing them to open 39 pull requests with amplified tests. We present a detailed analysis of the maintainer's feedback regarding proposed changes, requested information, and expressed judgment. Our observations provide a basis for practitioners to take an informed decision on whether to adopt developer-centric test amplification. As several of the edits we observe are based on the developer's understanding of the amplified test, we conjecture that developer-centric test amplification should invest in supporting the developer to understand the amplified tests.

Language: English

Citations

1

Evaluating Large Language Models in Detecting Test Smells

K. Lucas, Rohit Gheyi, Elvys Soares

et al.

Published: Sept. 30, 2024

Test smells are coding issues that typically arise from inadequate practices, a lack of knowledge about effective testing, or deadline pressures to complete projects. The presence of test smells can negatively impact the maintainability and reliability of software. While there are tools that use advanced static analysis or machine learning techniques to detect test smells, these tools often require effort to be used. This study aims to evaluate the capability of Large Language Models (LLMs) in automatically detecting test smells. We evaluated ChatGPT-4, Mistral Large, and Gemini Advanced using 30 types of test smells across codebases in seven different programming languages collected from the literature. ChatGPT-4 identified 21 types of test smells, while the other two models detected 17 and 15 types. The LLMs demonstrated potential as a valuable tool in identifying test smells.

Language: English

Citations

1

A Block-Based Testing Framework for Scratch
Patric Feldmeier, Gordon Fraser, Ute Heuer

et al.

Published: Nov. 12, 2024

Block-based programming environments like Scratch are widely used in introductory programming courses. They facilitate learning pivotal programming concepts by eliminating syntactical errors, but logical errors that break the desired program behaviour are nevertheless possible. Finding such errors requires testing, i.e., running the program and checking its behaviour. In many programming environments, this step can be automated by providing executable tests as code; in Scratch, testing can only be done manually by invoking events through user input and observing the rendered stage. While this is arguably sufficient for learners, the lack of automated testing may be inhibitive for teachers wishing to provide feedback on their students' solutions. In order to address this issue, we introduce a new category of blocks in Scratch that enables the creation of automated tests. With these blocks, students and teachers alike can create tests and receive feedback directly within the Scratch environment using familiar block-based programming logic. To enable batch processing of sets of student solutions, we extend the Scratch user interface with an accompanying test interface. We evaluated this testing framework with 28 teachers who created tests for a popular Scratch game and subsequently used these tests to assess and provide feedback on student implementations. An overall accuracy of 0.93 of the teachers' tests compared to manually evaluating the functionality of 21 student solutions demonstrates that teachers are able to create and effectively use tests. A subsequent survey confirms that teachers consider the block-based test approach useful.

Language: English

Citations

1