TACAI: an intermediate representation based on abstract interpretation
Michael Reif, Florian Kübler, Dominik Helm

et al.

Published: June 1, 2020

Most Java static analysis frameworks provide an intermediate representation (IR) of Java Bytecode to facilitate the development of static analyses. While such IRs are often based on three-address code, the transformation itself is a great opportunity to apply optimizations to the transformed code, such as constant propagation.

Language: English
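
To make the optimization named in the abstract above concrete, here is a minimal sketch of constant propagation over straight-line three-address code. It is not TACAI's implementation, which works on Java Bytecode using abstract interpretation; the tuple encoding `(target, op, left, right)`, the `mov` pseudo-operation, and the function name `propagate_constants` are assumptions made for illustration only.

```python
from typing import Dict, List, Optional, Tuple, Union

# Hypothetical encoding: an operand is an int literal, a variable name, or None.
Operand = Optional[Union[int, str]]
Instr = Tuple[str, str, Operand, Operand]  # (target, op, left, right)

OPS = {
    "+": lambda a, b: a + b,
    "-": lambda a, b: a - b,
    "*": lambda a, b: a * b,
}


def propagate_constants(code: List[Instr]) -> List[Instr]:
    """Propagate and fold constants in branch-free three-address code."""
    known: Dict[str, int] = {}  # variables currently known to hold a constant
    out: List[Instr] = []
    for target, op, left, right in code:
        # Substitute operands whose constant value is already known.
        if isinstance(left, str):
            left = known.get(left, left)
        if isinstance(right, str):
            right = known.get(right, right)

        if op == "mov" and isinstance(left, int):
            value = left
        elif op in OPS and isinstance(left, int) and isinstance(right, int):
            value = OPS[op](left, right)
        else:
            # Result is not a compile-time constant: forget any old value.
            known.pop(target, None)
            out.append((target, op, left, right))
            continue

        known[target] = value
        out.append((target, "mov", value, None))
    return out


if __name__ == "__main__":
    program = [
        ("a", "mov", 2, None),
        ("b", "mov", 3, None),
        ("c", "+", "a", "b"),   # foldable: both operands are known
        ("d", "*", "c", "n"),   # only partially known: n is an input
    ]
    for instr in propagate_constants(program):
        print(instr)
```

The sketch handles only a single basic block; code with branches would need a data-flow fixpoint (or, as the title suggests for TACAI, abstract interpretation) to merge information at join points.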

Scented since the beginning: On the diffuseness of test smells in automatically generated test code
Giovanni Grano, Fabio Palomba, Dario Di Nucci

et al.

Journal of Systems and Software, Journal Year: 2019, Volume and Issue: 156, P. 312 - 327

Published: July 9, 2019

Language: English

Citations

46

Judge: identifying, understanding, and evaluating sources of unsoundness in call graphs
Michael Reif, Florian Kübler, Michael Eichberg

et al.

Published: July 10, 2019

Call graphs are widely used, in particular for advanced control- and data-flow analyses. Even though many call graph algorithms with different precision and scalability properties have been proposed, a comprehensive understanding of sources of unsoundness, their relevance, and the capabilities of existing call graph algorithms in this respect is missing. To address this problem, we propose Judge, a toolchain that helps with understanding sources of unsoundness and improving the soundness of call graphs. In several experiments, we use Judge and an extensive test suite related to sources of unsoundness to (a) compute capability profiles for call graph implementations of Soot, WALA, DOOP, and OPAL, (b) determine the prevalence of language features and APIs that affect soundness in modern Java Bytecode, (c) compare the call graphs of Soot, WALA, DOOP, and OPAL, highlighting important differences in their implementations, and (d) evaluate the necessary effort to achieve project-specific reasonable sound call graphs. We show that soundness-relevant features/APIs are frequently used and that support for them differs vastly, up to the point where comparing call graphs computed by the same base algorithms (e.g., RTA) but different frameworks is bogus. We also show that Judge can support users in establishing the soundness of call graphs with reasonable effort.

Language: English

Citations

38

On the recall of static call graph construction in practice
Li Sui, Jens Dietrich, Amjed Tahir

et al.

Published: June 27, 2020

Static analyses have problems modelling dynamic language features soundly while retaining acceptable precision. The problem is well-understood in theory, but there is little evidence on how this impacts the analysis of real-world programs. We have studied this issue for call graph construction on a set of 31 real-world Java programs, using an oracle of actual program behaviour recorded from executions of built-in and synthesised test cases with high coverage. We have measured the recall that is being achieved by various static analysis algorithms and configurations, and investigated which language features lead to static analysis false negatives.

Language: English

Citations

32
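
The recall measurement described in the abstract above can be illustrated with a small sketch: a statically computed call graph is compared against an oracle of call edges observed at run time, and every observed edge missing from the static graph counts as a false negative. The edge representation, the method names, and the function `call_graph_recall` are hypothetical and are not taken from the study's tooling.

```python
from typing import Set, Tuple

# Hypothetical representation: a call edge is a (caller, callee) pair.
Edge = Tuple[str, str]


def call_graph_recall(
    static_edges: Set[Edge], oracle_edges: Set[Edge]
) -> Tuple[float, Set[Edge]]:
    """Return (recall, false negatives) of a static call graph vs. an oracle.

    recall = |oracle edges also found statically| / |oracle edges|
    """
    if not oracle_edges:
        raise ValueError("oracle must contain at least one observed call edge")
    false_negatives = oracle_edges - static_edges
    recall = len(oracle_edges & static_edges) / len(oracle_edges)
    return recall, false_negatives


if __name__ == "__main__":
    # Edges observed while running tests (the oracle).
    oracle = {
        ("A.main", "B.run"),
        ("B.run", "C.load"),
        ("B.run", "D.invoke"),
        ("D.invoke", "E.target"),  # e.g. reached only through reflection
    }
    # Edges reported by a static analysis; extra edges affect precision,
    # not recall.
    static = {
        ("A.main", "B.run"),
        ("B.run", "C.load"),
        ("B.run", "D.invoke"),
        ("A.main", "F.unused"),
    }
    recall, missed = call_graph_recall(static, oracle)
    print(recall, missed)
```

Recall computed this way is only a lower bound on unsoundness: an oracle built from executions records only the calls that the test inputs actually trigger.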

Promoting open science in test-driven software experiments (Creative Commons)
Marcus Kessel, Colin Atkinson

Journal of Systems and Software, Journal Year: 2024, Volume and Issue: 212, P. 111971 - 111971

Published: March 12, 2024

A core principle of open science is the clear, concise and accessible publication of empirical data, including "raw" observational data as well as processed results. However, in software engineering there are no established standards (de jure or de facto) for representing and "opening" observations collected in test-driven software experiments, that is, experiments involving the execution of software subjects in controlled scenarios. Execution data is therefore usually represented in ad hoc ways, often making it abstruse and difficult to access without significant manual effort. In this paper we present new data structures designed to address this problem by clearly defining, correlating and representing the stimuli and responses used to execute software subjects in test-driven experiments. To demonstrate their utility, we show how they can be used to promote the repetition, replication and reproduction of experimental evaluations of AI-based code completion tools. We also show how the proposed data structures facilitate the incremental expansion of execution data sets, and thus promote their repurposing for addressing new research questions.

Language: English

Citations

3

Systematic evaluation of the unsoundness of call graph construction algorithms for Java
Michael Reif, Florian Kübler, Michael Eichberg

et al.

Published: July 16, 2018

Call graphs are at the core of many static analyses, ranging from the detection of unused methods to advanced control- and data-flow analyses. Therefore, a comprehensive understanding of the precision and recall of the respective graphs is crucial to enable an assessment of which call-graph construction algorithms are suited to which analysis scenario. For example, malware is often obfuscated and tries to hide its intent by using Reflection. Call graphs that do not represent reflective method calls are, therefore, of limited use when analyzing such apps.

Language: English

Citations

23

On the Soundness of Call Graph Construction in the Presence of Dynamic Language Features - A Benchmark and Tool Evaluation
Li Sui, Jens Dietrich, Michael Ray. Emery

et al.

Lecture notes in computer science, Journal Year: 2018, Volume and Issue: unknown, P. 69 - 88

Published: Jan. 1, 2018

Language: English

Citations

21

Identifying Java calls in native code via binary scanning
George Fourtounis, Leonidas Triantafyllou, Yannis Smaragdakis

et al.

Published: July 13, 2020

Current Java static analyzers, operating either on the source or bytecode level, exhibit unsoundness for programs that contain native code. We show that the Java Native Interface (JNI) specification, which is used by Java programs to interoperate with native code, is principled enough to permit static reasoning about the effects of native code on program execution when it comes to call-backs. Our approach consists of disassembling native binaries, recovering static symbol information that corresponds to Java method signatures, and producing a model for statically exercising these native call-backs with appropriate mock objects. The approach manages to recover virtually all Java calls in native code, for both Android and Java desktop applications: (a) achieving 100% native-to-application call-graph recall on large Android applications (Chrome, Instagram) and (b) capturing the full native call-back behavior of the XCorpus suite programs.

Language: English

Citations

19

Static analysis of Java dynamic proxies
George Fourtounis, George Kastrinis, Yannis Smaragdakis

et al.

Published: July 12, 2018

The dynamic proxy API is one of Java's most widely-used dynamic features, permitting principled run-time code generation and linking. Dynamic proxies can implement any set of interfaces and forward method calls to a special object that handles them reflectively. The flexibility of dynamic proxies, however, comes at the cost of having a dynamically generated layer of bytecode that cannot be penetrated by current static analyses. In this paper, we observe that the dynamic proxy API is stylized enough to permit static analysis. We show how the semantics of dynamic proxies can be modeled in a straightforward manner as logical rules in the Doop static analysis framework. This concise set of rules enables Doop's standard analyses to process code behind dynamic proxies. We evaluate our approach by analyzing XCorpus, a corpus of real-world Java programs: we fully handle 95% of its reported proxy creation sites. Our handling results in the analysis of significant portions of previously unreachable or incompletely-modeled code.

Language: English

Citations

20

Identifying Challenges for OSS Vulnerability Scanners - A Study & Test Suite
Andreas Dann, Henrik Plate, Ben Hermann

et al.

IEEE Transactions on Software Engineering, Journal Year: 2021, Volume and Issue: 48(9), P. 3613 - 3625

Published: Aug. 4, 2021

The use of vulnerable open-source dependencies is a known problem in today's software development. Several vulnerability scanners to detect known-vulnerable dependencies appeared in the last decade; however, there exists no case study investigating the impact of development practices, e.g., forking, patching, re-bundling, on their performance. This paper studies (i) types of modifications that may affect vulnerable open-source dependencies and (ii) their impact on the performance of vulnerability scanners. Through an empirical study on 7,024 Java projects developed at SAP, we identified four types of modifications: re-compilation, re-bundling, metadata-removal and re-packaging. In particular, we found that more than 87 percent (56 percent, resp.) of the vulnerable Java classes considered occur in Maven Central in re-bundled (re-packaged, resp.) form. We assessed the impact of these modifications on the performance of the open-source vulnerability scanners OWASP Dependency-Check (OWASP) and Eclipse Steady, GitHub Security Alerts, and three commercial scanners. The results show that none of the scanners is able to handle all the types of modifications identified. Finally, we present Achilles, a novel test suite with 2,505 test cases that allow replicating the modifications on open-source dependencies.

Language: English

Citations

13

JEMMA: An extensible Java dataset for ML4Code applications (Creative Commons)
Anjan Karmakar, Miltiadis Allamanis, Romain Robbes

et al.

Empirical Software Engineering, Journal Year: 2023, Volume and Issue: 28(2)

Published: March 1, 2023

Machine Learning for Source Code (ML4Code) is an active research field in which extensive experimentation is needed to discover how to best use source code's richly structured information. With this in mind, we introduce JEMMA: An Extensible Java Dataset for ML4Code Applications, a large-scale, diverse, and high-quality dataset targeted at ML4Code. Our goal with JEMMA is to lower the barrier to entry in ML4Code by providing the building blocks to experiment with source code models and tasks. JEMMA comes with a considerable amount of pre-processed information such as metadata, representations (e.g., code tokens, ASTs, graphs), and several properties (e.g., metrics, static analysis results) for 50,000 Java projects from the 50K-C dataset, with over 1.2 million classes and over 8 million methods. JEMMA is also extensible, allowing users to add new properties and representations to the dataset and evaluate tasks on them. Thus, JEMMA becomes a workbench that researchers can use to experiment with novel representations and tasks operating on source code. To demonstrate the utility of the dataset, we also report results from two empirical studies on our data, ultimately showing that significant work lies ahead in the design of context-aware source code models that can reason over a broader network of source code entities in a software project, the very task that JEMMA is designed to help with.

Language: English

Citations

4