Effects of Program Representation on Pointer Analyses — An Empirical Study
Jyoti Prakash, Abhishek Tiwari, Christian Hammer

et al.

Lecture notes in computer science, Journal Year: 2021, Volume and Issue: unknown, P. 240 - 261

Published: Jan. 1, 2021

Abstract Static analysis frameworks, such as Soot and Wala, are used by researchers to prototype and compare program analyses. These frameworks vary on heap abstraction, modeling of library classes, and the underlying intermediate program representation (IR). Often, these variations pose a threat to the validity of the results, as the implications of comparing the same analysis implementation in different frameworks are still unexplored. Earlier studies have focused on the precision, soundness, and recall of the algorithms implemented in these frameworks; however, little to no work has been done to evaluate the effects of program representation. In this work, we fill this gap and study the impact of program representation on pointer analysis. Unfortunately, existing metrics are insufficient for such a comparison due to their inability to isolate each aspect of the program representation. Therefore, we define two novel metrics that measure the analyses' precision after isolating the influence of the class hierarchy and the intermediate representation. Our results establish that minor differences in the class hierarchy and IR do not impact program analysis significantly. Besides, they reveal sources of unsoundness that aid researchers in developing program analyses.

Language: English

JuCify
Jordan Samhi, Jun Gao, Nadia Daoudi

et al.

Proceedings of the 44th International Conference on Software Engineering, Journal Year: 2022, Volume and Issue: unknown

Published: May 21, 2022

Native code is now commonplace within Android app packages, where it co-exists and interacts with Dex bytecode through the Java Native Interface to deliver rich app functionalities. Yet, state-of-the-art static analysis approaches have mostly overlooked the presence of such native code, which, however, may implement some key sensitive, or even malicious, parts of the app behavior. This limitation of the state of the art is a severe threat to validity in a large range of static analyses that do not have a complete view of the executable code in apps. To address this issue, we propose a new advance in the ambitious research direction of building a unified model of all code in Android apps. The JuCify approach presented in this paper is a significant step towards such a model, where we extract and merge call graphs of native code and bytecode to make the final model readily-usable by a common Android analysis framework: in our implementation, JuCify builds on the Soot internal intermediate representation. We performed empirical investigations to highlight how, without the unified model, a significant amount of Java methods called from the native code are "unreachable" in apps' call-graphs, both in goodware and malware. Using JuCify, we were able to enable static analyzers to reveal cases where malware relied on native code to hide the invocation of payment library code or of other sensitive code in the Android framework. Additionally, JuCify's model enables state-of-the-art tools to achieve better precision and recall in detecting data leaks through native code. Finally, we show that by using JuCify we can find sensitive data leaks that pass through native code.
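The effect described in this abstract, Java methods that look "unreachable" until the bytecode and native call graphs are merged, can be illustrated with a minimal sketch. This is not JuCify's implementation (which builds on Soot); the method names, the three dictionaries, and the helpers `merge` and `reachable` are all hypothetical and chosen only for illustration.

```python
from collections import deque

def merge(*graphs):
    """Union several adjacency-list call graphs into one."""
    merged = {}
    for graph in graphs:
        for caller, callees in graph.items():
            merged.setdefault(caller, set()).update(callees)
    return merged

def reachable(graph, entry):
    """Return every method reachable from `entry` (breadth-first search)."""
    seen = {entry}
    queue = deque([entry])
    while queue:
        caller = queue.popleft()
        for callee in graph.get(caller, ()):
            if callee not in seen:
                seen.add(callee)
                queue.append(callee)
    return seen

# Hypothetical call graphs: all names below are made up for this example.
BYTECODE_CG = {"MainActivity.onCreate": {"Helper.nativeInit"}}
NATIVE_CG = {"Java_Helper_nativeInit": {"Billing.pay"}}  # native code calls back into Java
JNI_EDGES = {"Helper.nativeInit": {"Java_Helper_nativeInit"}}  # Java declaration -> native symbol

ENTRY = "MainActivity.onCreate"
bytecode_only = reachable(merge(BYTECODE_CG), ENTRY)
unified = reachable(merge(BYTECODE_CG, NATIVE_CG, JNI_EDGES), ENTRY)
```

With only the bytecode graph, `Billing.pay` is missing from the reachable set; after merging, it appears, because the path runs through the native function.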

Language: English

Citations

27

Static Analysis of JNI Programs via Binary Decompilation
Ji-Hee Park, Sungho Lee, Jaemin Hong

et al.

IEEE Transactions on Software Engineering, Journal Year: 2023, Volume and Issue: 49(5), P. 3089 - 3105

Published: Feb. 2, 2023

JNI programs are widely used thanks to the combined benefits of C and Java programs. However, because understanding the interaction behaviors between two different programming languages is challenging, JNI program development is difficult to get right and vulnerable to security attacks. Thus, researchers have proposed static analysis of JNI program source code to detect bugs and security vulnerabilities in JNI programs. Unfortunately, such source code analysis is not applicable to compiled JNI programs that are not open-sourced or to open-source JNI programs containing third-party binary libraries. While JN-SAF, the state-of-the-art analyzer for compiled JNI programs, can analyze binary code, it has several limitations due to its symbolic execution and summary-based bottom-up analysis. In this paper, we propose a novel approach to statically analyze compiled JNI programs without their source code using binary decompilation. Unlike JN-SAF, which analyzes binaries directly, our approach decompiles binaries and analyzes JNI programs with the decompiled binaries using an existing JNI static analyzer for source code. To decompile binaries to compilable C source code with precise JNI-interoperation-related types, we improve an existing decompilation tool by leveraging the characteristics of JNI programs. Our evaluation shows that the approach is almost as precise as the state-of-the-art JNI static analyzer for source code, and more precise than JN-SAF.

Language: English

Citations

11

A Picture is Worth 500 Labels: A Case Study of Demographic Disparities in Local Machine Learning Models for Instagram and TikTok
Jack West, Lea Thiemt, Shimaa Ahmed

et al.

2022 IEEE Symposium on Security and Privacy (SP), Journal Year: 2024, Volume and Issue: 1, P. 369 - 387

Published: May 19, 2024

Language: English

Citations

4

The Dark Side of Native Code on Android
Antonio Ruggia, Andrea Possemato, Savino Dambra

et al.

ACM Transactions on Privacy and Security, Journal Year: 2025, Volume and Issue: unknown

Published: Jan. 17, 2025

From a little research experiment to an essential component of military arsenals, malicious software has constantly been growing and evolving for more than three decades. On the other hand, from a negligible market share, the Android operating system is nowadays the most widely used mobile operating system, becoming a desirable target for large-scale malware distribution. While the scientific literature has followed this trend, one aspect has been understudied: the role of native code in malicious Android apps. Android apps are written in high-level languages, but thanks to the Java Native Interface (JNI), Android also supports calling native (C/C++) library functions. While allowing native code in Android apps has a strong positive impact from a performance perspective, it dramatically complicates its analysis because bytecode and native code need different abstractions and analysis algorithms, and they thus pose different challenges and limitations. Consequently, these difficulties are often (ab)used to hide malicious payloads. In this work, we propose a novel methodology for reverse engineering Android apps, focusing on suspicious patterns related to native components, i.e., surreptitious code that requires further inspection. We implemented a static analysis tool based on such methodology, which can bridge the “Java” and the native worlds and perform an in-depth analysis to tag code blocks responsible for suspicious behavior. These tags benefit the human facing the reverse engineering task: they clearly indicate which part of the code to focus on to find malicious code. Then, we performed a longitudinal analysis of Android malware over the past ten years and compared the recent malicious samples with actual top apps on the Google Play Store. Our work depicts typical behaviors of modern malware, its evolution, and how it abuses the native layer to complicate the analysis, especially with dynamic code loading and anti-analysis techniques. Finally, we show a use case for our suspicious tags: we trained and tested a machine learning algorithm for a binary classification task. Even if suspicious does not imply malicious, our classifier obtained a remarkable F1-score of 0.97, showing that our methodology can be helpful to both humans and machines.

Language: English

Citations

0

Modular Unification of Unilingual Pointer Analyses to Multilingual FFI-Based Programs
Jyoti Prakash, Abhishek Tiwari, Christian Hammer

et al.

Science of Computer Programming, Journal Year: 2025, Volume and Issue: unknown, P. 103278 - 103278

Published: Feb. 1, 2025

Language: English

Citations

0

Declarative static analysis for multilingual programs using CodeQL
Dongjun Youn, Sungho Lee, Sukyoung Ryu

et al.

Software Practice and Experience, Journal Year: 2023, Volume and Issue: 53(7), P. 1472 - 1495

Published: March 9, 2023

Summary Declarative static program analysis has become one of the widely-used program analysis techniques. Declarative static analyzers perform three steps: creating databases of facts from program source code, evaluating rules to generate new facts, and running queries over the facts to extract all the information related to specific properties via query systems. Declarative static analyzers can easily target diverse programming languages by modifying only the rules for the languages. Because query systems are independent of languages, they are reusable for different languages. However, even when declarative static analyzers support multiple languages, they do not currently support multilingual programs written in two or more languages. We propose a systematic methodology that extends a declarative static analyzer supporting multiple languages to support multilingual programs as well. The main idea is to reuse existing components of the analyzer. Our approach first generates a merged database of facts consisting of multiple logical language spaces. It allows language-specific rules to derive new facts from only the corresponding language space. Then, it defines language-interoperation rules to handle the language interoperation semantics. Finally, it uses the same query system to get analysis results leveraging the language-interoperation semantics. We develop a proof-of-concept by extending CodeQL, which can track dataflows across language boundaries. Our evaluation shows that our tool successfully tracks dataflows across Java-C and Python-C language boundaries and detects genuine interoperation bugs in real-world multilingual programs.
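The three steps named in this summary (a database of facts, rules that derive new facts, and a query over the result) can be sketched in a few lines of plain Python. This is only an illustration under assumed names, not CodeQL and not the authors' tool: the fact shapes `edge` and `bridge`, the `lang:name` node labels, and the functions `evaluate` and `flows_from` are hypothetical.

```python
def evaluate(facts):
    """Apply the rules repeatedly until no new facts appear (naive fixpoint)."""
    derived = set(facts)
    while True:
        new = set()
        for fact in derived:
            if fact[0] == "edge":
                # Language-specific rule: an edge only yields a flow
                # inside its own language space.
                _, lang, src, dst = fact
                new.add(("flow", f"{lang}:{src}", f"{lang}:{dst}"))
            elif fact[0] == "bridge":
                # Interoperation rule: a bridge links two language spaces.
                _, src, dst = fact
                new.add(("flow", src, dst))
        # Transitivity rule: flow(a, b) and flow(b, d) give flow(a, d).
        flows = [f for f in derived if f[0] == "flow"]
        for _, a, b in flows:
            for _, c, d in flows:
                if b == c:
                    new.add(("flow", a, d))
        if new <= derived:
            return derived
        derived |= new

def flows_from(derived, node):
    """Query step: every node that data from `node` can reach."""
    return {f[2] for f in derived if f[0] == "flow" and f[1] == node}

# Merged database with two logical language spaces (hypothetical facts).
FACTS = {
    ("edge", "java", "source", "arg"),
    ("bridge", "java:arg", "c:param"),
    ("edge", "c", "param", "sink"),
}

with_bridge = flows_from(evaluate(FACTS), "java:source")
without_bridge = flows_from(
    evaluate({f for f in FACTS if f[0] != "bridge"}), "java:source"
)
```

In this sketch the query function is the same whether or not the bridge fact is present, which mirrors the summary's point that the query system is reused; only the interoperation rule makes `c:sink` reachable from `java:source`.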

Language: English

Citations

8

Fuzzing Android Native System Libraries via Dynamic Data Dependency Graph
Xiaogang Zhu, Siyu Zhang, Chaoran Li

et al.

IEEE Transactions on Information Forensics and Security, Journal Year: 2024, Volume and Issue: 19, P. 3733 - 3744

Published: Jan. 1, 2024

Google suggests using only the APIs documented in the Android SDK. However, many app developers still choose the Java Native Interface (JNI) to access system libraries because of the flexibility and freedom that non-SDK methods provide in implementing complex functions. The direct invocation of JNI may have unexpected consequences, including low-level bug-driven crashes. The bugs in system libraries can propagate to apps, and further cost app developers much time and energy to debug them. We develop a fuzzing tool, called JDYNUZZ, that exposes bugs in system libraries to mitigate the aftermath of the direct invocation of JNI. To fuzz a system library, one needs not only to prepare appropriate inputs, but also to deal with the challenge of maintaining the correct sequence of API calls, both syntactically and semantically. To solve the challenge, the crux of JDYNUZZ is the dynamic refinement of a data dependency graph, which gradually resolves the problem of syntactic and semantic incorrectness when constructing API sequences. JDYNUZZ achieves the dynamic refinement based on the feature of Java reflection, which enables us to dynamically modify the API sequences to test different code regions. We evaluate JDYNUZZ on the most recent version of the Android Open Source Project (AOSP), i.e., android-12.0.0_r31. In our experiments, JDYNUZZ discovers 34 new bugs in system libraries, all confirmed by Google.

Language: English

Citations

2

Challenges of Multilingual Program Specification and Analysis
Carlo A. Furia, Abhishek Tiwari

Lecture notes in computer science, Journal Year: 2024, Volume and Issue: unknown, P. 124 - 143

Published: Oct. 29, 2024

Language: English

Citations

2

NCScope: hardware-assisted analyzer for native code in Android apps
Hao Zhou, Shuohan Wu, Xiapu Luo

et al.

Published: July 15, 2022

More and more Android apps implement their functionalities in native code, and so does malware. Although various approaches have been designed to analyze the native code used by apps, they usually generate incomplete and biased results due to their limitations in obtaining and analyzing high-fidelity execution traces and memory data with low overheads. To fill the gap, in this paper, we propose and develop a novel hardware-assisted analyzer for native code in apps. We leverage ETM, a hardware feature of the ARM platform, and eBPF, a kernel component of the Android system, to collect real execution traces and relevant memory data of target apps, and design new methods to scrutinize native code according to the collected data. To show the unique capability of NCScope, we apply it to four applications that cannot be accomplished by existing tools, including systematic studies on self-protection and anti-analysis mechanisms implemented in native code of apps, analysis of memory corruption in native code, and identification of performance differences between functions in native code. The results uncover that only 26.8% of the analyzed financial apps implement self-protection methods in native code, implying that the security of financial apps is far from expected. Meanwhile, 78.3% of the malicious apps under analysis have anti-analysis behaviors, suggesting that NCScope is very useful to malware analysis. Moreover, NCScope can effectively detect bugs in native code and identify performance differences.

Language: English

Citations

10

Detecting and Measuring Aggressive Location Harvesting in Mobile Apps via Data-flow Path Embedding
Haoran Lu, Qingchuan Zhao, Yongliang Chen

et al.

Proceedings of the ACM on Measurement and Analysis of Computing Systems, Journal Year: 2023, Volume and Issue: 7(1), P. 1 - 27

Published: Feb. 27, 2023

Today, location-based services have become prevalent in the mobile platform, where mobile apps provide specific services to a user based on his or her location. Unfortunately, mobile apps can aggressively harvest location data with much higher accuracy and frequency than they need because the coarse-grained access control mechanism currently implemented in mobile operating systems (e.g., Android) cannot regulate such behavior. This unnecessary data collection violates the data minimization policy, yet no previous studies have investigated privacy violations from this perspective, and existing techniques are insufficient to address this violation. To fill this knowledge gap, we take the first step toward detecting and measuring this privacy risk in mobile apps at scale. Particularly, we annotate and release the first dataset to characterize those aggressive location harvesting apps and understand the challenges of automatic detection and classification. Next, we present a novel system, LocationScope, to address these challenges by (i) uncovering how an app collects locations and how it uses such data through a fine-tuned value set analysis technique, (ii) recognizing the fine-grained location-based services an app provides via embedding data-flow paths, which is a combination of program analysis and machine learning techniques, extracted from its location data usages, and (iii) identifying aggressive apps with an outlier detection technique achieving a precision of 97% in aggressive app detection. Our technique has further been applied to millions of free Android apps from Google Play as of 2019 and 2021. Highlights of our measurements on detected aggressive apps include their growing trend from 2019 to 2021 and the app generators' significant contribution of aggressive location harvesting apps.

Language: English

Citations

4