Discovering API usage specifications for security detection using two-stage code mining (Creative Commons)
Zhongxu Yin, Yi-Ran Song, Guoxiao Zong et al.

Cybersecurity, Journal Year: 2024, Volume and Issue: 7(1)

Published: Oct. 3, 2024

Abstract: An application programming interface (API) usage specification, which includes the conditions, calling sequences, and semantic relationships of an API, is important for verifying its correct usage, which is in turn critical for ensuring the security and availability of the target program. However, existing techniques either mine co-occurring multiple APIs without considering their relationships, or use data flow and control flow information to extract beliefs on API pairs but are difficult to incorporate when mining specifications for multiple APIs. Hence, we propose an API specification mining approach that efficiently extracts a relatively complete list of API combinations and the relationships between them. This approach analyzes the program in two stages. The first stage uses frequent API set mining, based on common API identification and filtration, to extract maximal context-sensitive API sequences. In the second stage, an API relationship graph is constructed using three relationships extracted from symbolic path information, and specifications containing these relationships are mined. Experimental results on six popular open-source code bases of different scales show that the proposed two-stage approach not only yields better results than typical approaches, but can also effectively discover specifications along with the relationships among APIs. Instance analysis shows that security-related API call violations can assist in analyzing the cause of, and patching, software vulnerabilities.

Language: English

Seal: Towards Diverse Specification Inference for Linux Interfaces from Security Patches
Wei Chen, Bowen Zhang, Chengpeng Wang et al.

Published: March 26, 2025

Language: English

Citations

0

Let’s Discover More API Relations: A Large Language Model-based AI Chain for Unsupervised API Relation Inference
Qing Huang, Yanbang Sun, Zhenchang Xing et al.

ACM Transactions on Software Engineering and Methodology, Journal Year: 2024, Volume and Issue: unknown

Published: July 23, 2024

Abstract: APIs have intricate relations that can be described in text and represented as knowledge graphs to aid software engineering tasks. Existing relation extraction methods have limitations, such as a limited API corpus and sensitivity to the characteristics of the input text. To address these limitations, we propose utilizing large language models (LLMs) (e.g., gpt-3.5) as a neural knowledge base for API relation inference. This approach leverages the entire Web used to pre-train LLMs and is insensitive to the context and complexity of input texts. To ensure accurate inference, we design an AI chain consisting of three modules: API Fully Qualified Name (FQN) Parser, API Knowledge Extractor, and API Relation Decider. The accuracy of the FQN Parser and the Relation Decider is 0.81 and 0.83, respectively. Using the generative capacity of the LLM and our approach’s inference capability, we achieve an average F1 value of 0.76 across the datasets, significantly higher than the state-of-the-art method’s 0.40. Compared to the original CoT and modularized CoT methods, our AI chain design has improved performance by 71% and 49%, respectively. Meanwhile, the prompt ensembling strategy enhances performance by 32%. The API relations inferred by our method can be further organized into structured forms to provide support for other software engineering tasks.

Language: English

Citations

1

Leveraging Large Language Model to Assist Detecting Rust Code Comment Inconsistency
Yichi Zhang, Zixi Liu, Yang Feng et al.

Published: Oct. 18, 2024

Language: English

Citations

1

Adaptoring: Adapter Generation to Provide an Alternative API for a Library
Lars M Reimann, Günter Kniesel-Wünsche

2022 IEEE International Conference on Software Analysis, Evolution and Reengineering (SANER), Journal Year: 2024, Volume and Issue: unknown, P. 192 - 203

Published: March 12, 2024

Language: English

Citations

0
