Assessing Architecture Conformance to Security-Related Practices in Infrastructure as Code Based Deployments

Evangelos Ntentos, Uwe Zdun, Ghareeb Falazi

et al.

Published: July 1, 2022

Infrastructure as Code (IaC) enables developers and operations teams to automatically deploy and manage an IT infrastructure via software. Among other uses, IaC is widely used in the context of continuously released deployments such as those of microservice and cloud-based systems. Although IaC-based deployments have been utilized by many companies, there are no approaches on checking their conformance to architectural aspects yet. In this paper, we focus on security-related practices including observability, access control, and traffic control in IaC-based deployments. While best practices for this topic have been documented in some gray literature sources such as practitioners' blogs and public repositories, approaches enabling automated checking of conformance to such best practices do not yet exist. We propose a model-based approach based on generic, technology-independent metrics, tied to typical architectural design decisions on IaC-based deployments. With this approach, we can measure conformance to security-related practices. We demonstrate and assess the validity and appropriateness of these metrics in assessing a system's conformance to practices through regression analysis.

Language: English

The do’s and don’ts of infrastructure code: A systematic gray literature review
Indika Kumara, Martín Garriga, Angel Urbano Romeu

et al.

Information and Software Technology, Journal Year: 2021, Volume and Issue: 137, P. 106593 - 106593

Published: April 29, 2021

Infrastructure-as-code (IaC) is the DevOps tactic of managing and provisioning software infrastructures through machine-readable definition files, rather than manual hardware configuration or interactive configuration tools. From a maintenance and evolution perspective, the topic has picked the interest of practitioners and academics alike, given the relative scarcity of supporting patterns and practices in the academic literature. At the same time, a considerable amount of gray literature exists on IaC. Thus we aim to characterize IaC and compile a catalog of best and bad practices for widely used IaC languages, all using gray literature materials. In this paper, we systematically analyze the industrial gray literature on IaC, such as blog posts, tutorials, and white papers, using qualitative analysis techniques. We proposed a definition for IaC and distilled a broad catalog summarized in a taxonomy consisting of 10 and 4 primary categories for best practices and bad practices, respectively, both language-agnostic and language-specific ones, for three IaC languages, namely Ansible, Puppet, and Chef. The practices reflect implementation issues, design issues, and the violation of/adherence to the essential principles of IaC. Our findings reveal critical insights concerning the top languages as well as the best practices adopted by practitioners to address (some of) those challenges. We evidence that the field of development and maintenance of IaC is in its infancy and deserves further attention.

Language: English

Citations

38

Smelly variables in Ansible infrastructure code
Ruben Opdebeeck, Ahmed Zerouali, Coen De Roover

et al.

Published: May 23, 2022

Infrastructure as Code is the practice of automating the provisioning, configuration, and orchestration of network nodes using code in which variable values such as configuration parameters, node hostnames, etc. play a central role. Mistakes in these values are an important cause of infrastructure defects and corresponding outages. Ansible, a popular IaC language, nonetheless features semantics which can cause confusion about the value of variables.

Language: English

Citations

23

Control and Data Flow in Security Smell Detection for Infrastructure as Code: Is It Worth the Effort?
Ruben Opdebeeck, Ahmed Zerouali, Coen De Roover

et al.

Published: May 1, 2023

Infrastructure as Code is the practice of developing and maintaining computing infrastructure through executable source code. Unfortunately, IaC has also brought about new cyber attack vectors. Prior work has therefore proposed static analyses that detect security smells in IaC files. However, they have so far remained at a shallow level, disregarding the control and data flow of the scripts under analysis, and may lack awareness of specific syntactic constructs. These limitations inhibit the quality of their results. To address these limitations, in this paper, we present GASEL, a novel security smell detector for the Ansible IaC language. It uses graph queries on program dependence graphs to detect 7 security smells. Our evaluation on an oracle of 243 real-world security smells and comparison against two state-of-the-art security smell detectors shows that awareness of syntax, control flow, and data flow enables our approach to substantially improve both precision and recall. We further question whether the additional effort required to develop and run such an approach is justified in practice. To this end, we investigate the prevalence of indirection through control and data flow in security smells across more than 15 000 Ansible scripts. We find that over 55% of security smells contain data-flow indirection, and over 32% require a whole-project analysis to detect. These findings motivate the need for deeper static analysis tools to detect security vulnerabilities in IaC.

Language: English

Citations

13

GLITCH: Automated Polyglot Security Smell Detection in Infrastructure as Code
Nuno Saavedra, João F. Ferreira

Published: Oct. 10, 2022

Infrastructure as Code (IaC) is the process of managing IT infrastructure via programmable configuration files (also called IaC scripts). Like other software artifacts, IaC scripts may contain security smells, which are coding patterns that can result in security weaknesses. Automated analysis tools to detect security smells in IaC scripts exist, but they focus on specific technologies such as Puppet, Ansible, or Chef. This means that when the detection of a new smell is implemented in one of the tools, it is not immediately available for the technologies supported by the other tools; the only option is to duplicate the effort.

Language: English

Citations

14

State Reconciliation Defects in Infrastructure as Code
Md Mahadi Hassan, John Salvador, Shubhra Kanti Karmaker

et al.

Proceedings of the ACM on Software Engineering, Journal Year: 2024, Volume and Issue: 1(FSE), P. 1865 - 1888

Published: July 12, 2024

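In infrastructure as code (IaC), state reconciliation is the process of querying and comparing the infrastructure state prior to changing the infrastructure. As state reconciliation is pivotal to manage IaC-based computing infrastructure at scale, defects related to state reconciliation can create large-scale consequences. A categorization of state reconciliation defects, i.e., defects related to state reconciliation, can aid in understanding the nature of state reconciliation defects. We conduct an empirical study with 5,110 state reconciliation defects where we apply qualitative analysis to categorize state reconciliation defects. From the identified defect categories, we derive heuristics to design prompts for a large language model (LLM), which in turn are used for validation of state reconciliation. From our empirical study, we identify 8 categories of state reconciliation defects, amongst which 3 have not been reported for previously-studied software systems. The most frequently occurring defect category is inventory, i.e., the category of defects that occur when managing infrastructure inventory. Using an LLM with heuristics-based paragraph style prompts, we identify 9 previously unknown state reconciliation defects, of which 7 have been accepted as valid defects and 4 have already been fixed. Based on our findings, we conclude the paper by providing a set of recommendations for researchers and practitioners.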

Language: English

Citations

2

FindICI: Using machine learning to detect linguistic inconsistencies between code and natural language descriptions in infrastructure-as-code
Nemania Borovits, Indika Kumara, Dario Di Nucci

et al.

Empirical Software Engineering, Journal Year: 2022, Volume and Issue: 27(7)

Published: Sept. 20, 2022

Linguistic anti-patterns are recurring poor practices concerning inconsistencies in the naming, documentation, and implementation of an entity. They impede the readability, understandability, and maintainability of source code. This paper attempts to detect linguistic anti-patterns in Infrastructure-as-Code (IaC) scripts used to provision and manage computing environments. In particular, we consider inconsistencies between the logic/body of IaC code units and their short text names. To this end, we propose FindICI, a novel automated approach that employs word embedding and classification algorithms. We build and use the abstract syntax tree of IaC code units to create code embeddings used by machine learning techniques to detect inconsistent IaC code units. We evaluated our approach with two experiments on Ansible tasks systematically extracted from open source repositories for various word embedding models and classification algorithms. Classical machine learning models and deep learning models with different word embedding methods showed comparable and satisfactory results in detecting inconsistencies for Ansible tasks related to the top-10 used Ansible modules.

Language: English

Citations

10

Automatically detecting risky scripts in infrastructure code
Ting Dai, Alexei Karve, Grzegorz Koper

et al.

Published: Oct. 12, 2020

Infrastructure code supports embedded scripting languages such as Shell and PowerShell to manage the infrastructure resources and conduct life-cycle operations. Risky patterns in the embedded scripts have widespread negative impacts across the whole infrastructure, causing disastrous consequences. In this paper, we propose an analysis framework, which can automatically extract and compose the embedded scripts from infrastructure code before detecting their risky code patterns with correlated severity levels and negative impacts. We implement SecureCode based on the proposed framework to check infrastructure code supported by Ansible, i.e., Ansible playbooks. We integrate SecureCode with the DevOps pipeline deployed in IBM cloud and test SecureCode on 45 IBM Services community repositories. Our evaluation shows that SecureCode can efficiently and effectively identify 3419 true issues with 116 false positives in minutes. Among the true issues, 1691 have high severity levels.

Language: English

Citations

15

Andromeda: A Dataset of Ansible Galaxy Roles and Their Evolution
Ruben Opdebeeck, Ahmed Zerouali, Coen De Roover

et al.

Published: May 1, 2021

Cloud-native applications increasingly provision infrastructure resources programmatically through Infrastructure as Code (IaC) scripts. These scripts have in turn become the subject of empirical software engineering research. However, an often-overlooked part of IaC are the ecosystems that have grown around IaC languages. For example, Ansible Galaxy is the ecosystem for the popular Ansible IaC language. Ansible Galaxy features a large number of so-called "roles", which are reusable collections of Ansible code akin to libraries for general-purpose languages. In contrast to, and despite their similarities, such ecosystems have enjoyed far less attention in the literature than library ecosystems. In this data showcase paper, we present Andromeda, the first dataset capturing the Ansible Galaxy ecosystem, its roles, and their evolution. Andromeda provides structural representations for more than 125 000 role versions, and upwards of 800 000 concrete changes between role versions extracted from the underlying git repositories. Andromeda aims to provide an extensive view on the contributor side of the Ansible ecosystem, and we hope it will stimulate additional research on IaC ecosystems.

Language: English

Citations

13

Finding broken Linux configuration specifications by statically analyzing the Kconfig language
Jeho Oh, Necip Fazıl Yıldıran, Julian Braha

et al.

Published: Aug. 18, 2021

Highly-configurable software underpins much of our computing infrastructure. It enables extensive reuse, but opens the door to broken configuration specifications. The configuration specification language, Kconfig, is designed to prevent invalid configurations of the Linux kernel from being built. However, the astronomical size of the configuration space for Linux makes finding specification bugs difficult by hand or with random testing. In this paper, we introduce a software model checking framework for building Kconfig static analysis tools. We develop a formal semantics of the Kconfig language and implement the semantics in a symbolic evaluator called kclause that models Kconfig behavior as logical formulas. We then design and implement a bug finder, called kismet, that takes kclause models and leverages automated theorem proving to find unmet dependency bugs. kismet is evaluated for its precision, performance, and impact on kernel development for a recent version of Linux, which has over 140,000 lines of Kconfig across 28 architecture-specific specifications. Our evaluation finds 781 bugs (151 when considering sharing among Kconfig specifications) with 100% precision, spending between 37 and 90 minutes for each Kconfig specification, although it misses some bugs due to underapproximation. Compared to random testing, kismet finds substantially more true positive bugs in a fraction of the time.

Language: English

Citations

11

Polyglot Code Smell Detection for Infrastructure as Code with GLITCH
Nuno Saavedra, João Gonçalves, Miguel Henriques

et al.

2023 38th IEEE/ACM International Conference on Automated Software Engineering (ASE), Journal Year: 2023, Volume and Issue: unknown, P. 2042 - 2045

Published: Sept. 11, 2023

This paper presents GLITCH, a new technology-agnostic framework that enables automated polyglot code smell detection for Infrastructure as Code scripts. GLITCH uses an intermediate representation on which different code smell detectors can be defined. It currently supports the detection of nine security smells and of design & implementation smells in scripts written in Ansible, Chef, Docker, Puppet, or Terraform. Studies conducted with GLITCH not only show that GLITCH can reduce the effort of writing code smell analyses for multiple IaC technologies, but also that it has higher precision and recall than current state-of-the-art tools. A video describing and demonstrating GLITCH is available at: https://youtu.be/E4RhCcZjWbk.

Language: English

Citations

4