Fine-Grained Just-In-Time Defect Prediction at the Block Level in Infrastructure-as-Code (IaC)

Mahi Begoug, Moataz Chouchen, Ali Ouni, Eman Abdullah Alomar, Mohamed Wiem Mkaouer

Research output: Chapter in Book/Report/Conference proceeding, Conference contribution, peer-review

Abstract

Infrastructure-as-Code (IaC) is an emerging software engineering practice that leverages source code to facilitate automated configuration of software systems' infrastructure. IaC files are typically complex, containing hundreds of lines of code and dependencies, making them prone to defects, which can result in breaking online services at scale. To help developers identify and fix IaC defects early, research efforts have introduced IaC defect prediction models at the file level. However, the granularity of the proposed approaches remains coarse, requiring developers to inspect hundreds of lines of code in a file when only a small fragment of code is defective. To alleviate this issue, we introduce a machine-learning-based approach to predict IaC defects at a fine-grained level, focusing on IaC blocks, i.e., small code units that encapsulate specific behaviours within an IaC file. We trained various machine learning algorithms on a mixture of code, process, and change-level metrics. We evaluated our approach on 19 open-source projects that use Terraform, a widely used IaC tool. The results indicate that no single algorithm consistently outperforms the others across the 19 projects. Overall, among the six algorithms, the LightGBM model achieved the highest average performance, with an MCC of 0.21 and an AUC of 0.71. Model analysis reveals that the developer's experience and the relative number of added lines tend to be the most important features. Additionally, we found that blocks belonging to the most frequent types are more prone to defects. Our defect prediction models have also shown sensitivity to concept drift, indicating that IaC practitioners should regularly retrain their models.
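
For readers unfamiliar with the MCC (Matthews correlation coefficient) figure quoted in the abstract, the following is a minimal sketch of how the metric is computed from a binary confusion matrix. The function name `mcc` and the example counts are illustrative assumptions, not values or code from the paper.

```python
import math


def mcc(tp: int, tn: int, fp: int, fn: int) -> float:
    """Matthews correlation coefficient for a binary confusion matrix.

    Ranges from -1 (total disagreement) through 0 (no better than chance)
    to +1 (perfect prediction).
    """
    numerator = tp * tn - fp * fn
    denominator = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denominator == 0:
        # Common convention: return 0 when any marginal total is zero.
        return 0.0
    return numerator / denominator


# Hypothetical counts, chosen only to illustrate the calculation:
# 20 defective blocks caught, 60 clean blocks passed,
# 10 false alarms, 10 missed defects.
print(round(mcc(tp=20, tn=60, fp=10, fn=10), 2))
```

Because MCC uses all four cells of the confusion matrix, it stays informative when defective blocks are a small minority of the data, which is one reason it is commonly reported alongside AUC in defect prediction studies.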

Original language: English
Title of host publication: Proceedings - 2024 IEEE/ACM 21st International Conference on Mining Software Repositories, MSR 2024
Pages: 100-112
Number of pages: 13
ISBN (Electronic): 9798400705878
State: Published - 2024
Event: 21st IEEE/ACM International Conference on Mining Software Repositories, MSR 2024 - Lisbon, Portugal
Duration: 15 Apr 2024 to 16 Apr 2024

Publication series

Name: Proceedings - 2024 IEEE/ACM 21st International Conference on Mining Software Repositories, MSR 2024

Conference

Conference: 21st IEEE/ACM International Conference on Mining Software Repositories, MSR 2024
Country/Territory: Portugal
City: Lisbon
Period: 15/04/24 to 16/04/24

Keywords

  • Defect Prediction
  • IaC
  • Infrastructure-as-Code
  • Terraform
