Intelligent Code Review Assignment for Large Scale Open Source Software Stacks

Ishan Aryendu, Ying Wang, Farah Elkourdi, Eman Abdullah Alomar

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

1 Scopus citations

Abstract

Code review is a crucial part of software development: by identifying problems before they reach production, it enhances code quality. Finding the best reviewer for a code change, however, is extremely challenging, especially in large-scale, open-source software stacks with cross-functional designs and collaboration among multiple developers and teams. Moreover, a review by someone who lacks knowledge and understanding of the code can result in high resource consumption and technical errors. Reviewers with expertise in both the functional (domain knowledge) and non-functional aspects of a commit are the most qualified to review changes to the code. Quality attributes serve as the connection among user requirements, delivered functionality, software architecture, and implementation throughout the entire software stack life cycle. In this study, we target automatic reviewer assignment in large-scale software stacks and aim to build a self-learning, self-correcting platform for intelligently matching a commit, based on its quality attributes, with the skill sets of reviewers. To achieve this, quality attributes are classified and abstracted from commit messages, and based on these, commits are assigned to reviewers capable of reviewing them. We first designed machine learning schemes for abstracting quality attributes from historical data in the OpenStack repository. Two models are built and trained to automate the classification of commits by their quality attributes, using manually labeled commits and multi-class classifiers. We then positioned reviewers based on their historical data and quality-attribute characteristics. Finally, we selected the recommended reviewer based on the distance between a commit and the candidate reviewers.
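The classification step described above can be illustrated with a minimal sketch. The abstract does not detail the classifiers used, so this example stands in with a TF-IDF + logistic regression multi-class pipeline over a handful of hypothetical manually labeled commit messages; the commit texts and the three quality-attribute labels here are illustrative, not from the paper's dataset.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

# Hypothetical manually labeled commit messages; each label is a
# quality attribute abstracted from the commit text.
commits = [
    "cache query results to cut latency",
    "reduce memory footprint of the scheduler",
    "sanitize user input to prevent injection",
    "rotate API tokens on expiry",
    "retry failed RPCs with exponential backoff",
    "add failover for the message queue",
]
labels = ["performance", "performance",
          "security", "security",
          "reliability", "reliability"]

# Multi-class classifier: bag-of-words features -> quality attribute.
clf = make_pipeline(TfidfVectorizer(), LogisticRegression(max_iter=1000))
clf.fit(commits, labels)

# Classify an unseen commit message.
print(clf.predict(["validate request headers to block injection attacks"])[0])
```

In the paper's pipeline this lexical featurizer would be replaced by MPNet sentence embeddings; the toy version only shows the shape of the multi-class mapping from commit message to quality attribute.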
In this paper, we demonstrate how the models choose the best quality attributes and assign the code review to the most qualified reviewers. With a comparatively small training dataset, the models achieve F1 scores of 77% and 85.31%, respectively.
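The final matching step — recommending the reviewer closest to a commit — can be sketched as follows. The abstract says only that the recommendation is "based on the distance between a commit and candidate reviewers", so this sketch assumes cosine distance over sentence embeddings, with a reviewer's profile taken as the mean embedding of commits they previously reviewed; the tiny 3-dimensional vectors stand in for real MPNet embeddings and the reviewer names are hypothetical.

```python
import numpy as np

def cosine_distance(a, b):
    # 1 - cosine similarity between two embedding vectors.
    return 1.0 - np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b))

def reviewer_profiles(history):
    # history: reviewer -> embeddings of commits they previously reviewed.
    # A reviewer's profile is the mean of those embeddings.
    return {r: np.mean(vecs, axis=0) for r, vecs in history.items()}

def recommend(commit_vec, profiles):
    # Recommend the reviewer whose profile is nearest to the new commit.
    return min(profiles, key=lambda r: cosine_distance(commit_vec, profiles[r]))

# Hypothetical 3-d embeddings standing in for MPNet sentence vectors.
history = {
    "alice": [np.array([0.9, 0.1, 0.0]), np.array([0.8, 0.2, 0.0])],  # performance work
    "bob":   [np.array([0.0, 0.1, 0.9]), np.array([0.1, 0.0, 0.8])],  # security work
}
profiles = reviewer_profiles(history)

new_commit = np.array([0.85, 0.15, 0.05])  # embeds like a performance change
print(recommend(new_commit, profiles))  # → alice
```

Averaging a reviewer's past commit embeddings is one simple way to "position" reviewers in the same space as commits; the paper's platform also updates these positions from historical data, which this static sketch omits.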

Original language: English
Title of host publication: 37th IEEE/ACM International Conference on Automated Software Engineering, ASE 2022
Editors: Mario Aehnelt, Thomas Kirste
ISBN (Electronic): 9781450396240
DOIs
State: Published - 19 Sep 2022
Event: 37th IEEE/ACM International Conference on Automated Software Engineering, ASE 2022 - Rochester, United States
Duration: 10 Oct 2022 → 14 Oct 2022

Publication series

Name: ACM International Conference Proceeding Series

Conference

Conference: 37th IEEE/ACM International Conference on Automated Software Engineering, ASE 2022
Country/Territory: United States
City: Rochester
Period: 10/10/22 → 14/10/22

Keywords

  • Code Review
  • Commit Classification
  • Large-scale
  • MPNet
  • Machine Learning
  • Open-source
