TY - JOUR
T1 - Comparing commit messages and source code metrics for the prediction refactoring activities
AU - Sagar, Priyadarshni Suresh
AU - Alomar, Eman Abdulah
AU - Mkaouer, Mohamed Wiem
AU - Ouni, Ali
AU - Newman, Christian D.
N1 - Publisher Copyright: © 2021 by the authors. Licensee MDPI, Basel, Switzerland.
PY - 2021/10
Y1 - 2021/10
N2 - Understanding how developers refactor their code is critical to support the design improvement process of software. This paper investigates to what extent code metrics are good indicators for predicting refactoring activity in the source code. To this end, we formulated the prediction of refactoring operation types as a multi-class classification problem. Our solution relies on measuring metrics extracted from committed code changes to derive the features (i.e., metric variations) that best represent each class (i.e., refactoring type), so as to automatically predict, for a given commit, the method-level type of refactoring being applied, namely Move Method, Rename Method, Extract Method, Inline Method, Pull-up Method, and Push-down Method. We compared various classifiers, in terms of their prediction performance, using a dataset of 5004 commits extracted from 800 Java projects. Our main findings show that the random forest model trained with code metrics resulted in the best average accuracy of 75%. However, we detected a variation in the results per class, which means that some refactoring types are harder to detect than others.
AB - Understanding how developers refactor their code is critical to support the design improvement process of software. This paper investigates to what extent code metrics are good indicators for predicting refactoring activity in the source code. To this end, we formulated the prediction of refactoring operation types as a multi-class classification problem. Our solution relies on measuring metrics extracted from committed code changes to derive the features (i.e., metric variations) that best represent each class (i.e., refactoring type), so as to automatically predict, for a given commit, the method-level type of refactoring being applied, namely Move Method, Rename Method, Extract Method, Inline Method, Pull-up Method, and Push-down Method. We compared various classifiers, in terms of their prediction performance, using a dataset of 5004 commits extracted from 800 Java projects. Our main findings show that the random forest model trained with code metrics resulted in the best average accuracy of 75%. However, we detected a variation in the results per class, which means that some refactoring types are harder to detect than others.
KW - Commits
KW - Refactoring
KW - Software engineering
KW - Software metrics
KW - Software quality
UR - http://www.scopus.com/inward/record.url?scp=85116677802&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85116677802&partnerID=8YFLogxK
U2 - 10.3390/a14100289
DO - 10.3390/a14100289
M3 - Article
AN - SCOPUS:85116677802
VL - 14
JO - Algorithms
JF - Algorithms
IS - 10
M1 - 289
ER -