TY - JOUR
T1 - Just-in-time code duplicates extraction
AU - AlOmar, Eman Abdullah
AU - Ivanov, Anton
AU - Kurbatova, Zarina
AU - Golubev, Yaroslav
AU - Mkaouer, Mohamed Wiem
AU - Ouni, Ali
AU - Bryksin, Timofey
AU - Nguyen, Le
AU - Kini, Amit
AU - Thakur, Aditya
N1 - Publisher Copyright: © 2023 Elsevier B.V.
PY - 2023/6
Y1 - 2023/6
AB - Context: Refactoring is a critical task in software maintenance and is usually performed to enforce better design and coding practices while coping with design defects. The Extract Method refactoring is widely used for merging duplicate code fragments into a single new method. Several studies have attempted to recommend Extract Method refactoring opportunities using different techniques, including program slicing, program dependency graph analysis, change history analysis, structural similarity, and feature extraction. However, irrespective of the method, most existing approaches interfere with the developer's workflow: they require the developer to stop coding and analyze the suggested opportunities, and they consider all refactoring suggestions in the entire project without focusing on the development context. Objective: To increase the adoption of the Extract Method refactoring, in this paper we aim to investigate the effectiveness of machine learning and deep learning algorithms for its recommendation while maintaining the workflow of the developer. Method: The proposed approach relies on mining previously applied Extract Method refactorings and extracting their features to train a deep learning classifier that detects them in the user's code. We implemented our approach as a plugin for IntelliJ IDEA called ANTICOPYPASTER. To develop our approach, we trained and evaluated various popular models on a dataset of 18,942 code fragments from 13 open-source Apache projects. Results: The results show that the best model is the Convolutional Neural Network (CNN), which recommends appropriate Extract Method refactorings with an F-measure of 0.82. We also conducted a qualitative study with 72 developers to evaluate the usefulness of the developed plugin. Conclusion: The results show that developers tend to appreciate the idea of the approach and are satisfied with various aspects of the plugin's operation.
KW - Machine learning
KW - Refactoring
KW - Software quality
UR - http://www.scopus.com/inward/record.url?scp=85147970363&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85147970363&partnerID=8YFLogxK
U2 - 10.1016/j.infsof.2023.107169
DO - 10.1016/j.infsof.2023.107169
M3 - Article
AN - SCOPUS:85147970363
SN - 0950-5849
VL - 158
JO - Information and Software Technology
JF - Information and Software Technology
M1 - 107169
ER -