AntiCopyPaster: Extracting Code Duplicates As Soon As They Are Introduced in the IDE

Eman Abdullah Alomar, Anton Ivanov, Zarina Kurbatova, Yaroslav Golubev, Mohamed Wiem Mkaouer, Ali Ouni, Timofey Bryksin, Le Nguyen, Amit Kini, Aditya Thakur

Research output: Chapter in Book/Report/Conference proceedingConference contributionpeer-review

1 Scopus citations

Abstract

We developed a plugin for IntelliJ IDEA called AntiCopyPaster, which tracks the pasting of code fragments inside the IDE and suggests the appropriate Extract Method refactoring to combat the propagation of duplicates. Unlike the existing approaches, our tool is integrated with the developer's workflow, and pro-actively recommends refactorings. Since not all code fragments need to be extracted, we develop a classification model to make this decision. When a developer copies and pastes a code fragment, the plugin searches for duplicates in the currently opened file, waits for a short period of time to allow the developer to edit the code, and finally inferences the refactoring decision based on a number of features. Our experimental study on a large dataset of 18,942 code fragments mined from 13 Apache projects shows that AntiCopyPaster correctly recommends Extract Method refactorings with an F-score of 0.82. Furthermore, our survey of 59 developers reflects their satisfaction with the developed plugin's operation. The plugin and its source code are publicly available on GitHub at https://github.com/JetBrains-Research/anti-copy-paster. The demonstration video can be found on YouTube: https://youtu.be/-wwHg-qFjJY.

Original languageEnglish
Title of host publication37th IEEE/ACM International Conference on Automated Software Engineering, ASE 2022
EditorsMario Aehnelt, Thomas Kirste
ISBN (Electronic)9781450396240
DOIs
StatePublished - 19 Sep 2022
Event37th IEEE/ACM International Conference on Automated Software Engineering, ASE 2022 - Rochester, United States
Duration: 10 Oct 202214 Oct 2022

Publication series

NameACM International Conference Proceeding Series

Conference

Conference37th IEEE/ACM International Conference on Automated Software Engineering, ASE 2022
Country/TerritoryUnited States
CityRochester
Period10/10/2214/10/22

Keywords

  • machine learning
  • refactoring
  • software quality

Fingerprint

Dive into the research topics of 'AntiCopyPaster: Extracting Code Duplicates As Soon As They Are Introduced in the IDE'. Together they form a unique fingerprint.

Cite this