Automatically identifying performance issue reports with heuristic linguistic patterns

Yutong Zhao, Lu Xiao, Pouria Babvey, Lei Sun, Sunny Wong, Angel A. Martinez, Xiao Wang

    Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

    5 Scopus citations

    Abstract

    Performance issues compromise the response time and resource consumption of a software system. Modern software systems use issue tracking systems to manage all kinds of issue reports, including performance issues. The problem is that performance issues are often not explicitly tagged. The tagging mechanism, if it exists, is completely voluntary and depends on the project's conventions and on submitters' discipline. For example, the performance tag rate in Apache's Jira system is below 1%. This paper contributes a hybrid classification approach that combines linguistic patterns and machine/deep learning techniques to automatically detect performance issue reports. We manually analyzed 980 real-life performance issue reports and derived 80 project-agnostic linguistic patterns that recur in the reports. Our approach uses these linguistic patterns to construct sentence-level and issue-level learning features for training effective machine/deep learning classifiers. We test our approach on two separate datasets, each consisting of 980 unclassified issue reports, and compare the results with 31 baseline methods. Our approach can reach up to 83% precision and up to 59% recall. The only comparable baseline method is BERT, whose F1-score is still 25% lower.
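As a rough illustration of how linguistic patterns might be turned into sentence-level and issue-level features, here is a minimal Python sketch. The three regular expressions, the names `PATTERNS`, `sentence_features`, and `issue_features`, and the choice to aggregate by counting are all assumptions made for this example; they are not the 80 patterns or the feature construction described in the paper.

```python
import re

# Hypothetical patterns for illustration only; NOT the paper's 80 patterns.
PATTERNS = {
    "slow_operation": re.compile(r"\b(slow|sluggish|takes? too long)\b", re.I),
    "resource_usage": re.compile(r"\b(memory leak|out of memory|high cpu|cpu usage)\b", re.I),
    "speed_up": re.compile(r"\b(speed up|faster|optimi[sz]e)\b", re.I),
}


def sentence_features(sentence):
    """Return one binary flag per pattern: 1 if the pattern occurs in the sentence."""
    return [1 if pattern.search(sentence) else 0 for pattern in PATTERNS.values()]


def issue_features(report):
    """Aggregate sentence-level flags into per-pattern counts for a whole report.

    Sentences are split naively on ., ! or ? followed by whitespace
    (an assumption; a real pipeline would likely use a proper tokenizer).
    """
    sentences = [s for s in re.split(r"(?<=[.!?])\s+", report.strip()) if s]
    rows = [sentence_features(s) for s in sentences]
    if not rows:
        return [0] * len(PATTERNS)
    # Column-wise sum: how many sentences matched each pattern.
    return [sum(column) for column in zip(*rows)]


if __name__ == "__main__":
    report = "The query is slow. It causes a memory leak. Please fix."
    print(issue_features(report))
```

Vectors like these could then be fed to any standard classifier; which classifier and which aggregation work best is an empirical question that the paper itself addresses.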

    Original language: English
    Title of host publication: ESEC/FSE 2020 - Proceedings of the 28th ACM Joint Meeting European Software Engineering Conference and Symposium on the Foundations of Software Engineering
    Editors: Prem Devanbu, Myra Cohen, Thomas Zimmermann
    Pages: 964-975
    Number of pages: 12
    ISBN (Electronic): 9781450370431
    DOIs
    State: Published - 8 Nov 2020
    Event: 28th ACM Joint Meeting European Software Engineering Conference and Symposium on the Foundations of Software Engineering, ESEC/FSE 2020 - Virtual, Online, United States
    Duration: 8 Nov 2020 - 13 Nov 2020

    Publication series

    Name: ESEC/FSE 2020 - Proceedings of the 28th ACM Joint Meeting European Software Engineering Conference and Symposium on the Foundations of Software Engineering

    Conference

    Conference: 28th ACM Joint Meeting European Software Engineering Conference and Symposium on the Foundations of Software Engineering, ESEC/FSE 2020
    Country/Territory: United States
    City: Virtual, Online
    Period: 8/11/20 - 13/11/20

    Keywords

    • Performance optimization
    • Software performance
    • Software repositories mining
