A Platform-Agnostic Framework for Automatically Identifying Performance Issue Reports with Heuristic Linguistic Patterns

Yutong Zhao, Lu Xiao, Sunny Wong

Research output: Contribution to journal › Article › peer-review

Abstract

Software performance is critical for system efficiency, with performance issues potentially resulting in budget overruns, project delays, and market losses. Such problems are reported to developers through issue tracking systems, where reports are often under-tagged because manual tagging is voluntary and time-consuming. Existing automated performance issue tagging techniques, such as keyword matching and machine/deep learning models, struggle due to imbalanced datasets and a high degree of variance. This paper presents a novel hybrid classification approach that combines Heuristic Linguistic Patterns (HLPs) with machine/deep learning models to enable practitioners to automatically identify performance-related issues. The proposed approach works across three progressive levels (HLP tagging, sentence tagging, and issue tagging), with a focus on linguistic analysis of issue descriptions. The authors evaluate the approach on three datasets collected from different projects and issue-tracking platforms to demonstrate that the proposed framework is accurate, project- and platform-agnostic, and robust to imbalanced datasets. The study also examines how the framework's two unique techniques, fuzzy HLP matching and the Issue HLP Matrix, contribute to its accuracy. Finally, the study explores the effectiveness and impact of two off-the-shelf feature selection techniques, Boruta and RFE, when used with the proposed framework. The results show that the proposed framework has great potential for practitioners to accurately identify performance issues (with up to 100% precision, 66% recall, and 79% F1-score), with robustness to imbalanced data and good transferability to new projects and issue tracking platforms.
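
For readers checking the reported figures: the F1-score is the harmonic mean of precision and recall. The short sketch below is illustrative only and is not taken from the paper; the function name `f1_score` is an assumption made for this example. With the rounded values quoted in the abstract (precision 1.00, recall 0.66) it gives roughly 0.795, which is in line with the reported 79% once rounding in the published figures is allowed for.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (hypothetical helper, not from the paper)."""
    if precision + recall == 0:
        # Convention: F1 is defined as 0 when both inputs are 0.
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Rounded best-case figures quoted in the abstract.
best_case = f1_score(precision=1.00, recall=0.66)
print(f"F1 = {best_case:.3f}")
```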

Original language: English
Pages (from-to): 1704-1725
Number of pages: 22
Journal: IEEE Transactions on Software Engineering
Volume: 50
Issue number: 7
State: Published - 2024

Keywords

  • automatic text classification
  • linguistic pattern
  • software performance
  • software repository mining
