TY - GEN
T1 - Label-Less
T2 - 2019 IEEE Conference on Computer Communications, INFOCOM 2019
AU - Zhao, Nengwen
AU - Zhu, Jing
AU - Liu, Rong
AU - Liu, Dapeng
AU - Zhang, Ming
AU - Pei, Dan
N1 - Publisher Copyright: © 2019 IEEE.
PY - 2019/4
Y1 - 2019/4
N2 - KPI (Key Performance Indicator) anomaly detection is critical for Internet-based services to ensure quality and reliability. However, the performance of existing algorithms in practice is far from satisfactory due to the lack of sufficient KPI anomaly data to train and evaluate them. In this paper, we argue that labeling overhead is the main hurdle to obtaining such datasets. We therefore propose a semi-automatic labeling tool called Label-Less, which minimizes the labeling overhead in order to enable an ImageNet-like large-scale KPI anomaly dataset with high-quality ground truth. One novel technique in Label-Less is robust and rapid anomaly similarity search, which saves operators from scanning long KPIs back and forth to check for abnormal patterns or label consistency. In our evaluations using 30 real KPIs from a large Internet company, our anomaly similarity search achieves the best F-score of 0.95 on average, and a real-time per-KPI response time (less than 0.5 seconds). Overall, feedback from deployment in practice shows that Label-Less can reduce operators' labeling overhead by more than 90%.
AB - KPI (Key Performance Indicator) anomaly detection is critical for Internet-based services to ensure quality and reliability. However, the performance of existing algorithms in practice is far from satisfactory due to the lack of sufficient KPI anomaly data to train and evaluate them. In this paper, we argue that labeling overhead is the main hurdle to obtaining such datasets. We therefore propose a semi-automatic labeling tool called Label-Less, which minimizes the labeling overhead in order to enable an ImageNet-like large-scale KPI anomaly dataset with high-quality ground truth. One novel technique in Label-Less is robust and rapid anomaly similarity search, which saves operators from scanning long KPIs back and forth to check for abnormal patterns or label consistency. In our evaluations using 30 real KPIs from a large Internet company, our anomaly similarity search achieves the best F-score of 0.95 on average, and a real-time per-KPI response time (less than 0.5 seconds). Overall, feedback from deployment in practice shows that Label-Less can reduce operators' labeling overhead by more than 90%.
UR - http://www.scopus.com/inward/record.url?scp=85068234847&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85068234847&partnerID=8YFLogxK
U2 - 10.1109/INFOCOM.2019.8737429
DO - 10.1109/INFOCOM.2019.8737429
M3 - Conference contribution
AN - SCOPUS:85068234847
T3 - Proceedings - IEEE INFOCOM
SP - 1882
EP - 1890
BT - INFOCOM 2019 - IEEE Conference on Computer Communications
Y2 - 29 April 2019 through 2 May 2019
ER -