STG2P: A two-stage pipeline model for intrusion detection based on improved LightGBM and K-means

Zhiqiang Zhang, Le Wang, Guangyao Chen, Zhaoquan Gu, Zhihong Tian, Xiaojiang Du, Mohsen Guizani

Research output: Contribution to journal › Article › peer-review

24 Scopus citations

Abstract

Network attack behavior is always mixed with a large volume of normal communication, so attack characteristics account for only a very small fraction of the log data. From the perspective of simulation and modeling, the data for attack detection are extremely unbalanced if we regard attack behavior as the positive label. Network intrusion detection is an important topic in identifying attack behavior, but detection methods based on simulation and modeling, such as traditional machine learning, face the challenges of poor effectiveness and efficiency. Supervised models, such as LightGBM, can effectively classify abnormal data because of their fast training speed and high efficiency. However, they perform poorly on sparse negative data, such as network intrusion data. On the other hand, unsupervised models, such as K-means, can achieve good performance, but at an undesirable training time cost. Moreover, it is difficult to select an appropriate parameter for network intrusion detection. In this paper, we propose a two-stage pipeline model named STG2P, which leverages an improved LightGBM and a reinforced K-means. Specifically, STG2P introduces a threshold for LightGBM in the coarse classification stage, and pipelines the draft results to K-means, which filters out the false positive samples in the fine classification stage. By adaptively adopting the pipelined data of the improved LightGBM and K-means, the method can avoid the shortcomings of both models. We also conduct extensive simulations on the LANL dataset, and the results show that the AUC value can be improved by as much as 29.48%. The detection rate of our method reaches 96.64%, which shows superior performance compared with some traditional detection methods.
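The coarse-then-fine flow described in the abstract can be illustrated with a minimal, standard-library-only sketch. It is not the authors' implementation. The scores are hand-written stand-ins for the output of a gradient-boosted classifier such as LightGBM, the threshold value is arbitrary, and the filtering rule (discard the cluster whose centroid lies closest to the mean of the unflagged samples) is an assumption made purely for illustration. All function names (`coarse_stage`, `kmeans`, `fine_stage`) are hypothetical.

```python
import math
import random


def coarse_stage(scores, threshold):
    """Stage 1: flag every sample whose classifier score reaches the threshold."""
    return [i for i, s in enumerate(scores) if s >= threshold]


def assign(points, centroids):
    """Label each point with the index of its nearest centroid (Euclidean)."""
    return [min(range(len(centroids)), key=lambda c: math.dist(p, centroids[c]))
            for p in points]


def kmeans(points, k, iters=50, seed=0):
    """Plain Lloyd's k-means on a list of equal-length tuples."""
    centroids = random.Random(seed).sample(points, k)
    for _ in range(iters):
        labels = assign(points, centroids)
        updated = []
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            # Keep the old centroid if a cluster happens to be empty.
            updated.append(tuple(sum(col) / len(members) for col in zip(*members))
                           if members else centroids[c])
        if updated == centroids:
            break
        centroids = updated
    return assign(points, centroids), centroids


def fine_stage(features, flagged, k=2):
    """Stage 2 (assumed rule): drop the flagged cluster nearest to unflagged traffic."""
    flagged_set = set(flagged)
    normal = [f for i, f in enumerate(features) if i not in flagged_set]
    normal_mean = tuple(sum(col) / len(normal) for col in zip(*normal))
    labels, centroids = kmeans([features[i] for i in flagged], k)
    benign = min(range(k), key=lambda c: math.dist(centroids[c], normal_mean))
    return [i for i, lab in zip(flagged, labels) if lab != benign]


# Toy data: six benign samples near the origin, two attacks near (5, 5).
features = [(0.0, 0.1), (0.2, 0.0), (0.1, 0.2), (0.3, 0.1),
            (5.0, 5.1), (5.2, 4.9), (0.2, 0.3), (0.1, 0.0)]
# Stand-in classifier scores; samples 2 and 3 are benign but score moderately high.
scores = [0.05, 0.02, 0.40, 0.35, 0.90, 0.80, 0.10, 0.03]

flagged = coarse_stage(scores, threshold=0.3)  # deliberately low threshold
detected = fine_stage(features, flagged)
print("flagged by stage 1:", flagged)
print("kept after stage 2:", detected)
```

In this toy setup the low threshold keeps recall high at the cost of two false positives, and the clustering step then discards the flagged samples that sit next to ordinary traffic. In practice the scores would come from a trained model, and the threshold and the number of clusters would need tuning on real data.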

Original language: English
Article number: 102614
Journal: Simulation Modelling Practice and Theory
Volume: 120
State: Published - Nov 2022

Keywords

  • Improved LightGBM
  • Intrusion detection
  • Pipeline model
  • Reinforced K-means
