Journal of Xidian University ›› 2021, Vol. 48 ›› Issue (1): 176-182. doi: 10.19665/j.issn1001-2400.2021.01.020


Fast network intrusion detection system using adaptive binning feature selection

LIU Jingmei, GAO Yuanbo

  1. State Key Laboratory of Integrated Services Networks, Xidian University, Xi'an 710071, China
  • Received: 2020-07-24 Publication date: 2021-02-20 Release date: 2021-02-03
  • Corresponding author: GAO Yuanbo
  • About the author: LIU Jingmei (born 1979), female, associate professor, E-mail: jmliu@mail.xidian.edu.cn
  • Funding:
    Equipment Pre-Research Project (30604020102)

Abstract:

To address the low detection rate of traditional intrusion detection systems and the long training and detection times of deep-learning-based intrusion detection systems, an adaptive binning feature selection algorithm based on information gain is proposed and combined with LightGBM to build a fast network intrusion detection system. First, the original data set is preprocessed and standardized. Next, the adaptive binning feature selection algorithm removes redundant features and noise, reducing the original high-dimensional data to low-dimensional data, which improves detection accuracy and shortens training and detection time. Finally, LightGBM is trained on the feature-selected training set to produce an intrusion detection system capable of detecting attack traffic. In experiments on the NSL-KDD data set, the proposed algorithm takes only 27.35 s for feature selection, about 96.68% less than the traditional algorithm, and the resulting intrusion detection system reaches an accuracy of 93.32% on the test set with a short training time. Compared with existing network intrusion detection systems, the proposed system is more accurate and trains faster.

Key words: intrusion detection, feature selection, LightGBM algorithm, information gain, ensemble learning

CLC Number:

  • TP393
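
The feature selection step described in the abstract can be illustrated with a minimal Python sketch. The abstract does not say how the adaptive bins are chosen, so the rule used here (equal-width bins, capped at `max_bins` and at the number of distinct values) is an assumption, as are all function names and the toy data. It shows information-gain ranking over binned features in general and should not be read as the authors' algorithm.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a sequence of class labels."""
    total = len(labels)
    counts = Counter(labels)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


def bin_feature(values, max_bins=10):
    """Map numeric values to equal-width bin indices.

    Hypothetical adaptive rule (not taken from the paper): use at most
    `max_bins` bins, but never more bins than there are distinct values.
    """
    lo, hi = min(values), max(values)
    n_bins = min(max_bins, len(set(values)))
    if n_bins <= 1 or hi == lo:
        return [0] * len(values)
    width = (hi - lo) / n_bins
    # Clamp the index so the maximum value falls into the last bin.
    return [min(int((v - lo) / width), n_bins - 1) for v in values]


def information_gain(values, labels, max_bins=10):
    """H(labels) minus the weighted entropy of the labels within each bin."""
    bins = bin_feature(values, max_bins)
    groups = {}
    for b, y in zip(bins, labels):
        groups.setdefault(b, []).append(y)
    total = len(labels)
    conditional = sum(len(g) / total * entropy(g) for g in groups.values())
    return entropy(labels) - conditional


def select_features(columns, labels, k, max_bins=10):
    """Return the names of the k features with the highest information gain."""
    gains = {
        name: information_gain(col, labels, max_bins)
        for name, col in columns.items()
    }
    return sorted(gains, key=gains.get, reverse=True)[:k]


# Toy data (invented for illustration): "duration" separates the two
# classes, while "noise" carries almost no information about the label.
labels = ["normal", "normal", "normal", "attack", "attack", "attack"]
columns = {
    "duration": [0.1, 0.2, 0.3, 5.0, 6.0, 7.0],
    "noise": [1.0, 2.0, 1.0, 2.0, 1.0, 2.0],
}
selected = select_features(columns, labels, k=1)
```

On this toy data `selected` is `["duration"]`, because binning that column splits the two classes cleanly. In a full pipeline along the lines of the abstract, the retained columns would then be used to train a gradient-boosting classifier such as `lightgbm.LGBMClassifier`.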