Journal of Xidian University, 2021, Vol. 48, Issue (3): 170-187. doi: 10.19665/j.issn1001-2400.2021.03.022

• Cyberspace Security •

Malicious traffic identification technology in encrypted traffic

ZENG Yong1, WU Zhengyuan1, DONG Lihua2, LIU Zhihong1, MA Jianfeng1, LI Zan2

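  1. School of Cyber Engineering, Xidian University, Xi’an 710071, Shaanxi, China
  2. State Key Laboratory of Integrated Service Networks, Xidian University, Xi’an 710071, Shaanxi, China
  • Received: 2020-12-18 Online: 2021-06-20 Published: 2021-07-05
  • About the authors: ZENG Yong (b. 1978), male, associate professor, Ph.D., E-mail: yzeng@mail.xidian.edu.cn; WU Zhengyuan (b. 1997), male, master's student at Xidian University, E-mail: 18066746790@163.com; DONG Lihua (b. 1977), female, associate professor, Ph.D., E-mail: lih_dong@mail.xidian.edu.cn; LIU Zhihong (b. 1968), male, associate professor, Ph.D., E-mail: liuzhihong@mail.xidian.edu.cn; MA Jianfeng (b. 1963), male, professor, Ph.D., E-mail: jfma@mail.xidian.edu.cn; LI Zan (b. 1975), female, professor, Ph.D., E-mail: zanli@xidian.edu.cn
  • Supported by:
    National Natural Science Foundation of China (61941105)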

Research on malicious traffic identification technology in encrypted traffic

ZENG Yong1, WU Zhengyuan1, DONG Lihua2, LIU Zhihong1, MA Jianfeng1, LI Zan2

  1. School of Cyber Engineering, Xidian University, Xi’an 710071, China
  2. State Key Laboratory of Integrated Service Networks, Xidian University, Xi’an 710071, China
  • Received: 2020-12-18 Online: 2021-06-20 Published: 2021-07-05

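Abstract (translated from the Chinese original):

Encrypted transmission of network traffic is one of the development trends of the Internet, and identifying malicious traffic within encrypted traffic is an important means of maintaining cyberspace security. To improve identification efficiency, traffic is first differentiated at fine granularity: encrypted versus unencrypted, by application, and by encryption algorithm. The traffic separated at these granularities is then preprocessed and converted into forms such as images, matrices, and N-grams, which are fed into machine learning models for training to achieve binary (benign/malicious) and multi-class classification. The effectiveness of machine-learning-based identification depends heavily on the quantity and quality of samples, and it cannot cope effectively with shaped or obfuscated traffic. Cryptography-based malicious traffic identification deeply integrates searchable encryption, traffic inspection mechanisms, and provable security models to search for malicious keywords over encrypted traffic, which avoids the problems of insufficient samples and traffic shaping while protecting the privacy of both data and rules. This paper summarizes the above techniques for identifying malicious traffic in encrypted traffic, points out existing problems, and looks ahead to future directions.

Keywords: encrypted traffic, malicious traffic, machine learning, cryptography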
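
The preprocessing step named in the abstract (converting traffic into images, matrices, and N-grams before training) can be illustrated with a minimal sketch. The function names, the 28-byte side length, and the choice of zero-padding are assumptions made for illustration only; the surveyed methods may use other sizes and encodings.

```python
from collections import Counter


def bytes_to_image(payload: bytes, side: int = 28) -> list[list[int]]:
    """Reshape the first side*side bytes of a flow into a square matrix.

    Shorter payloads are zero-padded; longer ones are truncated.
    Each cell is a byte value in 0..255, i.e. one grayscale pixel.
    The default side length is an assumption, not taken from the paper.
    """
    size = side * side
    padded = payload[:size].ljust(size, b"\x00")
    return [list(padded[r * side:(r + 1) * side]) for r in range(side)]


def byte_ngrams(payload: bytes, n: int = 2) -> Counter:
    """Count overlapping byte n-grams in a payload."""
    return Counter(payload[i:i + n] for i in range(len(payload) - n + 1))
```

Either representation can then be passed to a classifier of the reader's choice; the sketch covers only the conversion, not the training.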

Abstract:

The encrypted transmission of network traffic is one of the development trends of the Internet. Identifying malicious traffic within encrypted traffic is an important way to maintain the security of cyberspace. A prerequisite for identifying malicious traffic is to classify traffic at fine granularity, as encrypted or unencrypted, by application, and by encryption algorithm, in order to improve identification efficiency. The classified traffic is then transformed into images, matrices, n-grams, or other forms and fed into machine learning models for training, so as to realize binary and multi-class classification of benign and malicious traffic. However, the machine-learning-based approach relies heavily on the number and quality of samples and cannot effectively deal with traffic that has been shaped or obfuscated. Cryptography-based malicious traffic identification avoids these problems by searching for malicious keywords over encrypted traffic; to protect both data and rules, it must integrate searchable encryption, deep packet inspection, and a provable security model. Finally, some unsolved problems of malicious traffic identification in encrypted traffic are presented.

Key words: encrypted traffic, malicious traffic, machine learning, cryptography
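
The idea of searching for malicious keywords over encrypted traffic can be sketched as a toy token-matching scheme. This is a simplified illustration and not one of the surveyed protocols: it assumes fixed-length keywords, uses deterministic HMAC-SHA256 tokens, and assumes the inspecting party has already obtained the encrypted rules (real schemes derive them without revealing the key or the rules, and come with security proofs that this sketch lacks). All names and the window length are hypothetical.

```python
import hashlib
import hmac

WINDOW = 8  # hypothetical fixed keyword/token length in bytes


def tokenize(payload: bytes, window: int = WINDOW) -> list[bytes]:
    """Split a payload into overlapping fixed-length windows."""
    return [payload[i:i + window] for i in range(len(payload) - window + 1)]


def encrypt_token(key: bytes, token: bytes) -> bytes:
    """Deterministic keyed token: equal plaintext windows give equal tags."""
    return hmac.new(key, token, hashlib.sha256).digest()


def sender_tokens(key: bytes, payload: bytes) -> list[bytes]:
    """Tokens the sender attaches alongside the normally encrypted stream."""
    return [encrypt_token(key, t) for t in tokenize(payload)]


def middlebox_match(encrypted_rules: set[bytes], tokens: list[bytes]) -> bool:
    """The inspector sees only tags, never plaintext or the session key."""
    return any(t in encrypted_rules for t in tokens)


# Toy usage: one 8-byte rule keyword, checked against two payloads.
session_key = b"k" * 32
rules = {encrypt_token(session_key, b"evil.exe")}
flagged = middlebox_match(rules, sender_tokens(session_key, b"GET /evil.exe HTTP/1.1"))
clean = middlebox_match(rules, sender_tokens(session_key, b"GET /index.html HTTP/1.1"))
```

Because matching works on keywords rather than learned features, it needs no training samples, which is the advantage the abstract attributes to the cryptographic approach.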

CLC Number:

  • TP393