Journal of Xidian University ›› 2023, Vol. 50 ›› Issue (4): 206-214. doi: 10.19665/j.issn1001-2400.2023.04.020

• Special Column on Cyberspace Security •

A dark web user alignment method using attention-augmented convolution

YANG Yanyan1, DU Yanhui1, LIU Hongmeng2, ZHAO Jiapeng2, SHI Jinqiao2, WANG Xuebin3

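  1. Department of Information Technology and Cyber Security, People’s Public Security University of China, Beijing 100038
  2. School of Cyber Space Security, Beijing University of Posts and Telecommunications, Beijing 100876
  3. Institute of Information Engineering, Chinese Academy of Sciences, Beijing 100080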
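  • Received: 2023-01-21 Online: 2023-08-20 Published: 2023-10-17
  • About the authors: YANG Yanyan (b. 1986), female, master's student at People’s Public Security University of China, E-mail: 53996587@qq.com; DU Yanhui (b. 1969), male, professor, E-mail: duyanhui@ppsuc.edu.cn; LIU Hongmeng (b. 1999), male, master's student at Beijing University of Posts and Telecommunications, E-mail: cs2lhm@bupt.edu.cn; SHI Jinqiao (b. 1978), male, professor, E-mail: shijinqiao@bupt.edu.cn; WANG Xuebin (b. 1987), male, senior engineer, E-mail: wangxuebin@iie.ac.cn
  • Funding:
    National Key Research and Development Program of China (2021YFB3100600)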

Dark web author alignment based on attention augmented convolutional networks

YANG Yanyan1, DU Yanhui1, LIU Hongmeng2, ZHAO Jiapeng2, SHI Jinqiao2, WANG Xuebin3

  1. Department of Information Technology and Cyber Security, People’s Public Security University of China, Beijing 100038, China
  2. School of Cyber Space Security, Beijing University of Posts and Telecommunications, Beijing 100876, China
  3. Institute of Information Engineering, Chinese Academy of Sciences, Beijing 100080, China
  • Received: 2023-01-21 Online: 2023-08-20 Published: 2023-10-17

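Abstract (translated from the Chinese):

Dark web users conduct a large volume of illegal and criminal activity in underground markets. The anonymity of the dark web makes communication among these users very convenient, but it also creates serious difficulties for law enforcement. In recent years, deep neural networks have achieved broad success in many fields, and a growing number of researchers use them to identify the authors of anonymous online text. To better align dark web users, that is, to find more distinct accounts that share a single identity, the authors apply neural network methods to dark web user identification and alignment. Existing methods, however, mainly target short text and handle global and long-sequence information poorly. This paper proposes a method that augments the convolution operator with a self-attention mechanism and uses long-sequence information to model the text posted by dark web users. Starting from the text content, the method links multiple accounts of anonymous dark web users, aggregating information across several anonymous accounts and providing more clues to a user's real identity. The authors carry out a comprehensive evaluation on two different dark web market forums and compare the proposed method with current state-of-the-art techniques. The results show that the method is very effective: on the two public datasets, mean reciprocal rank (MRR) improves by about 2.9% and 3.6%, and Recall@10 improves by about 2.3% and 3.0%, respectively. This evaluation provides strong evidence of the method's effectiveness on dark web market forums.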
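
The abstract does not spell out the architecture, so the following is only a minimal sketch of the general idea of augmenting a convolution with self-attention. It assumes a single attention head, a 1D convolution over a sequence of token vectors with zero padding and an odd kernel width, and fusion by concatenating the two outputs along the feature axis. All function and parameter names (`attention_augmented_conv`, `w_q`, `w_k`, `w_v`, `kernel`) are hypothetical and are not taken from the paper.

```python
import math


def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def softmax(row):
    """Numerically stable softmax over one list of scores."""
    peak = max(row)
    exps = [math.exp(value - peak) for value in row]
    total = sum(exps)
    return [e / total for e in exps]


def self_attention(x, w_q, w_k, w_v):
    """Single-head scaled dot-product attention: every position sees the whole sequence."""
    q, k, v = matmul(x, w_q), matmul(x, w_k), matmul(x, w_v)
    scale = math.sqrt(len(q[0]))
    scores = [[sum(a * b for a, b in zip(q_row, k_row)) / scale for k_row in k]
              for q_row in q]
    weights = [softmax(row) for row in scores]
    return matmul(weights, v)


def conv1d_same(x, kernel):
    """1D convolution with zero padding; kernel[j][i][o] = tap j, input i, output o.

    Assumes an odd kernel width so the window is centred on each position.
    """
    width, d_in, d_out = len(kernel), len(x[0]), len(kernel[0][0])
    pad = width // 2
    zero = [0.0] * d_in
    padded = [zero] * pad + x + [zero] * pad
    return [[sum(padded[t + j][i] * kernel[j][i][o]
                 for j in range(width) for i in range(d_in))
             for o in range(d_out)]
            for t in range(len(x))]


def attention_augmented_conv(x, kernel, w_q, w_k, w_v):
    """Concatenate local (convolution) and global (attention) features per position."""
    local = conv1d_same(x, kernel)
    global_ = self_attention(x, w_q, w_k, w_v)
    return [l_row + g_row for l_row, g_row in zip(local, global_)]
```

The convolution branch only sees a fixed window around each token, while the attention branch mixes information from every position, which is one way a model could capture the long-sequence context the abstract refers to.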

Keywords: text embedding, attention mechanism, convolution operator, long sequence information

Abstract:

Dark web users engage in a large number of illegal and criminal activities in underground markets. The anonymity of the dark web makes communication between its users very convenient but creates great difficulties for law enforcement. In recent years, deep neural networks have been widely successful in various fields, and more and more researchers have begun to use them to identify the authors of anonymous online text. To better align users on the dark web and find more distinct accounts that share the same identity, we use neural network methods to identify and align dark web users. However, existing methods focus mainly on short text and are not good at handling global and long-sequence information. In this paper, we propose DACN, a method that augments the convolution operator with a self-attention mechanism and uses long-sequence information to strengthen the user representation. Starting from the text content, DACN links multiple accounts of anonymous dark web users to aggregate information from those accounts, providing more clues to the users' true identities. We conduct a thorough evaluation on two distinct dark web market forums and compare our method with current state-of-the-art techniques. Experimental results show that our approach is effective, improving mean reciprocal rank (MRR) by about 2.9% and 3.6% and Recall@10 by about 2.3% and 3.0% on the two forums, respectively. This evaluation offers strong evidence of the efficacy of our approach on dark web market forums.
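
For readers unfamiliar with the two reported metrics, the sketch below shows how they are commonly computed in a retrieval setting. It assumes that MRR here denotes the standard mean reciprocal rank, that each query account has exactly one correct matching account, and that the model returns a ranked list of candidate account IDs per query. The function names and the toy IDs are illustrative only and do not come from the paper.

```python
from typing import Sequence


def reciprocal_rank(ranked_ids: Sequence[str], target_id: str) -> float:
    """Return 1/rank of the target (1-based), or 0.0 if it is not in the list."""
    for position, candidate in enumerate(ranked_ids, start=1):
        if candidate == target_id:
            return 1.0 / position
    return 0.0


def mean_reciprocal_rank(rankings: Sequence[Sequence[str]],
                         targets: Sequence[str]) -> float:
    """Average the reciprocal rank of the correct account over all queries."""
    return sum(reciprocal_rank(r, t) for r, t in zip(rankings, targets)) / len(targets)


def recall_at_k(rankings: Sequence[Sequence[str]],
                targets: Sequence[str], k: int = 10) -> float:
    """Fraction of queries whose correct account appears in the top k results."""
    hits = sum(1 for r, t in zip(rankings, targets) if t in r[:k])
    return hits / len(targets)


if __name__ == "__main__":
    # Toy example: three query accounts, each with a ranked candidate list.
    rankings = [["u3", "u1", "u7"], ["u2", "u9", "u4"], ["u5", "u6", "u8"]]
    targets = ["u1", "u2", "u0"]  # "u0" is never retrieved
    print(mean_reciprocal_rank(rankings, targets))  # (1/2 + 1 + 0) / 3 = 0.5
    print(recall_at_k(rankings, targets, k=2))      # 2 of the 3 queries hit
```

MRR rewards placing the correct account near the top of the list, whereas Recall@10 only asks whether it appears anywhere in the first ten results, so the two numbers can move differently.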

Key words: text embedding, attention mechanism, convolutional networks, long sequence information

CLC number:

  • TP18