Journal of Xidian University

• Research Article •

A topic circle discovery algorithm for ego networks incorporating information entropy

唐兴;权义宁;董泽;苗启广   

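  1. (School of Computer Science and Technology, Xidian University, Xi'an, Shaanxi 710071, China)
  • Received: 2016-06-17 Online: 2017-06-20 Published: 2017-07-17
  • About the author: TANG Xing (b. 1988), male, PhD candidate at Xidian University. E-mail: tangxing@stu.xidian.edu.cn
  • Funding: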

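    National Natural Science Foundation of China (61472302, 61272280, U1404620, 41271447); NSFC-Guangdong Joint Fund (Phase II); Open Project Fund of the National Laboratory of Pattern Recognition (201600031); Program for New Century Excellent Talents in University, Ministry of Education (NCET-12-0919); Fundamental Research Funds for the Central Universities (K5051203020, JB150313, JB150317, K5051303018, BDY081422); Natural Science Foundation of Shaanxi Province (2010JM8027); Xi'an Science and Technology Bureau (CXY1441(1)); Open Research Fund of the State Key Laboratory of Geo-information Engineering (SKLGIE2014-M-4-4)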

Novel algorithm for finding circles in the ego network based on entropy

TANG Xing;QUAN Yining;DONG Ze;MIAO Qiguang   

  1. (School of Computer Science and Technology, Xidian Univ., Xi'an 710071, China)
  • Received: 2016-06-17 Online: 2017-06-20 Published: 2017-07-17

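Abstract (translated from the Chinese):

Because ego networks are small yet information-rich, they have become an important object of study in social network analysis. Existing community detection algorithms, however, mainly target large-scale global networks, and prior studies have shown that community structure in global networks is not pronounced. This paper proposes a topic circle discovery algorithm for ego networks. It introduces information entropy to measure whether a circle of users in an ego network shares a common topic, defines a new objective function, and mines the topic circles in a user's ego network by optimizing that function heuristically. Topics are extracted from microblog text to obtain each user's topical interests, and information entropy is used to evaluate the distribution of user topics. Then, a harmonic factor is used to fuse the structural function with the entropy function, giving an objective function that combines information entropy with structural modularity. Finally, the proposed objective function is approximated and solved for an optimum, which yields the topic circles in the ego network. Experiments on a Sina Weibo dataset show that the new algorithm effectively mines topic circles whose text is highly coherent, and that the circles are structurally cohesive internally and loosely coupled to one another, which makes the approach of considerable practical value for analyzing and studying ego networks.

Keywords (translated from the Chinese): social network, circle discovery, text mining, ego network, information entropy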

Abstract:

Because of its small scale and rich information, the ego network has become an important research object in social network analysis. Existing community detection algorithms focus mainly on large-scale global networks; however, existing studies indicate that community structure in global networks is not as pronounced as expected. This paper proposes a novel circle detection algorithm for finding circle structure in the ego network. The algorithm defines a new objective function and detects circles by optimizing that function heuristically. First, topic distributions are extracted from user-generated text, and information entropy is introduced to evaluate the user topic distribution. Then, a harmonic factor is used to combine the structure function and the entropy function, which yields the objective function. Finally, optimizing the objective function gives the solution for circle detection. Extensive experiments on a Weibo dataset demonstrate that the proposed algorithm can effectively mine topic-related circles.

Key words: social network analysis, circle detection, text mining, ego network, entropy
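
The abstract describes scoring a candidate circle by combining a structure function with an entropy function through a harmonic factor, but it does not give the formulas. The following is only a minimal sketch of that idea under stated assumptions, not the authors' objective function: each user is reduced to a single dominant topic label, internal edge density stands in for the structure function, the two terms are mixed linearly, and the names `topic_entropy`, `internal_edge_density`, `circle_score`, and `alpha` are all hypothetical.

```python
import math
from collections import Counter


def topic_entropy(topics):
    """Shannon entropy (in bits) of the topic labels of the users in one circle."""
    n = len(topics)
    counts = Counter(topics)
    return -sum((c / n) * math.log2(c / n) for c in counts.values())


def internal_edge_density(members, edges):
    """Fraction of possible member pairs joined by an edge.

    Assumption: a simple stand-in for the paper's structure function.
    """
    members = set(members)
    k = len(members)
    if k < 2:
        return 0.0
    inside = sum(1 for u, v in edges if u in members and v in members)
    return inside / (k * (k - 1) / 2)


def circle_score(members, edges, user_topic, alpha=0.5):
    """Mix structural cohesion and topical purity with weight alpha.

    Assumption: a linear mix; the paper's actual fusion is not given in the abstract.
    """
    h = topic_entropy([user_topic[u] for u in members])
    n_topics = len(set(user_topic.values()))
    h_max = math.log2(n_topics) if n_topics > 1 else 1.0
    topical_purity = 1.0 - h / h_max  # 1.0 when every member shares one topic
    return alpha * internal_edge_density(members, edges) + (1 - alpha) * topical_purity


# Toy ego network: users 1-3 form a triangle and share a topic; user 4 does not.
edges = [(1, 2), (2, 3), (1, 3), (3, 4)]
user_topic = {1: "sports", 2: "sports", 3: "sports", 4: "music"}
tight = circle_score([1, 2, 3], edges, user_topic)
loose = circle_score([1, 2, 3, 4], edges, user_topic)
```

In this toy network the triangle of same-topic users scores higher than the circle that also admits user 4, because adding that user raises the topic entropy and lowers the edge density. A heuristic search over candidate circles would then prefer the first grouping.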