Electronic Science and Technology ›› 2021, Vol. 34 ›› Issue (5): 35-41. doi: 10.16180/j.cnki.issn1007-7820.2021.05.007

Design of Text Title Generation Prototype System Based on Neural Network

ZHANG Shisen, SUN Xiankun, YIN Ling, LI Shixi

  1. College of Electronic and Electrical Engineering, Shanghai University of Engineering Science, Shanghai 201620, China
  • Received: 2020-01-22 Online: 2021-05-15 Published: 2021-05-24
  • About the authors: ZHANG Shisen (1993-), male, master's student. Research interest: machine learning. | SUN Xiankun (1972-), male, PhD, professor. Research interest: computer applications. | YIN Ling (1986-), female, PhD, lecturer. Research interests: intelligent information processing, software engineering. | LI Shixi (1995-), male, master's student. Research interest: big data applications.
  • Supported by:
    Young Scientists Fund of the National Natural Science Foundation of China (61802251)

Abstract:

Summarizing texts and writing titles by hand is costly in labor and time, and it cannot keep up with the large volume of non-standard text on the Internet. To address this problem, this study designs a prototype system for text title generation based on neural networks. The system models the input text with a neural encoder-decoder model in order to generate, economically and efficiently, an accurate and concise title that fits the original text. The encoder is a bidirectional long short-term memory (LSTM) network, which makes full use of the semantic connections between contexts. The decoder is a unidirectional neural network with an attention mechanism, which alleviates information loss and improves the quality of the generated titles. Experiments on the LCSTS dataset yield ROUGE-1 and ROUGE-L scores of 29.91 and 24.68, respectively, demonstrating the effectiveness of the prototype system.

Key words: artificial intelligence, natural language processing, neural network, title generation, prototype system, word vector, attention mechanism, generative technology
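
The attention step mentioned in the abstract can be illustrated with a small sketch. The abstract does not state which scoring function the decoder uses, so the example below assumes plain dot-product scoring; the function names `softmax` and `attend` and the toy vectors are hypothetical and are not taken from the paper. In a bidirectional encoder, each encoder state would typically be the concatenation of the forward and backward hidden states for one input position.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attend(decoder_state, encoder_states):
    """Return (weights, context) for one decoding step.

    Assumption: dot-product scoring. The scoring function actually
    used in the paper is not given in the abstract.
    """
    # Score every encoder position against the current decoder state.
    scores = [
        sum(d * h for d, h in zip(decoder_state, state))
        for state in encoder_states
    ]
    weights = softmax(scores)
    # The context vector is the attention-weighted sum of encoder states.
    dim = len(encoder_states[0])
    context = [
        sum(w * state[i] for w, state in zip(weights, encoder_states))
        for i in range(dim)
    ]
    return weights, context


# Toy example: three encoder positions, each with a 2-dimensional state.
encoder_states = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
decoder_state = [2.0, 0.0]
weights, context = attend(decoder_state, encoder_states)
```

Because the decoder re-weights all encoder positions at every step, it is not limited to a single fixed-length summary of the input, which is the sense in which attention is said to alleviate information loss.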

CLC Number:

  • TP391
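
For readers unfamiliar with the ROUGE-1 and ROUGE-L metrics reported in the abstract, the following is a minimal sketch of how they can be computed. It assumes character-level tokens and the F1 variant of each score; the abstract does not say which variant (recall, precision, or F1) or which tokenization produced 29.91 and 24.68, so this is an illustration of the metrics, not a reproduction of the paper's evaluation. All function names are hypothetical.

```python
from collections import Counter


def _f1(matches, cand_len, ref_len):
    """Combine match counts into an F1 score (0.0 when nothing matches)."""
    if matches == 0:
        return 0.0
    precision = matches / cand_len
    recall = matches / ref_len
    return 2 * precision * recall / (precision + recall)


def rouge_1(candidate, reference):
    """Unigram-overlap F1 between two token sequences."""
    # Counter intersection keeps the minimum count of each shared token.
    overlap = sum((Counter(candidate) & Counter(reference)).values())
    return _f1(overlap, len(candidate), len(reference))


def lcs_length(a, b):
    """Length of the longest common subsequence (dynamic programming)."""
    prev = [0] * (len(b) + 1)
    for x in a:
        curr = [0]
        for j, y in enumerate(b, start=1):
            if x == y:
                curr.append(prev[j - 1] + 1)
            else:
                curr.append(max(prev[j], curr[j - 1]))
        prev = curr
    return prev[-1]


def rouge_l(candidate, reference):
    """LCS-based F1 between two token sequences."""
    lcs = lcs_length(candidate, reference)
    return _f1(lcs, len(candidate), len(reference))


# Toy example with single characters as tokens.
candidate = list("abcde")
reference = list("abdce")
r1 = rouge_1(candidate, reference)
rl = rouge_l(candidate, reference)
```

In the toy example, both strings contain the same characters, so the unigram score is perfect, while the swapped order of two characters lowers the LCS-based score. Scores are commonly reported multiplied by 100.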