Electronic Science and Technology ›› 2023, Vol. 36 ›› Issue (5): 28-33. doi: 10.16180/j.cnki.issn1007-7820.2023.05.005


CNAS Recognition Criteria Automatic Benchmarking System Based on Natural Language Processing

LIU Yuwei1, CAO Min1, FENG Haojia2

  1. School of Optical-Electrical and Computer Engineering, University of Shanghai for Science and Technology, Shanghai 200093, China
  2. School of Computer and Information Technology, Shanxi University, Taiyuan 030006, China
  • Received: 2021-11-14 Online: 2023-05-15 Published: 2023-05-17
  • About the authors: LIU Yuwei (born 1997), male, master's student. Research interests: information acquisition and big data analysis. | CAO Min (born 1970), male, PhD, senior engineer. Research interests: innovative technologies such as measurement and control instruments and certification and accreditation. | FENG Haojia (born 1996), male, master's student. Research interests: natural language processing and text sentiment classification.
  • Supported by:
    2018 Scientific Research Project of the China National Accreditation Service for Conformity Assessment (CNAS-2018-01)

Abstract:

In the CNAS assessment process, manually benchmarking non-conforming items against the clauses on which they are based is time-consuming, labor-intensive, and prone to inaccuracy. To address these problems, this study proposes a multi-level automatic benchmarking method based on natural language processing. After studying the linguistic characteristics of non-conforming item descriptions, a Bi-LSTM network with an attention mechanism is used to classify non-conforming items. Within the resulting category, a SimCSE network model based on corpus expansion and transfer learning finds similar non-conforming items and extracts their corresponding basis clauses, which addresses the problem of inaccurate benchmarking. Simulation experiments show that the benchmarking accuracy of the proposed method reaches 74.4%, that the semantic matching computation time improves substantially over the DSSM model, and that memory consumption and peak matching speed also improve markedly.

Key words: deep learning, natural language processing, semantic matching, automatic benchmarking, multi-label classification, multi-model fusion, attention mechanism, semantic computing
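
The two-stage flow described in the abstract (classify first, then search for similar items only within the predicted category) can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the category label and the sentence embeddings are taken as given inputs standing in for the attention-based Bi-LSTM classifier and the SimCSE encoder, cosine similarity is assumed as the matching score, and the names `Record`, `cosine`, `match_clauses`, the category labels, and the clause identifiers are all hypothetical.

```python
import math
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Record:
    """A previously benchmarked non-conforming item (hypothetical schema)."""
    text: str
    category: str            # label a classifier would assign (assumed given)
    clause: str              # basis clause identifier (placeholder values)
    embedding: List[float]   # sentence vector from an encoder (assumed given)


def cosine(u: List[float], v: List[float]) -> float:
    """Cosine similarity between two vectors; 0.0 if either has zero norm."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def match_clauses(query_embedding: List[float],
                  predicted_category: str,
                  records: List[Record],
                  top_k: int = 1) -> List[Tuple[str, float]]:
    """Stage 2: rank stored items of the predicted category by similarity
    to the query and return the clauses of the top_k most similar ones."""
    # Restrict the search space to the category chosen in stage 1.
    candidates = [r for r in records if r.category == predicted_category]
    scored = [(r.clause, cosine(query_embedding, r.embedding)) for r in candidates]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:top_k]


# Toy data: two-dimensional vectors stand in for real sentence embeddings.
records = [
    Record("item a", "equipment", "clause-A", [1.0, 0.0]),
    Record("item b", "equipment", "clause-B", [0.0, 1.0]),
    Record("item c", "personnel", "clause-C", [1.0, 0.1]),
]

best = match_clauses([0.9, 0.1], "equipment", records, top_k=1)
print(best[0][0])  # prints clause-A
```

In this toy data, "item c" is the closest vector overall, but it is excluded because it belongs to a different category. That is the trade-off of a multi-level design: a smaller search space per query, at the cost of depending on the first-stage classification being correct.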

CLC Number:

  • TP391