Journal of Xidian University ›› 2021, Vol. 48 ›› Issue (3): 155-162. doi: 10.19665/j.issn1001-2400.2021.03.020

• Cyberspace Security •

Optimization and implementation of the SM4 on FPGA

HE Shiyang1,2, LI Hui1,2, LI Fenghua1,2,3

  1. Engineering Research Center of Big Data Security, Ministry of Education, Xidian University, Xi’an 710071, China
  2. School of Cyber Engineering, Xidian University, Xi’an 710126, China
  3. Institute of Information Engineering, Chinese Academy of Sciences, Beijing 100093, China
  • Received: 2021-02-02 Online: 2021-06-20 Published: 2021-07-05
  • Contact: LI Hui
  • About the authors: HE Shiyang (born 1991), male, Ph.D. candidate at Xidian University, E-mail: syhe@xidian.edu.cn | LI Fenghua (born 1966), male, professor, E-mail: lfh@iie.ac.cn
  • Supported by:
    National Key R&D Program of China (2017YFB0802700); Key R&D Program of Shaanxi Province (2019ZDLGY12-09); Shaanxi Innovation Team for Mobile Internet Security (2018TD-007)

Abstract:

Data encryption is one of the important means of ensuring information security. The SM4 algorithm is widely used for data encryption because of its strong security, high efficiency, and ease of hardware implementation, and exploiting hardware features to implement SM4 efficiently and at high speed has become a focus of current research. Four hardware architectures are proposed for the SM4 algorithm and implemented on a Xilinx Kintex-7 FPGA. The loop-based (circular) architecture is optimized for resource saving; it consumes 193 slices and achieves a throughput of 1.27 Gb/s. The pipelined architecture is implemented in three variants, based on the LUT, BRAM, and BRAM+REGISTER methods. Depending on the application scenario, the three variants trade off the consumption of lookup tables, registers, and block memory, with the throughput reaching up to 42.10 Gb/s.

Key words: SM4 algorithm, field programmable gate array, architecture optimization, hardware implementation
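
The throughput figures quoted in the abstract can be related to clock frequency with the usual block-cipher relation, throughput = block size × clock frequency / cycles per block. The sketch below is only a back-of-the-envelope illustration, not the authors' method. It assumes that the loop-based design spends one clock cycle per SM4 round (32 cycles per 128-bit block), that a full pipeline accepts one block per cycle, and that 1 Gb/s means 10^9 bits per second. The abstract states none of these, and the function names are hypothetical.

```python
BLOCK_BITS = 128  # SM4 block size in bits
ROUNDS = 32       # SM4 performs 32 rounds per block


def throughput_gbps(f_clk_mhz: float, cycles_per_block: int) -> float:
    """Throughput in Gb/s for a given clock (MHz) and cycles needed per block."""
    return BLOCK_BITS * f_clk_mhz / cycles_per_block / 1000.0


def implied_clock_mhz(gbps: float, cycles_per_block: int) -> float:
    """Clock frequency (MHz) that a reported throughput would imply."""
    return gbps * 1000.0 * cycles_per_block / BLOCK_BITS


# ASSUMPTION: loop-based design, one round per cycle -> 32 cycles per block.
loop_clock = implied_clock_mhz(1.27, ROUNDS)   # 317.5 MHz

# ASSUMPTION: fully pipelined design, one block accepted per cycle.
pipe_clock = implied_clock_mhz(42.10, 1)       # about 328.9 MHz

print(f"loop-based: {loop_clock:.1f} MHz, pipelined: {pipe_clock:.1f} MHz")
```

Under these assumptions both designs would run at a little over 300 MHz, so the roughly 33-fold throughput gap would come almost entirely from processing one block per cycle rather than one block per 32 cycles.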

CLC Number:

  • TP309.7