Electronic Science and Technology (电子科技), 2022, Vol. 35, Issue (2): 20-26. doi: 10.16180/j.cnki.issn1007-7820.2022.02.004


Design of FPGA-Based SqueezeNet Inference Accelerator

CHU Ping, NI Wei

  1. School of Electronic Science and Applied Physics, Hefei University of Technology, Hefei 230009, China
  • Received: 2020-10-13 Online: 2022-02-15 Published: 2022-02-24
  • About the authors: CHU Ping (1995-), female, master's student. Research interest: neural network hardware acceleration. | NI Wei (1977-), male, associate professor. Research interests: large-scale digital integrated circuit design, reconfigurable computing systems, deep learning.
  • Supported by:
    Anhui Colleges Collaborative Innovation Project (PA2019AGXC0127)

Abstract:

SqueezeNet, a lightweight deep neural network, produces a large volume of intermediate data and consumes many calculation cycles. To address these problems, this study partitions the entire network into processing blocks to speed up computation. Each processing block consists of an Expand layer and a Squeeze layer. Because each processing block ends with a Squeeze layer, less intermediate data flows between the computing module and memory, which lowers read and write overhead. Exploiting the characteristics of the activation function, the core computing module introduces an early-termination technique for convolution calculation, supported by three dedicated units: an effective index survival unit, an effective index control value unit, and a convolution judgment unit. Together these units skip the computation and the calculation cycles that invalid values would otherwise occupy during convolution. Experimental results show that the accelerator reduces data flow by 55.38% and reduces the computation and calculation cycles occupied by invalid values by 14.68%.

Key words: lightweight deep neural network, SqueezeNet, processing block, activation function, early termination of convolution calculation, effective index, invalid value, calculation cycle
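
The idea of skipping invalid values can be illustrated in software with a minimal sketch. It rests on assumptions that the abstract does not confirm: that the activation function is ReLU, that "invalid values" are the zeros ReLU produces, and that an "effective index" is the position of a nonzero activation. The function names (`valid_indices`, `dot_skip_zeros`, `dot_dense`) are hypothetical and do not describe the paper's hardware units.

```python
def relu(x):
    """ReLU activation: negative inputs become zero."""
    return x if x > 0 else 0


def valid_indices(activations):
    """Positions of nonzero activations (assumed meaning of 'effective index')."""
    return [i for i, a in enumerate(activations) if a != 0]


def dot_skip_zeros(activations, weights):
    """Multiply-accumulate over nonzero activations only.

    Returns (result, number of multiply-accumulates performed).
    """
    idx = valid_indices(activations)
    acc = 0
    for i in idx:
        acc += activations[i] * weights[i]
    return acc, len(idx)


def dot_dense(activations, weights):
    """Reference multiply-accumulate that visits every position."""
    acc = 0
    for a, w in zip(activations, weights):
        acc += a * w
    return acc, len(activations)


# Illustrative data: a flattened 3x3 input window after ReLU, and a 3x3 kernel.
pre_activation = [3, -1, 0, 2, -4, 5, -2, 0, 1]
window = [relu(v) for v in pre_activation]
kernel = [1, 2, -1, 0, 3, 1, 2, -2, 4]

dense_result, dense_macs = dot_dense(window, kernel)
sparse_result, sparse_macs = dot_skip_zeros(window, kernel)
print(dense_result, dense_macs, sparse_result, sparse_macs)
```

In this toy window both routines give the same sum, but the skipping version performs 4 multiply-accumulates instead of 9. The numbers are made up for illustration and are unrelated to the 14.68% figure reported above.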

CLC Number:

  • TP183