Electronic Science and Technology ›› 2023, Vol. 36 ›› Issue (8): 43-48. doi: 10.16180/j.cnki.issn1007-7820.2023.08.007

Structured Compression and Acceleration of Network Based on Tiny-YOLOv3

HU Yongyang1,2, LI Miao1, MENG Fankai1,2, ZHANG Feng1, MENG Yiwei1,3, SONG Yukun2

  1. National ASIC Design Engineering Center, Institute of Automation, Chinese Academy of Sciences, Beijing 100190, China
  2. School of Microelectronics, Hefei University of Technology, Hefei 230009, China
  3. School of Information Engineering, Capital Normal University, Beijing 100048, China
  • Received: 2022-03-21 Online: 2023-08-15 Published: 2023-08-14
  • Supported by:
    National Key R&D Program of China (2018YFB2202604)

Abstract:

In certain application scenarios, the Tiny-YOLOv3 network suffers from high resource consumption and slow running speed when deployed on embedded platforms. This study proposes a structured compression scheme that combines pruning and quantization, and builds a convolutional layer acceleration system for the compressed network. The compression scheme uses sparse training and channel pruning to reduce the amount of computation in the network, and applies fixed-point quantization to activations and integer power-of-two quantization to weights to reduce the parameter storage of the convolutional layers. In the acceleration system, the programmable logic part implements a convolutional layer accelerator core that combines parallelism with pipelining, and the processing system part schedules the accelerator. Experimental results show that the Tiny-YOLOv3 network achieves a mean average precision of 0.46 after structured compression, with a parameter compression ratio of 5%. When the acceleration system is deployed on a Xilinx ZYNQ chip, the hardware runs stably at a clock frequency of 250 MHz, and the computing power of the convolution unit is 36 GOPS. The overall power consumption of the acceleration platform is 2.6 W, and the design substantially reduces hardware resource usage.
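
The integer power-of-two weight quantization mentioned above can be illustrated with a minimal sketch. It is not the authors' implementation: the function names `quantize_pow2` and `shift_multiply`, the exponent bounds `min_exp` and `max_exp`, and the rounding rule (nearest exponent in the log domain) are all assumptions chosen for illustration. A common motivation for this kind of quantization is that a multiplication by a power of two reduces to a bit shift in hardware.

```python
import math


def quantize_pow2(w, min_exp=-7, max_exp=0):
    """Map a weight to a signed power of two, or to zero.

    The exponent is rounded to the nearest integer in the log2 domain and
    clamped to [min_exp, max_exp]. Both bounds are hypothetical values for
    this sketch; weights smaller than 2**(min_exp - 1) are flushed to zero.
    """
    if w == 0.0:
        return 0.0
    mag = abs(w)
    if mag < 2.0 ** (min_exp - 1):
        return 0.0
    exp = round(math.log2(mag))
    exp = max(min_exp, min(max_exp, exp))
    return math.copysign(2.0 ** exp, w)


def shift_multiply(activation, exp):
    """Multiply an integer (fixed-point) activation by 2**exp using shifts only."""
    if exp >= 0:
        return activation << exp
    return activation >> -exp


if __name__ == "__main__":
    for w in (-0.3, 0.7, 0.001, 3.0):
        print(w, "->", quantize_pow2(w))
    # A weight of 2**-2 applied to the integer activation 40
    print(shift_multiply(40, -2))
```

With weights stored as a sign and a small exponent, each weight needs only a few bits, and the multiply in each multiply-accumulate step can be replaced by a shift, as `shift_multiply` shows for a fixed-point activation.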

Key words: object detection network, Tiny-YOLOv3, neural network compression, structural pruning, quantization, hardware acceleration, pipeline, ZYNQ

CLC Number: 

  • TP391