Journal of Xidian University, 2023, Vol. 50, Issue 6: 148-160. doi: 10.19665/j.issn1001-2400.20230308
• Information and Communications Engineering & Computer Science and Technology •
LI Zhao¹, HUANG Chengcheng¹, HE Yizhi¹, SU Xiaojie²
Received: 2022-11-04
Online: 2023-12-20
Published: 2024-01-22
LI Zhao, HUANG Chengcheng, HE Yizhi, SU Xiaojie. Research on the fast implementation method of Winograd transposed convolution[J]. Journal of Xidian University, 2023, 50(6): 148-160.