Electronic Science and Technology, 2021, Vol. 34, Issue (5): 54-60. doi: 10.16180/j.cnki.issn1007-7820.2021.05.010
WANG Yang, WANG Xiaolei, YUAN Ziang, YUAN Ruming
Online: 2021-05-15
Published: 2021-05-24
WANG Yang, WANG Xiaolei, YUAN Ziang, YUAN Ruming. A Matrix Multiplication Mapping Technology Based on NOC Multi-Core System[J]. Electronic Science and Technology, 2021, 34(5): 54-60.
Table 2. Operation cycles of different matrix multiplications

| Matrix order | Scheme 1 (cycles) | Scheme 2 (cycles) | Scheme 3 (cycles) | Systolic array (cycles) |
|---|---|---|---|---|
| 32 | 2 243 | 2 277 | 3 074 | 2 055 |
| 64 | 16 835 | 16 983 | 21 016 | 16 391 |
| 128 | 132 127 | 133 032 | 136 971 | 131 079 |
| 256 | 1 148 982 | 1 151 318 | 1 071 563 | 1 048 572 |
| 512 | 9 375 040 | 9 383 437 | 8 404 794 | 8 388 615 |
| 1 024 | 74 696 304 | 74 918 735 | 67 137 058 | 67 108 871 |
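
To make the Table 2 figures easier to compare, the following minimal sketch computes each scheme's cycle count relative to the systolic array for every matrix order. The numbers are copied from Table 2. The dictionary name `CYCLES`, the helpers `overhead_vs_systolic` and `fastest_scheme`, and the English labels "Scheme 1" to "Scheme 3" and "Systolic array" (renderings of the table's column headers) are illustrative assumptions, not part of the paper.

```python
# Operation cycles copied from Table 2, keyed by matrix order.
# Labels are English renderings of the table's column headers (assumption).
CYCLES = {
    32:   {"Scheme 1": 2243,     "Scheme 2": 2277,     "Scheme 3": 3074,     "Systolic array": 2055},
    64:   {"Scheme 1": 16835,    "Scheme 2": 16983,    "Scheme 3": 21016,    "Systolic array": 16391},
    128:  {"Scheme 1": 132127,   "Scheme 2": 133032,   "Scheme 3": 136971,   "Systolic array": 131079},
    256:  {"Scheme 1": 1148982,  "Scheme 2": 1151318,  "Scheme 3": 1071563,  "Systolic array": 1048572},
    512:  {"Scheme 1": 9375040,  "Scheme 2": 9383437,  "Scheme 3": 8404794,  "Systolic array": 8388615},
    1024: {"Scheme 1": 74696304, "Scheme 2": 74918735, "Scheme 3": 67137058, "Systolic array": 67108871},
}

BASELINE = "Systolic array"


def overhead_vs_systolic(order):
    """Return each scheme's cycle count divided by the systolic-array count."""
    row = CYCLES[order]
    base = row[BASELINE]
    return {name: cycles / base for name, cycles in row.items() if name != BASELINE}


def fastest_scheme(order):
    """Return the scheme (excluding the systolic array) with the fewest cycles."""
    row = CYCLES[order]
    schemes = {name: cycles for name, cycles in row.items() if name != BASELINE}
    return min(schemes, key=schemes.get)


# Print one line per matrix order: ratios to the baseline and the fastest scheme.
for order in sorted(CYCLES):
    ratios = overhead_vs_systolic(order)
    summary = ", ".join(f"{name}: {ratio:.3f}x" for name, ratio in ratios.items())
    print(f"order {order}: {summary} (fastest: {fastest_scheme(order)})")
```

Read this way, the table's numbers show Scheme 1 with the fewest cycles among the three schemes for orders 32 to 128 and Scheme 3 with the fewest for orders 256 to 1 024, while every scheme stays above the systolic-array count at each order.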