Journal of Xidian University, 2024, Vol. 51, Issue (2): 76-83. doi: 10.19665/j.issn1001-2400.20230504

• Information and Communications Engineering •

Study of the parallel MoM on a domestic heterogeneous DCU platform

JIA Ruipeng1, LIN Zhongchao1, ZUO Sheng1, ZHANG Yu1, YANG Meihong2

    1. School of Electronic Engineering, Xidian University, Xi’an 710071, China
    2. School of Computer Science and Technology, Qilu University of Technology, Ji’nan 250000, China
  • Received: 2023-03-21 Online: 2024-04-20 Published: 2023-10-13
  • Contact: LIN Zhongchao E-mail: rpjia@stu.xidian.edu.cn; zclin@xidian.edu.cn; zuosheng0503@163.com; yuzhang@mail.xidian.edu.cn; yangmh@sdas.rog

Abstract:

In line with the trend toward CPU+DCU heterogeneous architectures in domestic supercomputers, this work studies a massively parallel heterogeneous CPU+DCU implementation of the higher-order method of moments (MoM). First, the basic strategy for accelerating MoM computation on the DCU is presented. Building on the load-balanced parallel strategy of the homogeneous parallel MoM, an efficient heterogeneous parallel programming framework, "MPI+OpenMP+DCU", is proposed to address the mismatch between computing tasks and computing power. In addition, fine-grained task division and asynchronous communication are used to pipeline the DCU computation, overlapping computation with communication and improving the acceleration performance of the program. The accuracy of the CPU+DCU heterogeneous parallel MoM is verified by comparing its simulation results with those of the finite element method. Scalability analysis on the domestic DCU heterogeneous platform shows that the CPU+DCU heterogeneous co-computing program achieves a speedup of 5.5 to 7.0 times at different parallel scales, and that the parallel efficiency reaches 73.5% when scaling from 360 nodes to 3600 nodes (1,036,800 cores in total).
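
To put the reported scaling figure in concrete terms, the short sketch below works through the arithmetic. It assumes the usual strong-scaling definition of relative parallel efficiency (measured speedup divided by the ideal speedup given by the node ratio); the abstract does not state which definition the paper uses, and the function names are illustrative only.

```python
def relative_parallel_efficiency(base_nodes, base_time, scaled_nodes, scaled_time):
    """Relative strong-scaling efficiency (assumed definition, not taken from the paper).

    Efficiency = (base_time / scaled_time) / (scaled_nodes / base_nodes)
    """
    speedup = base_time / scaled_time
    ideal_speedup = scaled_nodes / base_nodes
    return speedup / ideal_speedup


def implied_speedup(base_nodes, scaled_nodes, efficiency):
    """Speedup implied by a given relative efficiency under the same definition."""
    return efficiency * scaled_nodes / base_nodes


# Figures quoted in the abstract: 360 -> 3600 nodes at 73.5% efficiency.
speedup = implied_speedup(360, 3600, 0.735)
print(f"implied speedup over the 360-node run: {speedup:.2f}x")
```

Under this assumed definition, a tenfold increase in node count at 73.5% efficiency corresponds to the 3600-node run finishing roughly 7.35 times faster than the 360-node run.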

Key words: method of moments, domestic heterogeneous platforms, deep computing unit (DCU), parallel algorithm

CLC Number: 

  • TN820