J4 ›› 2015, Vol. 42 ›› Issue (1): 194-199+212. doi: 10.3969/j.issn.1001-2400.2015.01.031

• Research Paper •

GPU-accelerated high-efficiency CPML scheme for the finite-difference time-domain method

白冰;牛中奇   

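  1. (School of Electronic Engineering, Xidian University, Xi'an 710071, Shaanxi, China)
  • Received: 2014-06-06 Online: 2015-02-20 Published: 2015-04-14
  • Corresponding author: 白冰 (BAI Bing)
  • About the author: 白冰 (BAI Bing, born 1982), male, Ph.D. candidate at Xidian University. E-mail: bbai@mail.xidian.edu.cn
  • Funding:

    National Natural Science Foundation of China (30870577, 61301288); Fundamental Research Funds for the Central Universities (JB140218, K5051302057)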

High performance CPML acceleration scheme with GPU for FDTD

BAI Bing;NIU Zhongqi   

  1. (School of Electronic Engineering, Xidian Univ., Xi'an 710071, China)
  • Received: 2014-06-06 Online: 2015-02-20 Published: 2015-04-14
  • Contact: BAI Bing

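Abstract (translated from Chinese):

To address the computational redundancy and memory-access redundancy of parallel CPML, a GPU-accelerated, division-free, minimum-memory-access CPML update scheme is proposed for the finite-difference time-domain (FDTD) method. By rearranging the CPML iteration formulas, the scheme absorbs the division operations into the fixed coefficients of the formulas, eliminating the divisions that are costly in GPU computation. The scheme further merges the regular FDTD field-update step and the CPML update step within the PML region, removing the duplicate memory accesses between the two steps and minimizing the memory-access requirements of the algorithm. Numerical validation shows that, at equal accuracy, the computation time of the CPML update and of the overall field computation in the PML region is reduced by 70% and 44%, respectively.

Keywords: finite-difference time-domain method, convolutional perfectly matched layer, graphics processing unit, parallel computing, Compute Unified Device Architecture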

Abstract:

To overcome the computational redundancy and memory-access redundancy of the traditional GPU-accelerated convolutional perfectly matched layer (CPML) technique, a novel division-free, minimum-access CPML scheme is proposed for the finite-difference time-domain (FDTD) method. In the proposed scheme, the division operations in the CPML method are merged into a set of fixed coefficients by rearranging the CPML iteration, and the duplicate memory accesses are then eliminated by performing the FDTD and CPML updates in the PML region jointly. Experimental results show that, compared with the traditional GPU-CPML technique, the proposed scheme saves up to 70% of the CPML update time and 44% of the total field-update time in the PML region, without any loss of accuracy.

Key words: finite-difference time-domain method, convolutional perfectly matched layer, graphics processing unit, parallel computing, compute unified device architecture
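
The two ideas named in the abstract, folding divisions into fixed coefficients and fusing the regular FDTD update with the CPML update, can be illustrated with a small CPU-side sketch. This is not the authors' GPU kernel or their exact formulas: it assumes a generic one-dimensional CPML-style update, and every name in it (`update_two_pass`, `precompute`, `update_fused`, `max_difference`, and the per-cell arrays `a`, `b`, `kappa`) is hypothetical. The test values are arbitrary and not physically meaningful.

```python
import math


def update_two_pass(E, H, psi, a, b, kappa, dt, eps, dx):
    """Baseline: a regular FDTD pass, then a separate CPML pass.

    Divisions sit inside both loops, and H[i], H[i-1] and E[i] are
    read in each pass, i.e. twice per cell and time step.
    """
    n = len(E)
    for i in range(1, n):  # pass 1: regular field update
        E[i] += (dt / eps) * (H[i] - H[i - 1]) / (kappa[i] * dx)
    for i in range(1, n):  # pass 2: CPML auxiliary term and correction
        psi[i] = b[i] * psi[i] + a[i] * (H[i] - H[i - 1]) / dx
        E[i] += (dt / eps) * psi[i]


def precompute(a, kappa, dt, eps, dx):
    """One-time setup: absorb every division into fixed coefficients."""
    ca = [ai / dx for ai in a]                    # a / dx
    ce = [dt / (eps * ki * dx) for ki in kappa]   # dt / (eps * kappa * dx)
    cp = dt / eps                                 # weight of psi in the E update
    return ca, ce, cp


def update_fused(E, H, psi, b, ca, ce, cp):
    """Fused, division-free update: one pass, multiply-adds only.

    The difference d is formed once and reused for both the field
    update and the auxiliary variable psi.
    """
    for i in range(1, len(E)):
        d = H[i] - H[i - 1]
        p = b[i] * psi[i] + ca[i] * d
        psi[i] = p
        E[i] += ce[i] * d + cp * p


def max_difference(steps=5, n=8, dt=0.5, eps=1.0, dx=0.3):
    """Run both variants on arbitrary test data; return the largest deviation."""
    H = [math.sin(0.7 * i) for i in range(n)]
    a = [-0.05 * i for i in range(n)]
    b = [1.0 - 0.1 * i for i in range(n)]
    kappa = [1.0 + 0.5 * i for i in range(n)]
    E1, P1 = [0.0] * n, [0.0] * n
    E2, P2 = [0.0] * n, [0.0] * n
    ca, ce, cp = precompute(a, kappa, dt, eps, dx)
    for _ in range(steps):
        update_two_pass(E1, H, P1, a, b, kappa, dt, eps, dx)
        update_fused(E2, H, P2, b, ca, ce, cp)
    return max(abs(x - y) for x, y in zip(E1 + P1, E2 + P2))


if __name__ == "__main__":
    print("max |two-pass - fused| =", max_difference())
```

Both variants compute the same quantities, so they should agree up to floating-point rounding. In a GPU setting the fused form would correspond to a single kernel over the PML cells, with the coefficient arrays computed once on the host; the speed-up figures quoted in the abstract refer to the authors' implementation, not to this sketch.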

CLC number:

  • TP391.9