西安电子科技大学学报 ›› 2021, Vol. 48 ›› Issue (6): 84-95.doi: 10.19665/j.issn1001-2400.2021.06.011
• 智能嵌入式系统结构与软件关键技术专栏 • 上一篇 下一篇
谭雯1(),甘新标1(),白皓1(),肖调杰1(),陈旭光1(),雷书梦2(),刘杰1()
收稿日期:
2021-08-16
出版日期:
2021-12-20
发布日期:
2022-02-24
通讯作者:
甘新标
作者简介:
谭 雯(1996—),女,国防科技大学硕士研究生,E-mail: 基金资助:
TAN Wen1(),GAN Xinbiao1(),BAI Hao1(),XIAO Tiaojie1(),CHEN Xuguang1(),LEI Shumeng2(),LIU Jie1()
Received:
2021-08-16
Online:
2021-12-20
Published:
2022-02-24
Contact:
Xinbiao GAN
摘要:
现实中的数据问题通常被抽象为图。在大数据时代,图数据趋于复杂,这是因为数据量大幅提升,所需要的计算规模迅速增长。大规模的图数据问题对超算平台的存储运算能力具有广泛需求,并对此提出了更高的要求。为了高效地处理大规模图数据,发挥天河超级计算机实验平台的图处理能力,基于现实世界中图结构的小世界性和无尺度性特征,面向评测超级计算机图处理能力的重要基准Graph500,提出一种主要应用于大规模图的图遍历优化方法。这一方法结合了天河平台的体系结构特征,在图结构上应用了顶点排序和优先缓存策略,即将图中顶点按度数从高到低排序,令程序在图遍历阶段优先访问高度数邻居顶点,并将部分关键高度数顶点缓存至天河系统核组内的高速缓存中,以此来减少Graph500基准程序中的无效访存,降低进程间的通信开销,提高访存带宽利用率,从而有效地提升Graph500基准测试程序在天河平台上的性能。面向天河超级计算机系统实验平台提出的应用顶点排序与优先缓存优化方法的VS-Graph500程序,其加速的效果显著,可扩展性好。当图测试规模为237时,全系统稳定测试性能为2 547.13 GTEPS,超过2020年11月Graph500国际排名榜上第7名的数据。
中图分类号:
谭雯,甘新标,白皓,肖调杰,陈旭光,雷书梦,刘杰. 面向超级计算机系统的大规模图遍历优化[J]. 西安电子科技大学学报, 2021, 48(6): 84-95.
TAN Wen,GAN Xinbiao,BAI Hao,XIAO Tiaojie,CHEN Xuguang,LEI Shumeng,LIU Jie. Optimization of large-scale graph traversal for supercomputers[J]. Journal of Xidian University, 2021, 48(6): 84-95.
[1] |
DODDS P S, MUHAMAD R, WATTS D. An Experimental Study of Search in Global Social Networks[J]. Science, 2003, 301(5634):827-829.
doi: 10.1126/science.1081058 |
[2] | BEUTEL A, FALOUTSOS C. User Behavior Modeling and Fraud Detection[J]. Intelligent Systems IEEE, 2016, 2(31):84-86. |
[3] |
BARABASI A-L, ALBERT R. Emergence of Scaling in Random Networks[J]. Science, 1999, 286(5439):509-512.
doi: 10.1126/science.286.5439.509 |
[4] | AGARWAL V, PETRINI F, PASETTO D, et al. Scalable Graph Exploration on Multicore Processors[C]// Proceedings of the 2010 ACM/IEEE International Conference for High Performance Computing,Networking,Storage and Analysis.Piscataway:IEEE, 2010:1-11. |
[5] | UENO K, SUZUMURA T. Highly Scalable Graph Search for the Graph500 Benchmark[C]// Proceedings of the 21st International Symposium on High-Performance Parallel and Distributed Computing.New York:ACM, 2012:149-160. |
[6] | BEAMER S, BULUC A, ASANOVIC K, et al. Distributed Memory Breadth-First Search Revisited:Enabling Bottom-Up Search[C]// Proceedings of 2013 IEEE International Symposium on Parallel & Distributed Processing,Workshops and Phd Forum.Piscataway:IEEE, 2013:1618-1627. |
[7] | UENO K, SUZUMURA T, MARUYAMA N, et al. Efficient Breadth-First Search on Massively Parallel and Distributed-Memory Machines[J]. Data Science and Engineering, 2017(2):22-35. |
[8] | BADER D, MADDURI K. Designing Multithreaded Algorithms for Breadth-First Search and St-Connectivity on the Cray MTA-2[C]// Proceedings of 2006 International Conference on Parallel Processing.Piscataway:IEEE, 2006:523-530. |
[9] | LIN H, TANG X, YU B, et al. Scalable Graph Traversal on Sunway Taihulight with Ten Million Core[C]// Proceedings of 2017 IEEE International Parallel and Distributed Processing Symposium.Piscataway:IEEE, 2017:635-645. |
[10] | LIN H, ZHU X, YU B, et al. ShenTu:Processing Multi-Trillion Edge Graphs on Millions of Cores in Seconds[C]// International Conference for High Performance Computing,Networking,Storage and Analysis.Piscataway:IEEE, 2018:706-716. |
[11] | PENG Z, LU Y, CHENG Z, et al. A Low Communication Overhead Breadth-First Search Based on Global Bitmap[C]// Proceedings of the 18th International Conference on Algorithms and Architectures for Parallel Processing.Heidelberg:Springer, 2018:114-129. |
[12] | WOMBLE D E, SHANKAR M, JOUBERT W, et al. Early Experiences on Summit:Data Analytics and AI Applications[J]. Ibm Journal of Research and Development, 2019, 63(6):1-2. |
[13] |
GAO T, LU Y, ZHANG B, et al. Using the Intel Many Integrated Core to Accelerate Graph Traversal[J]. International Journal of High Performance Computing Applications, 2014, 28(3):255-266.
doi: 10.1177/1094342014524240 |
[14] | WANG C, LU Y, ZHANG B, et al. An Optimized BFS Algorithm:A Path to Load Balancing in MIC[C]// Proceedings of 2015 IEEE International Conference on Computer and Communications.Piscataway:IEEE, 2015:199-206. |
[15] | 张承龙, 曹华伟, 王国波, 等. 面向高通量计算机的图算法优化技术[J]. 计算机研究与发展, 2020, 6(57):1152-1163. |
ZHANG Chenglong, CAO Huawei, WANG Guobo, et al. Efficient Optimization of Graph Computing on High-Throughput Computer[J]. Journal of Computer Research and Development, 2020, 6(57):1152-1163. | |
[16] | MATSUOKA S. Fugaku and A64FX:the First Exascale Supercomputer and its Innovative Arm CPU[C]// Proceedings of 2021 Symposium on VLSI Circuits.Piscataway:IEEE, 2020:1-3. |
[17] | NAKAO M, UENO K, FUJISAWA K, et al. Performance Evaluation of Supercomputer Fugaku using Breadth-First Search Benchmark in Graph500[C]// Proceedings of 2020 IEEE International Conference on Cluster Computing.Piscataway:IEEE, 2020:408-409. |
[18] | SHIMIZU T. Supercomputer Fugaku:Co-designed with Application Developers/Researchers[C]// Proceedings of 2020 IEEE Asian Solid-State Circuits Conference.Piscataway:IEEE, 2020:1-4. |
[19] | GAN X, ZHANG Y, WANG R, et al. TianheGraph:Customizing Graph Search for Graph500 on Tianhe Supercomputer[J]. IEEE Transactions on Parallel and Distributed Systems, 2021(1):1-12. |
[20] | BAI H, GAN X, XU T, et al. VPC:Pruning Connected Components Using Vector-Based Path Compression for Graph500[J/OL]. [2021-07-28]https://link.springer.com/article/10.1007/s42514-021-00070-z . |
[21] | 甘新标, 谭雯, 刘杰. 基于双向位图的CSR大规模图存储优化[J]. 计算机研究与发展, 2021, 58(3):458-466. |
GAN Xinbiao, TAN Wen, LIU Jie. Bidirectional-Bitmap Based CSR for Reducing Large-Scale Graph Space[J]. Journal of Computer Research and Development, 2021, 58(3):458-466. | |
[22] | Graph500[EB/OL]. [2021-08-02]. https://graph500.org/ . |
[23] | 石嵩, 李宏亮, 朱巍. 阵列众核处理器上的高效归并排序算法[J]. 计算机研究与发展, 2016, 53(2):362-373. |
SHI Song, LI Hongliang, ZHU Wei. Efficient Merge Sort Algorithms on Array-Based Manycore Architecture[J]. Journal of Computer Research and Development, 2016, 53(2):362-373. |
[1] | 陈荣,许宏丽,杨东学,黄华. 一种基于空间编码结构光的稠密三维重建算法[J]. 西安电子科技大学学报, 2021, 48(6): 123-130. |
[2] | 刘云瑞,周水生. 最小二乘损失在多视角学习中的应用[J]. 西安电子科技大学学报, 2021, 48(6): 151-160. |
[3] | 张春祥,周雪松,高雪瑶,刘欢. 融合k均值聚类与LSTM网络的半监督词义消歧[J]. 西安电子科技大学学报, 2021, 48(6): 161-171. |
[4] | 李源,崔玉爽,王伟. 一种基于字词双通道网络的文本情感分析方法[J]. 西安电子科技大学学报, 2021, 48(6): 179-186. |
[5] | 代明军,李晓凤,邓海燕,陈彬. 具有低编解码复杂度的私有信息检索[J]. 西安电子科技大学学报, 2021, 48(6): 212-220. |
[6] | 顾兆军,陈辉,王家亮,高冰. 一种小型四轴飞行器目标跟踪控制算法[J]. 西安电子科技大学学报, 2021, 48(5): 117-127. |
[7] | 董如婵,焦李成,赵进,沈维燕. 一种深度融合机制的遥感图像目标检测技术[J]. 西安电子科技大学学报, 2021, 48(5): 128-138. |
[8] | 王海军,张圣燕,杜玉杰. 响应差异约束的相关滤波无人机目标跟踪算法[J]. 西安电子科技大学学报, 2021, 48(5): 149-155. |
[9] | 张宇浩,程培涛,张书豪,王秀美. 一种自适应权重学习的轻量超分辨率重建网络[J]. 西安电子科技大学学报, 2021, 48(5): 15-22. |
[10] | 程德,郝毅,周靖宇,王楠楠,高新波. 利用混合双通路神经网络的跨模态行人重识别[J]. 西安电子科技大学学报, 2021, 48(5): 190-200. |
[11] | 孙彦景,魏力,张年龙,云霄,董锴文,葛敏,程小舟,侯晓峰. 联合DD-GAN和全局特征的井下人员重识别方法[J]. 西安电子科技大学学报, 2021, 48(5): 201-211. |
[12] | 闫佳,曹玉东,任佳兴,陈冬昊,李晓会. 深度非对称压缩型哈希算法[J]. 西安电子科技大学学报, 2021, 48(5): 212-221. |
[13] | 田春娜,叶彦妤,单笑,丁宇轩,张相南. 自监督视频表征学习综述[J]. 西安电子科技大学学报, 2021, 48(5): 222-230. |
[14] | 王军军,孙岳,李颖. 一种生成对抗网络的遥感图像去云方法[J]. 西安电子科技大学学报, 2021, 48(5): 23-29. |
[15] | 杨静波,赵启军,吕泽均. 维度情感模型下的表情图像生成及应用[J]. 西安电子科技大学学报, 2021, 48(5): 30-37. |
|