Electronic Science and Technology (电子科技), 2022, Vol. 35, Issue 9: 44-51. doi: 10.16180/j.cnki.issn1007-7820.2022.09.007

Optimization of the Internal Memory Architecture of Heterogeneous Multi-Core SoC Processors

ZHANG Xuan, ZHANG Duoli, SONG Yukun

  1. School of Electronic Science and Applied Physics, Hefei University of Technology, Hefei 230009, China
  • Received: 2021-03-24 Online: 2022-09-15 Published: 2022-09-15
  • About the authors: ZHANG Xuan (born 1993), male, master's student. Research interest: multi-core SoC memory controllers. | ZHANG Duoli (born 1972), male, Ph.D., research fellow. Research interests: multi-core system design, VLSI implementation of digital signal processing, network-on-chip optimization. | SONG Yukun (born 1975), male, Ph.D., associate research fellow. Research interests: multi-core system design, VLSI implementation of digital signal processing, network-on-chip optimization.
  • Supported by:
    National Natural Science Foundation of China (61874156); Collaborative Innovation Funding Project for Universities in Anhui (GXXT-2019-030)

Abstract:

Heterogeneous multi-core technology has greatly improved microprocessor performance, but the bandwidth gap between the processor and external memory limits how much of that performance can be realized, and the "memory wall" problem is becoming increasingly serious. This study proposes a memory design scheme for a heterogeneous multi-core SoC system used in high-density computing. The scheme reuses local memory resources that sit idle for long periods as a shared L2 cache, which increases memory access bandwidth and reduces the frequency of external memory accesses. The resulting hierarchical storage structure combines a distributed, high-speed shared L2 cache with multi-channel parallel access to external memory. It narrows the speed gap between on-chip data processing and external memory, improves data access efficiency, and improves overall system performance. Weighing resource consumption against computing efficiency, the proposed design saves 69.36% of on-chip SRAM resources compared with an ordinary L2 cache, improves the speedup ratio by 41.2% compared with a cache-less structure, and reduces overall task computation time by about 40.6% on average.

Key words: heterogeneous multi-core, memory wall, reuse, multi-channel parallelism, hierarchical storage, L2 cache, distributed, external memory

CLC Number:

  • TN47