Design and Physical Implementation of a Non-blocking Ring for On-chip Multicore Processors
CHEN Shenggang1,
(National University of Defense Technology, Changsha, Hunan 410073, China)
CLC number: TN95 Document code: Article ID:
CHEN Shenggang1, LIU Biwei1, QI Juan1, HUA Yingzhao1, XING Sufang1, DING Yanping1
(1.College of Computer, National University of Defense Technology, Changsha 410073, China)
Abstract: A bidirectional non-blocking ring architecture is proposed for multicore processors with a relatively small number of high-performance cores. The ring consists of five ring layers of three different types, which carry commands, large data and small data, respectively. A source routing strategy is employed, and an equipment-state control interconnect is designed for congestion management. The router has a bufferless, contention-free structure and each hop takes only one clock cycle, which minimizes transmission latency and makes routing deterministic. Because the Ring has long links and high bandwidth, experiments are carried out to find a suitable repeater insertion method, and crosstalk optimization methods are studied, such as inserting inverters in a staggered pattern between two neighboring lines and arranging neighboring lines according to signal transport direction. Implementation results show that the designed Ring’s bandwidth is 256 GByte/s at 1 GHz, which can meet the data communication demands of digital signal processing applications.
Keywords: Non-blocking Ring; Networks-on-Chip; Delay Optimization; Crosstalk Optimization
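With advances in integrated circuit process technology, more than one billion transistors can now be integrated on a single chip. Multicore architecture has become the mainstream in processor design, and Networks-on-Chip (NoC) have replaced the traditional bus as the main means of inter-core communication. Aggressive multicore architectures typically have a large number of cores and are commonly called manycore (ManyCore). However, considering software legacy, the difficulty of task partitioning, and communication overhead, many systems are instead built from a small number of powerful cores and are commonly called on-chip multicore (Multicore).
In on-chip multicore processor design, a large share of resources is usually devoted to memory, arithmetic units, and controllers in order to raise peak performance. At the same time, clock frequencies rise and pipeline stages are divided ever more finely, so flip-flops occupy a growing proportion of the area. As a result, relatively little area is left for the functional units and wiring of the NoC. NoCs that rely on complex control techniques such as virtual channels therefore become unsuitable, whereas the ring (Ring), with its regular structure, easy control, simple protocol, and relatively high communication efficiency, has been adopted by many on-chip multicore processors, such as CELL[1] and Intel MIC[2].
For on-chip multicore processors with a small number of high-performance cores, this paper designs a non-blocking bidirectional ring interconnect that uses a multi-layer non-blocking design, and optimizes its power and delay from both the architectural and the physical design perspectives.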
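As a minimal sketch of source routing on a bidirectional ring, the following Python function decides the travel direction and hop count at the source. The eight-router default follows the ring described in Section 1, and reading the hop count as a cycle count follows the abstract's statement that each hop takes one clock cycle. The names `Route` and `source_route`, the shortest-path direction choice, and the clockwise tie-break are assumptions made for illustration, not the policy stated in the paper.

```python
from typing import NamedTuple


class Route(NamedTuple):
    """Routing decision fixed at the source (hypothetical format)."""
    direction: str  # "cw" (clockwise) or "ccw" (counter-clockwise)
    hops: int       # router-to-router hops; one clock cycle each per the abstract


def source_route(src: int, dst: int, num_routers: int = 8) -> Route:
    """Pick the shorter direction around a bidirectional ring.

    Assumption: routers are numbered 0..num_routers-1 in clockwise order,
    and a tie between the two directions is resolved clockwise.
    """
    if not (0 <= src < num_routers and 0 <= dst < num_routers):
        raise ValueError("router index out of range")
    cw_hops = (dst - src) % num_routers   # distance going clockwise
    ccw_hops = (src - dst) % num_routers  # distance going counter-clockwise
    if cw_hops <= ccw_hops:
        return Route("cw", cw_hops)
    return Route("ccw", ccw_hops)


if __name__ == "__main__":
    for src, dst in [(0, 3), (0, 5), (0, 4)]:
        print(src, dst, source_route(src, dst))
```

Under these assumptions, no packet on an eight-router ring travels more than four hops, so the bufferless, contention-free router described in the abstract would give a transfer latency that is known when the packet is injected.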
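1 Architecture
Figure 1 shows the ring designed in this paper and its overall topology. A total of eight routers (Router) are connected end to end to form the ring interconnect; each router and the network interface (Network Inte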