ISCAS OpenIR
Highly Optimized Code Generation for Stencil Codes with Computation Reuse for GPUs
Ma, WJ; Gao, K; Long, GP
2016
Journal: JOURNAL OF COMPUTER SCIENCE AND TECHNOLOGY
ISSN: 1000-9000
Volume: 31; Issue: 6; Pages: 1262-1274
Abstract: Computation reuse is known as an effective optimization technique. However, due to the complexity of modern GPU architectures, there is not yet enough understanding regarding the intriguing implications of the interplay of computation reuse and hardware specifics on application performance. In this paper, we propose an automatic code generator for a class of stencil codes with inherent computation reuse on GPUs. For such applications, the proper reuse of intermediate results, combined with careful register and on-chip local memory usage, has profound implications on performance. Current state of the art does not address this problem in depth, partially due to the lack of a good program representation that can expose all potential computation reuse. In this paper, we leverage the computation overlap graph (COG), a simple representation of data dependence and data reuse with "element view", to expose potential reuse opportunities. Using COG, we propose a portable code generation and tuning framework for GPUs. Compared with current state-of-the-art code generators, our experimental results show up to 56.7% performance improvement on modern GPUs such as NVIDIA C2050.
Indexed by: SCI
Keywords: GPGPU; OpenCL; Stencil; Code Generation; Computation Reuse
Affiliations: Chinese Acad Sci, Inst Software, Lab Parallel Software & Comp Sci, Beijing 100190, Peoples R China. Chinese Acad Sci, Inst Software, State Key Lab Comp Sci, Beijing 100190, Peoples R China. China Assoc Sci & Technol, Informat Ctr, Beijing 100863, Peoples R China.
Language: English
WOS Accession Number: WOS:000387335600015
Content type: Journal article
URI: http://ir.iscas.ac.cn/handle/311060/17294
Collection: Institute of Software, Chinese Academy of Sciences
Recommended citation
GB/T 7714: Ma, WJ, Gao, K, Long, GP. Highly Optimized Code Generation for Stencil Codes with Computation Reuse for GPUs[J]. JOURNAL OF COMPUTER SCIENCE AND TECHNOLOGY, 2016, 31(6): 1262-1274.
APA: Ma, WJ, Gao, K, & Long, GP. (2016). Highly Optimized Code Generation for Stencil Codes with Computation Reuse for GPUs. JOURNAL OF COMPUTER SCIENCE AND TECHNOLOGY, 31(6), 1262-1274.
MLA: Ma, WJ, et al. "Highly Optimized Code Generation for Stencil Codes with Computation Reuse for GPUs." JOURNAL OF COMPUTER SCIENCE AND TECHNOLOGY 31.6 (2016): 1262-1274.
Files in this item
File name/size: Highly+Optimized+Cod (674KB)
Access type: Open access
Unless otherwise stated, all content in this system is protected by copyright, with all rights reserved.