Institutional Repository of the Institute of Software, Chinese Academy of Sciences
Title:
Optimizing SpMV for Diagonal Sparse Matrices on GPU
Author: Sun Xiangzheng ; Zhang Yunquan ; Wang Ting ; Zhang Xianyi ; Yuan Liang ; Rao Li
Source: Proceedings of the International Conference on Parallel Processing
Conference Name: 40th International Conference on Parallel Processing, ICPP 2011
Conference Date: September
Issued Date: 2011
Conference Place: Taipei City, Taiwan
Keyword: Memory architecture ; Network components ; Optimization ; Program processors
Indexed Type: EI
ISSN: 0190-3918
ISBN: 9780769545103
Department: (1) Lab. of Parallel Software and Computational Science, Institute of Software, Chinese Academy of Sciences, Beijing, China; (2) State Key Lab. of Computer Science, Institute of Software, Chinese Academy of Sciences, Beijing, China; (3) Graduate University of Chinese Academy of Sciences, Beijing, China
Sponsorship: Int. Assoc. Comput. Commun. (IACC)
Abstract: Sparse matrix-vector multiplication (SpMV) is an important computational kernel in scientific applications. Its performance depends heavily on the distribution of nonzeros in the sparse matrix. In this paper, we propose a new storage format for diagonal sparse matrices, called Compressed Row Segment with Diagonal-pattern (CRSD). In CRSD, we design diagonal patterns to represent the diagonal distribution. Because Graphics Processing Units (GPUs) offer tremendous computational power and OpenCL makes them more suitable for scientific computing, we implement SpMV for the CRSD format on GPUs using OpenCL. Since OpenCL kernels are compiled at runtime, we design a code generator that produces codelets for all diagonal patterns after the matrices are stored in CRSD format. Specifically, the generated codelets already contain the index information of the nonzeros, which reduces memory pressure during the SpMV operation. Furthermore, the code generator exploits properties of the memory architecture and thread scheduling on GPUs to improve performance. In the evaluation, we select four storage formats from prior state-of-the-art GPU implementations (Bell and Garland, 2009). Experimental results show speedups of up to 1.52 and 1.94 over the best-performing of the four formats for double and single precision, respectively. We also evaluate on a two-socket quad-core Intel Xeon system, where the speedups reach up to 11.93 and 12.79 over the CSR format with 8 threads for double and single precision, respectively. © 2011 IEEE.
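Illustrative sketch (not from the paper): the abstract does not define the CRSD layout, so the Python below uses the plain diagonal (DIA) layout instead, one of the standard formats for diagonal sparse matrices. It is meant only to show the two ideas the abstract names: SpMV over stored diagonals, and a generator that emits a routine with the diagonal offsets baked in as literals so that no index array has to be read at run time. The names `dia_spmv`, `generate_codelet`, and `spmv_codelet` are hypothetical, and Python stands in for the OpenCL the authors use.

```python
from typing import List, Sequence


def dia_spmv(offsets: Sequence[int], data: Sequence[Sequence[float]],
             x: Sequence[float]) -> List[float]:
    """Compute y = A @ x for a square matrix stored by diagonals (DIA layout).

    offsets[d] is the offset of diagonal d (0 = main, >0 above, <0 below).
    data[d][i] holds A[i][i + offsets[d]]; slots outside the matrix are padding.
    """
    n = len(x)
    y = [0.0] * n
    for off, diag in zip(offsets, data):
        # Only rows whose column i + off lies inside [0, n) contribute.
        for i in range(max(0, -off), min(n, n - off)):
            y[i] += diag[i] * x[i + off]
    return y


def generate_codelet(offsets: Sequence[int], n: int) -> str:
    """Emit Python source for an SpMV routine specialized to these offsets.

    The offsets and loop bounds appear as literals in the generated text,
    so the generated routine never reads an offsets array.
    """
    lines = ["def spmv_codelet(data, x):", f"    y = [0.0] * {n}"]
    for d, off in enumerate(offsets):
        lo, hi = max(0, -off), min(n, n - off)
        lines.append(f"    for i in range({lo}, {hi}):")
        lines.append(f"        y[i] += data[{d}][i] * x[i + {off}]")
    lines.append("    return y")
    return "\n".join(lines)


if __name__ == "__main__":
    # 4x4 tridiagonal matrix with 2 on the main diagonal and -1 beside it.
    offsets = [-1, 0, 1]
    data = [
        [0, -1, -1, -1],   # sub-diagonal (first slot is padding)
        [2, 2, 2, 2],      # main diagonal
        [-1, -1, -1, 0],   # super-diagonal (last slot is padding)
    ]
    x = [1, 2, 3, 4]
    print(dia_spmv(offsets, data, x))      # [0.0, 0.0, 0.0, 5.0]

    namespace = {}
    exec(generate_codelet(offsets, len(x)), namespace)  # "compile" at run time
    print(namespace["spmv_codelet"](data, x))           # [0.0, 0.0, 0.0, 5.0]
```

Generating source text and compiling it at run time loosely mirrors the situation the abstract describes for OpenCL kernels; the GPU-specific memory and thread-scheduling optimizations it mentions are not modeled here.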
Language: English
Content Type: Conference paper
URI: http://ir.iscas.ac.cn/handle/311060/16207
Appears in Collections: Institute of Software Library, Conference Papers



Recommended Citation:
Sun Xiangzheng, Zhang Yunquan, Wang Ting, et al. Optimizing SpMV for Diagonal Sparse Matrices on GPU[C]. In: 40th International Conference on Parallel Processing, ICPP 2011. Taipei City, Taiwan. September.