ISCAS OpenIR
Optimizing SpMV for Diagonal Sparse Matrices on GPU
Sun Xiangzheng; Zhang Yunquan; Wang Ting; Zhang Xianyi; Yuan Liang; Rao Li
2011
Conference Name: 40th International Conference on Parallel Processing, ICPP 2011
Source: Proceedings of the International Conference on Parallel Processing
Pages: 492-501
Conference Date: September
Conference Place: Taipei City, Taiwan
Indexed Type: EI
ISSN: 0190-3918
ISBN: 9780769545103
Department: (1) Lab. of Parallel Software and Computational Science, Institute of Software, Chinese Academy of Sciences, Beijing, China; (2) State Key Lab. of Computer Science, Institute of Software, Chinese Academy of Sciences, Beijing, China; (3) Graduate University of Chinese Academy of Sciences, Beijing, China
English Abstract: Sparse Matrix-Vector multiplication (SpMV) is an important computational kernel in scientific applications. Its performance depends heavily on the nonzero distribution of the sparse matrix. In this paper, we propose a new storage format for diagonal sparse matrices, called Compressed Row Segment with Diagonal-pattern (CRSD). In CRSD, we design diagonal patterns to represent the diagonal distribution. Because Graphics Processing Units (GPUs) have tremendous computation power and OpenCL makes them more suitable for scientific computing, we implement SpMV for the CRSD format on GPUs using OpenCL. Since OpenCL kernels are compiled at runtime, we design a code generator that produces codelets for all diagonal patterns after the matrices are stored in CRSD format. Specifically, the generated codelets already contain the index information of the nonzeros, which reduces memory pressure during the SpMV operation. Furthermore, the code generator exploits properties of the memory architecture and thread scheduling on GPUs to improve performance. In the evaluation, we select four storage formats from prior state-of-the-art GPU implementations (Bell and Garland, 2009). Experimental results demonstrate speedups of up to 1.52 and 1.94 over the optimal implementation of the four formats for double and single precision, respectively. We also evaluate on a two-socket quad-core Intel Xeon system, where the speedups reach up to 11.93 and 12.79 over the CSR format with 8 threads for double and single precision, respectively. © 2011 IEEE.
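The idea of codelets that "already contain the index information of nonzeros" can be illustrated with a minimal sketch. The abstract does not specify the CRSD layout or the generated OpenCL code, so everything below is an assumption for illustration: it uses plain diagonal (DIA-style) storage rather than CRSD, Python rather than OpenCL, and the names `spmv_dia`, `generate_codelet`, and `compile_codelet` are hypothetical. The generic routine reads the diagonal offsets from memory on every call, while the generated routine has the offsets and loop bounds written into its source text.

```python
def spmv_dia(offsets, diags, x):
    """Generic y = A @ x for a square matrix stored by diagonals.

    offsets[k] is column - row for diagonal k, and diags[k][i] holds
    A[i][i + offsets[k]]. Slots that fall outside the matrix are skipped.
    (DIA-style storage, assumed here; not the paper's CRSD format.)
    """
    n = len(x)
    y = [0.0] * n
    for off, diag in zip(offsets, diags):
        # Only rows i with 0 <= i + off < n contribute.
        for i in range(max(0, -off), min(n, n - off)):
            y[i] += diag[i] * x[i + off]
    return y


def generate_codelet(offsets, n):
    """Emit Python source for an SpMV routine specialised to one set of
    diagonal offsets. The offsets and loop bounds become literals in the
    generated text, so the routine never loads an index array."""
    lines = ["def codelet(diags, x):", f"    y = [0.0] * {n}"]
    for k, off in enumerate(offsets):
        lo, hi = max(0, -off), min(n, n - off)
        lines.append(f"    for i in range({lo}, {hi}):")
        lines.append(f"        y[i] += diags[{k}][i] * x[i + ({off})]")
    lines.append("    return y")
    return "\n".join(lines)


def compile_codelet(source):
    """Compile generated source at runtime and return the callable."""
    namespace = {}
    exec(source, namespace)
    return namespace["codelet"]


# 4x4 tridiagonal example: 1 below the diagonal, 2 on it, 3 above it.
offsets = [-1, 0, 1]
diags = [
    [0.0, 1.0, 1.0, 1.0],  # A[i][i-1]; slot 0 is outside the matrix
    [2.0, 2.0, 2.0, 2.0],  # A[i][i]
    [3.0, 3.0, 3.0, 0.0],  # A[i][i+1]; slot 3 is outside the matrix
]
x = [1.0, 2.0, 3.0, 4.0]

y_generic = spmv_dia(offsets, diags, x)
y_special = compile_codelet(generate_codelet(offsets, len(x)))(diags, x)
```

Both routines compute the same product; the specialised one only trades a one-time generation and compilation step for the removal of index loads from the inner loop, which is the general trade-off the abstract describes for runtime-compiled kernels.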
Keyword: Memory Architecture; Network Components; Optimization; Program Processors
Sponsorship: Int. Assoc. Comput. Commun. (IACC)
Language: English
Content Type: Conference paper
URI: http://ir.iscas.ac.cn/handle/311060/16207
Collection: Institute of Software, Chinese Academy of Sciences
Recommended Citation
GB/T 7714
Sun Xiangzheng, Zhang Yunquan, Wang Ting, et al. Optimizing SpMV for Diagonal Sparse Matrices on GPU[C], 2011: 492-501.
Items in the repository are protected by copyright, with all rights reserved, unless otherwise indicated.