Institutional Repository of the Institute of Software, Chinese Academy of Sciences (ISCAS OpenIR)
Title: Automatic FFT Performance Tuning on OpenCL GPUs
Author: Li Yan ; Zhang Yunquan ; Jia Haipeng ; Long Guoping ; Wang Ke
Source: Proceedings of the International Conference on Parallel and Distributed Systems - ICPADS
Conference Name: 2011 17th IEEE International Conference on Parallel and Distributed Systems, ICPADS 2011
Conference Date: December 7, 2011 - December 9, 2011
Issued Date: 2011
Conference Place: Tainan, Taiwan
Keyword: Algorithms ; Computer systems ; Discrete Fourier transforms ; Fast Fourier transforms ; Medical imaging
Indexed Type: EI
ISSN: 1521-9097
ISBN: 9780769545769
Department: (1) Laboratory of Parallel Software and Computational Science, Institute of Software, Chinese Academy of Sciences, Beijing, China; (2) State Key Lab. of Computer Science, Institute of Software, Chinese Academy of Sciences, Beijing, China; (3) Chinese Academy of Sciences Graduate University, Beijing, China; (4) School of Information Science and Engineering, Ocean University of China, Qingdao, China
Sponsorship: National Cheng Kung University; National Science Council; Ministry of Education; Academia Sinica; National Center for High Performance Computing
Abstract: Many fields of science and engineering, such as astronomy, medical imaging, seismology and spectroscopy, have been revolutionized by Fourier methods. The fast Fourier transform (FFT) is an efficient algorithm to compute the discrete Fourier transform (DFT) and its inverse. The emerging class of high performance computing architectures, such as GPU, seeks to achieve much higher performance and efficiency by exposing a hierarchy of distinct memories to programmers. However, the complexity of GPU programming poses a significant challenge for programmers. In this paper, based on the Kronecker product form multi-dimensional FFTs, we propose an automatic performance tuning framework for various OpenCL GPUs. Several key techniques of GPU programming on AMD and NVIDIA GPUs are also identified. Our OpenCL FFT library achieves up to 1.5 to 4 times, 1.5 to 40 times and 1.4 times the performance of clAmdFft 1.0 for 1D, 2D and 3D FFT respectively on an AMD GPU, and the overall performance is within 90% of CUFFT 4.0 on two NVIDIA GPUs. © 2011 IEEE.
Language: English
Content Type: Conference paper
URI: http://ir.iscas.ac.cn/handle/311060/16294
Appears in Collections: Institute of Software Library_Conference Papers


Recommended Citation:
Li Yan, Zhang Yunquan, Jia Haipeng, et al. Automatic FFT Performance Tuning on OpenCL GPUs[C]. In: 2011 17th IEEE International Conference on Parallel and Distributed Systems, ICPADS 2011. Tainan, Taiwan. December 7, 2011 - December 9, 2011.
Items in IR are protected by copyright, with all rights reserved, unless otherwise indicated.

 

 
