ISCAS OpenIR
623 Tflop/s HPCG run on Tianhe-2: Leveraging millions of hybrid cores
Liu, YQ; Yang, C; Liu, FF; Zhang, XY; Lu, YT; Du, YF; Yang, CQ; Xie, M; Liao, XK
2016
Source: INTERNATIONAL JOURNAL OF HIGH PERFORMANCE COMPUTING APPLICATIONS
ISSN: 1094-3420
Volume: 30, Issue: 1, Pages: 39-54
English Abstract: In this article, we present a new hybrid algorithm to enable and scale the high-performance conjugate gradients (HPCG) benchmark on large-scale heterogeneous systems such as the Tianhe-2. Based on an inner-outer subdomain partitioning strategy, the data distribution between host and device can be balanced adaptively. The overhead of data movement from both the MPI communication and the PCI-E transfer can be significantly reduced by carefully rearranging and fusing operations. A variety of parallelization and optimization techniques for performance-critical kernels are exploited and analyzed to maximize the performance gain on both host and device. We carry out experiments on both a small heterogeneous computer and the world's largest one, the Tianhe-2. On the small system, a thorough comparison and analysis is presented to select from different optimization choices. On Tianhe-2, the optimized implementation scales to the full-system level of 3.12 million heterogeneous cores, with an aggregated performance of 623 Tflop/s and a parallel efficiency of 81.2%.
Indexed Type: SCI
Keywords: Tianhe-2; HPCG; Conjugate Gradients; MIC; Heterogeneous Computing
Department: Chinese Acad Sci, Inst Software, Beijing, Peoples R China. Univ Chinese Acad Sci, Beijing, Peoples R China. Chinese Acad Sci, State Key Lab Comp Sci, Beijing, Peoples R China. Natl Univ Def Technol, Dept Comp Sci & Technol, Changsha, Hunan, Peoples R China.
Language: English
WOS ID: WOS:000371326000004
Content Type: Journal article
URI: http://ir.iscas.ac.cn/handle/311060/17346
Collection: Institute of Software, Chinese Academy of Sciences (中国科学院软件研究所)
Recommended Citation
GB/T 7714: Liu, YQ, Yang, C, Liu, FF, et al. 623 Tflop/s HPCG run on Tianhe-2: Leveraging millions of hybrid cores[J]. INTERNATIONAL JOURNAL OF HIGH PERFORMANCE COMPUTING APPLICATIONS, 2016, 30(1): 39-54.
APA: Liu, YQ., Yang, C., Liu, FF., Zhang, XY., Lu, YT., ... & Liao, XK. (2016). 623 Tflop/s HPCG run on Tianhe-2: Leveraging millions of hybrid cores. INTERNATIONAL JOURNAL OF HIGH PERFORMANCE COMPUTING APPLICATIONS, 30(1), 39-54.
MLA: Liu, YQ, et al. "623 Tflop/s HPCG run on Tianhe-2: Leveraging millions of hybrid cores". INTERNATIONAL JOURNAL OF HIGH PERFORMANCE COMPUTING APPLICATIONS 30.1 (2016): 39-54.