ISCAS OpenIR
A Locality-Based Performance Model for Load-and-Compute Style Computation
Yuan Liang; Zhang Yunquan
2012
Conference Name: IEEE International Conference on Cluster Computing
Source: Proceedings - 2012 IEEE International Conference on Cluster Computing, CLUSTER 2012
Pages: 566-571
Conference Date: September 24-28, 2012
Conference Place: Beijing, People's Republic of China
Indexed Type: ISTP; EI
ISSN: 1552-5244
Department: Yuan Liang; Zhang Yunquan: Laboratory of Parallel Software and Computational Science, Institute of Software, Chinese Academy of Sciences, Beijing 100864, People's Republic of China.
English Abstract: The increasing speed gap between the processor and memory is usually the critical bottleneck in achieving high performance. Hardware caches, programming models, algorithms and data structures have been introduced and proposed to exploit localities on reducing the memory overhead. Some of these new designs share a common load and compute style in which the algorithm first moves all needed data to cache and then performs operations only on the ready data. In this paper, we introduce a locality function to model the reuse ability of an algorithm and propose a corresponding performance model. Then we theoretically analyze how to utilize and design on cache under our model: (1) We present theorems to give the optimal cache partition scheme for the software buffering technique targeting at hiding the memory overhead. (2) We provide methods to decide the optimal multicore design to maximally leverage benefits of both the shared and private caches. (3) We incorporate the memory overhead into the Amdahl's Law to study the speedup limitation on memory bandwidth.
Keyword: Locality Function; Cache Partition; Private Cache; Shared Cache
Sponsorship: IEEE, IEEE Comp Soc, IEEE Tech Comm Scalable Comp (TCSC), Sugon, Intel, Inspur, VMware, Mellanox, PARATERA, BLSC, LoongStore, Nvidia
Subject: Computer Science
Language: English
Content Type: Conference paper
URI: http://ir.iscas.ac.cn/handle/311060/15803
Collection: Institute of Software, Chinese Academy of Sciences
Recommended Citation
GB/T 7714
Yuan Liang, Zhang Yunquan. A locality-based performance model for load-and-compute style computation[C], 2012: 566-571.
Files in This Item:
There are no files associated with this item.
Items in the repository are protected by copyright, with all rights reserved, unless otherwise indicated.