Institutional Repository of the Institute of Software, Chinese Academy of Sciences
Title:
SyncChecker: Detecting Synchronization Errors between MPI Applications and Libraries
Author: Chen Zhezhe ; Li Xinyu ; Chen Jau-Yuan ; Zhong Hua ; Qin Feng
Source: Proceedings of the 2012 IEEE 26th International Parallel and Distributed Processing Symposium, IPDPS 2012
Conference Name: 2012 IEEE 26th International Parallel and Distributed Processing Symposium, IPDPS 2012
Conference Date: May 21, 2012 - May 25, 2012
Issued Date: 2012
Conference Place: Shanghai, China
Keyword: Communication ; Computer operating systems ; Distributed parameter networks ; Experiments ; Libraries ; Software testing
Indexed Type: EI
ISBN: 9780769546759
Department: (1) Dept. of Computer Science and Engineering Ohio State University United States; (2) Technology Center of Software Engineering Institute of Software Chinese Academy of Sciences China
Abstract: While it improves performance, nonblocking communication is prone to synchronization errors between MPI applications and the underlying MPI libraries. Such a synchronization error occurs as follows. After initiating nonblocking communication and performing overlapped computation, the MPI application reuses the message buffer before the MPI library has finished using the same buffer, which may lead to sending corrupted message data or reading undefined message data. This paper presents a new method called SyncChecker to detect synchronization errors in MPI nonblocking communication. To examine whether the use of message buffers is well synchronized between the MPI application and the MPI library, SyncChecker first tracks relevant memory accesses in the MPI application and the corresponding message send/receive operations in the MPI library. It then checks whether the correct execution order between the MPI application and the MPI library is enforced by the MPI completion check routines. If not, SyncChecker reports the error with diagnostic information. To reduce runtime overhead, we propose three dynamic optimizations. We have implemented a prototype of SyncChecker on Linux and evaluated it with seven bug cases (five introduced by the original developers and two injected) in four different MPI applications. Our experiments show that SyncChecker detects all the evaluated synchronization errors and provides helpful diagnostic information. Moreover, our experiments with seven NAS Parallel Benchmarks demonstrate that SyncChecker incurs moderate runtime overhead, 1.3-9.5 times with an average of 5.2 times, making it suitable for software testing. © 2012 IEEE.
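The error pattern described in the abstract can be illustrated with a small trace checker. This is only a minimal sketch under stated assumptions, not the paper's implementation: the `Event` record, the event kinds, the buffer names, and the line numbers are all hypothetical, and the sketch assumes that reading a buffer during a pending send is safe while any access during a pending receive is not. It flags any unsafe buffer access that occurs after a nonblocking operation starts and before a completion check on that buffer.

```python
from dataclasses import dataclass


@dataclass
class Event:
    kind: str  # hypothetical kinds: "isend", "irecv", "read", "write", "wait"
    buf: str   # hypothetical message buffer name
    line: int  # hypothetical source line, used only in the diagnostic


def find_sync_errors(trace):
    """Report buffer accesses not ordered after the completion check."""
    pending = {}  # buffer name -> event that started the nonblocking op
    errors = []
    for ev in trace:
        if ev.kind in ("isend", "irecv"):
            pending[ev.buf] = ev
        elif ev.kind == "wait":
            # The completion check hands the buffer back to the application.
            pending.pop(ev.buf, None)
        elif ev.buf in pending:
            start = pending[ev.buf]
            # Assumption of this sketch: writes always conflict with a
            # pending operation; reads conflict only with a pending receive.
            if ev.kind == "write" or start.kind == "irecv":
                errors.append(
                    f"line {ev.line}: {ev.kind} of '{ev.buf}' before "
                    f"{start.kind} at line {start.line} completed"
                )
    return errors


# Buffer reused before the completion check: reported.
buggy = [Event("isend", "buf", 10), Event("write", "buf", 12), Event("wait", "buf", 14)]
# Same accesses, but the write follows the completion check: not reported.
fixed = [Event("isend", "buf", 10), Event("wait", "buf", 12), Event("write", "buf", 14)]

for name, trace in (("buggy", buggy), ("fixed", fixed)):
    print(name, find_sync_errors(trace))
```

A real tool would have to observe memory accesses and library operations at run time rather than read a prepared trace; the sketch only shows the ordering rule being checked.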
Language: English
Content Type: Conference paper
URI: http://ir.iscas.ac.cn/handle/311060/15748
Appears in Collections: ISCAS Library_Conference Papers

Files in This Item:

There are no files associated with this item.


Recommended Citation:
Chen Zhezhe, Li Xinyu, Chen Jau-Yuan, et al. SyncChecker: Detecting Synchronization Errors between MPI Applications and Libraries[C]. In: 2012 IEEE 26th International Parallel and Distributed Processing Symposium, IPDPS 2012. Shanghai, China. May 21, 2012 - May 25, 2012.