Institutional Repository of the Institute of Software, Chinese Academy of Sciences
Title:
dacoop: accelerating data-iterative applications on map/reduce cluster
Author: Liang Yi ; Li Guangrui ; Wang Lei ; Hu Yanpeng
Source: Parallel and Distributed Computing, Applications and Technologies, PDCAT Proceedings
Conference Name: 2011 12th International Conference on Parallel and Distributed Computing, Applications and Technologies, PDCAT 2011
Conference Date: October 20, 2011 - October 22, 2011
Issued Date: 2011
Conference Place: Gwangju, Korea, Republic of
Keyword: Cache memory ; Cluster computing ; Multitasking ; Scheduling algorithms ; Turnaround time
Indexed Type: EI
ISBN: 9780769545646
Department: (1) Department of Computer Science, Beijing University of Technology, Beijing, China; (2) Institute of Computing Technology, Chinese Academy of Sciences, Beijing, China; (3) Hwellzen Software Center, Shanghai, China
Abstract: Map/reduce is a popular parallel processing framework for massive-scale data-intensive computing. A data-iterative application is composed of a series of map/reduce jobs and needs to repeatedly process some data files across these jobs. Existing implementations of the map/reduce framework focus on performing data processing in a single pass with one map/reduce job and do not directly support data-iterative applications, particularly in terms of the explicit specification of the data repeatedly processed across jobs. In this paper, we propose an extended version of the Hadoop map/reduce framework called Dacoop. Dacoop extends the map/reduce programming interface to specify the repeatedly processed data, introduces a shared-memory-based data cache mechanism to cache the data from its first access onward, and adopts caching-aware task scheduling so that the cached data can be shared among the map/reduce jobs of data-iterative applications. We evaluate Dacoop on two typical data-iterative applications, k-means clustering and domain rule reasoning in the semantic web, with real and synthetic datasets. Experimental results show that data-iterative applications achieve better performance on Dacoop than on Hadoop. The turnaround time of a data-iterative application can be reduced by up to 15.1%. © 2011 IEEE.
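Illustrative sketch (not part of the original record): the abstract describes caching data on first access and scheduling tasks with awareness of the cache, but gives no implementation details. The Python below is a minimal toy model of that idea under stated assumptions. The names `Node`, `schedule`, and `run_iterations` are hypothetical, the greedy placement rule is an assumption, and none of this is Dacoop's or Hadoop's actual API or algorithm.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """Hypothetical worker node with task slots and an in-memory cache."""
    name: str
    free_slots: int
    cached: set = field(default_factory=set)  # splits held in this node's cache


def schedule(splits, nodes):
    """Assign each input split to a node, preferring nodes that already cache it."""
    assignment = {}
    for split in splits:
        candidates = [n for n in nodes if n.free_slots > 0]
        if not candidates:
            raise RuntimeError("no free task slots")
        # Assumed rule: cache holders win; ties go to the node with most free slots.
        node = max(candidates, key=lambda n: (split in n.cached, n.free_slots))
        node.free_slots -= 1
        node.cached.add(split)  # the split is cached on its first access
        assignment[split] = node.name
    return assignment


def run_iterations(splits, nodes, iterations):
    """Run several jobs over the same splits and count cache hits per job."""
    hits = []
    for _ in range(iterations):
        slots = {n.name: n.free_slots for n in nodes}
        before = {n.name: set(n.cached) for n in nodes}
        assignment = schedule(splits, nodes)
        hits.append(sum(s in before[assignment[s]] for s in splits))
        for n in nodes:  # slots are released when the job finishes
            n.free_slots = slots[n.name]
    return hits


if __name__ == "__main__":
    cluster = [Node("a", 2), Node("b", 2)]
    print(run_iterations(["s1", "s2", "s3", "s4"], cluster, 3))
```

In this toy model the first job finds nothing cached, while later jobs over the same splits are placed on the nodes that cached them, which is the kind of reuse across jobs that the abstract attributes to caching-aware scheduling.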
Language: English
Content Type: Conference paper
URI: http://ir.iscas.ac.cn/handle/311060/16322
Appears in Collections: Institute of Software Library_Conference Papers

Files in This Item:

There are no files associated with this item.


Recommended Citation:
Liang Yi, Li Guangrui, Wang Lei, et al. dacoop: accelerating data-iterative applications on map/reduce cluster[C]. In: 2011 12th International Conference on Parallel and Distributed Computing, Applications and Technologies, PDCAT 2011. Gwangju, Korea, Republic of. October 20, 2011 - October 22, 2011.
Items in IR are protected by copyright, with all rights reserved, unless otherwise indicated.