Hippo: An enhancement of pipeline-aware in-memory caching for HDFS

ISCAS OpenIR

	Wei, Lan (1); Lian, Wenbo (1); Liu, Kuien (1); Wang, Yongji (1)
	2014
Conference name	2014 23rd International Conference on Computer Communication and Networks, ICCCN 2014
Conference date	August 4, 2014 - August 7, 2014
Conference location	Shanghai, China
Indexed by	EI
Place of publication	Institute of Electrical and Electronics Engineers Inc.
ISSN	10952055
ISBN	9781479935727
Department affiliation	(1) Institute of Software, Chinese Academy of Sciences, China
Abstract	In the age of big data, distributed computing frameworks tend to coexist and collaborate in pipelines under a single scheduler. While a variety of techniques for reducing I/O latency have been proposed, few target the performance of the pipeline as a whole. This paper proposes memory management logic called 'Hippo', which targets distributed systems and, in particular, 'pipelined' applications that may span different big data frameworks. Although individual frameworks may have internal memory management primitives, Hippo aims to provide a generic framework that works independently of these high-level operations. To increase the hit ratio of the in-memory cache, this paper discusses the granularity of caching and how Hippo leverages the job dependency graph to make memory retention and pre-fetching decisions. Our evaluations demonstrate that job dependency information is essential to improving cache performance and that a global cache policy maker, in most cases, significantly outperforms explicit caching by users.
Language	English
Content type	Conference paper
URI	http://ir.iscas.ac.cn/handle/311060/16610
Collection	Institute of Software, Chinese Academy of Sciences
Corresponding author	Wei, Lan
Recommended citation (GB/T 7714)	Wei, Lan; Lian, Wenbo; Liu, Kuien; et al. Hippo: An enhancement of pipeline-aware in-memory caching for HDFS[C]. Institute of Electrical and Electronics Engineers Inc., 2014.