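Title: Research on Key Technologies of Data Distribution in Cloud Computing (云计算中数据分布关键技术的研究)
Alternative title: The research of key technologies of data distribution in Cloud
Author: Chen Chao (陈超)
Major: Computer Software and Theory
Supervisor: Ding Zhiming (丁治明)
Date: 2011-06-01
Degree-granting institution: Graduate University of Chinese Academy of Sciences (中国科学院研究生院)
Degree: Master's
Place of conferral: Beijing
Keywords: load balancing; cloud computing; Date; Ddate; data distribution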
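Abstract:
   Data distribution, the problem of deciding how data are placed across multiple resource nodes, is NP-complete. It is one of the key technologies in multi-node distributed systems such as cloud computing, pervasive computing, grid computing, distributed computing, and P2P networks, and it strongly affects system performance, reliability and trustworthiness, and resource allocation. With the development of cloud computing, cloud data distribution has become an indispensable part of cloud computing and plays an important role in areas such as load balancing, energy saving, and system security.
   Existing data distribution strategies already achieve good results for load balancing over the whole time domain, and they also perform well in geographically distributed systems. However, most current cloud infrastructures are large-scale clusters whose bandwidth is far better than that of earlier geographically distributed systems, so data distribution strategies need to give less weight to the bandwidth-centered evaluation metrics used in the past. In addition, a strategy that balances load over the whole time domain does not necessarily balance data access within short time domains, and short-term imbalance is very likely to cause system bottlenecks.
   Building on a survey of data distribution strategies, and based on the loose nature of cloud data, short-term access patterns, and the infrastructure of cloud systems, this thesis proposes a mathematical model of the cloud data distribution process, proposes and implements a data distribution system for the cloud, and designs and implements centralized and distributed data distribution strategies based on time-slice evaluation. The strategies divide the whole time domain into time slices, convert the multi-objective optimization problem into a single-objective one, and adjust the data held by each resource node through feedback evaluation. They balance the load of every node within each short time domain. Repeated random simulations and cloud-environment experiments show that, compared with common data distribution strategies, the strategies based on time-slice evaluation achieve better results on three metrics: system balance over the whole period, system balance within each time slice, and the maximum per-period load peak of the system. On the data migration volume metric, the distributed strategy improves significantly on the centralized one.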
Other abstract:

   Data distribution is an NP-complete problem that concerns how to place data on resource nodes. It is one of the key technologies of multi-node distributed systems, such as cloud computing, pervasive computing, grid computing, distributed computing, and P2P networks. Data distribution significantly influences system performance, reliability, trustworthiness, and resource allocation. With the development of cloud computing, cloud data distribution has become an indispensable part of cloud computing and plays an important role in numerous areas, such as load balancing, energy conservation, and system security.


   Existing data distribution policies achieve excellent results over the whole time domain, and they perform similarly well in geographically distributed systems. Compared with earlier geographically distributed systems, however, most existing cloud infrastructures are large-scale clusters whose bandwidth has improved tremendously. As a result, data distribution policies need to give less weight to the evaluation factors that are dominated by bandwidth. Additionally, a data distribution policy that is balanced over the whole time domain may still have an unbalanced load in each shorter time domain, and such an unbalanced load may become a system bottleneck.

   After studying various data distribution policies, and in accordance with the loose characteristics of cloud data, the pattern of data access in short time domains, and the infrastructure of the cloud, a mathematical model of the data distribution process is proposed. A cloud data distribution system, together with centralized and distributed data distribution policies based on time sequence evaluation, is designed and implemented. These policies split the whole time domain into time sequences, transform the multi-objective optimization problem into a single-objective problem, and use feedback from the evaluation results to adjust the datasets of the nodes. These policies balance the load of every node in each short time domain. According to many random simulations and cloud environment experiments, the data distribution policy based on time sequence evaluation achieves better results than other common data distribution policies on three metrics: system balance over the entire time domain, system balance in each short time domain, and the maximum peak of accesses to resource nodes in the entire time domain. Compared with the centralized policy, the distributed policy is significantly better on the metric of the amount of data adjusted (migrated).
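
   The abstract does not give the thesis's mathematical model or its algorithms, so the following is only a minimal illustrative sketch of the general idea it describes: evaluate load per time slice, fold several goals into one score, and use that score as feedback when moving data between nodes. The function names (`node_loads`, `score`, `rebalance`), the weights, the max-minus-min spread measure, and the greedy one-item-at-a-time search are all assumptions made for illustration, not the author's method.

```python
from typing import Dict, List, Tuple

# Hypothetical input: item name -> number of accesses in each time slice.
Access = Dict[str, List[float]]


def node_loads(placement: Dict[str, int], access: Access,
               n_nodes: int, n_slices: int) -> List[List[float]]:
    """Return loads[t][n]: total accesses hitting node n during slice t."""
    loads = [[0.0] * n_nodes for _ in range(n_slices)]
    for item, node in placement.items():
        for t, hits in enumerate(access[item]):
            loads[t][node] += hits
    return loads


def score(loads: List[List[float]], w_spread: float = 1.0,
          w_peak: float = 1.0) -> float:
    """Single objective (assumed form): weighted sum of the mean per-slice
    spread (max load minus min load) and the highest load in any slice."""
    spread = sum(max(row) - min(row) for row in loads) / len(loads)
    peak = max(max(row) for row in loads)
    return w_spread * spread + w_peak * peak


def rebalance(placement: Dict[str, int], access: Access, n_nodes: int,
              n_slices: int, max_rounds: int = 100) -> Tuple[Dict[str, int], float]:
    """Greedy feedback loop: try each item on each node and keep a move
    only if it lowers the score; stop when a full round changes nothing."""
    placement = dict(placement)  # do not mutate the caller's mapping
    best = score(node_loads(placement, access, n_nodes, n_slices))
    for _ in range(max_rounds):
        improved = False
        for item in list(placement):
            home = placement[item]
            for node in range(n_nodes):
                if node == home:
                    continue
                placement[item] = node
                s = score(node_loads(placement, access, n_nodes, n_slices))
                if s < best:
                    best, home, improved = s, node, True
            placement[item] = home  # best node found for this item
        if not improved:
            break
    return placement, best


# Two nodes, two slices. Each node serves 20 accesses in total, so the
# placement looks balanced over the whole period, yet every slice is lopsided.
access = {"a": [10, 0], "b": [10, 0], "c": [0, 10], "d": [0, 10]}
start = {"a": 0, "b": 0, "c": 1, "d": 1}
final, final_score = rebalance(start, access, n_nodes=2, n_slices=2)
print(node_loads(start, access, 2, 2), "->", node_loads(final, access, 2, 2))
```

   In this toy input the starting placement is balanced over the whole period but not within either slice, which is the situation the abstract warns about. A centralized search like this one re-evaluates the whole system for every candidate move; it says nothing about how the thesis's distributed policy reduces the amount of migrated data.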

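Projects: Spatio-temporal statistical analysis, data mining, and traffic-sensitive navigation technology for moving objects based on temporal traffic networks; Research and development of cloud storage and cloud retrieval systems
Subject area: Other disciplines of basic computer science and technology; Other disciplines of computer software
Language: Chinese
Content type: Thesis
URI: http://ir.iscas.ac.cn/handle/311060/10433
Collection: National Engineering Research Center of Fundamental Software (基础软件国家工程研究中心)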
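Recommended citation
GB/T 7714
Chen Chao (陈超). Research on Key Technologies of Data Distribution in Cloud Computing (云计算中数据分布关键技术的研究) [D]. Beijing: Graduate University of Chinese Academy of Sciences, 2011.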
Files in this item
File name/size: 大论文-陈超.pdf (1250KB)
Access type: Open access