Institutional Repository of the Institute of Software, Chinese Academy of Sciences (中国科学院软件研究所机构知识库)
Title:
可伸缩视频编码中的基础算法研究 (Research on Some Fundamental Algorithms in Scalable Video Coding)
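Author: 杨志杰
Defense date: 2004
Major: Computer Software and Theory
Degree-granting institution: Institute of Software, Chinese Academy of Sciences
Place of conferral: Institute of Software, Chinese Academy of Sciences
Degree: Doctoral
Keywords: streaming video; scalable video coding; rate-distortion optimization; triangular domain; trigonometric transform
Alternative title: Research on Some Fundamental Algorithms in Scalable Video Coding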
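Abstract: With the continued development of the Internet and multimedia technologies in recent years, Internet-oriented video streaming has become a new research focus in digital video communication. Scalable video coding, whose bitstreams can be decoded at different qualities, frame rates, resolutions, and even complexities, is regarded as a promising video coding scheme for network environments. However, current scalable video coding generally has low coding efficiency and applies only to regular rectangular regions, which limits its wide adoption. The work in this thesis starts from these two problems. Building on existing advanced fine granularity scalable (FGS) video coding, the thesis proposes a rate-range-based rate-distortion optimized coder control algorithm and a generalized trigonometric transform algorithm for triangular domains. The former effectively improves the performance of scalable video coding. The latter, as a basic algorithm of digital signal processing, not only provides an effective transform for scalable video coding on irregular regions but also offers a useful reference for problems on irregular regions such as pattern recognition, graphics and image processing, and geometric modeling. The main contributions are as follows. First, a two-point rate-distortion optimized coder control algorithm for advanced FGS is proposed. The rate-distortion optimization problem in scalable video coding is modeled, a rate-range-based rate-distortion optimization model is proposed, and the model is further simplified for advanced FGS. Experiments show that inter-frame dependency cannot be neglected in the rate-distortion optimization of scalable video coding, and a method that approximates inter-frame dependency by defining an EOD function is proposed for the first time. As an example, the EOD function model for PFGS is derived, and on that basis a two-point rate-distortion optimized coder control algorithm for PFGS is obtained. Second, a joint base-layer and enhancement-layer mode selection algorithm based on two-point rate-distortion optimization is proposed for PFGS. The two-point coder control algorithm is applied to mode selection in PFGS, and three commonly used weighting strategies are discussed systematically. Experiments show that the algorithm greatly improves coding efficiency over the whole rate range and that, by choosing different weighting strategies, it can be biased flexibly toward coding efficiency in the low-rate or the high-rate range. Third, generalized trigonometric functions on the triangular domain are constructed. The DCT is the core algorithm of video coding, but no corresponding algorithm currently exists for triangular regions. By solving the Sturm-Liouville eigen-equation in barycentric coordinates, the thesis constructs generalized sine and cosine functions on the triangular domain and studies the properties of these two families of functions systematically by combining visualization with theoretical derivation. Fourth, discrete generalized trigonometric transforms on the triangular domain and corresponding fast algorithms are proposed. Based on the generalized sine and cosine functions, the discrete generalized sine transform and the discrete generalized cosine transform on the triangular domain are defined. Fast algorithms are obtained by constructing auxiliary functions and auxiliary transforms, and on that basis a Matlab-based library of generalized Fourier transform functions for irregular regions is implemented. Experiments show that the discrete generalized cosine transform decorrelates smooth triangularly sampled data well.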
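The abstract describes mode selection that weighs coding efficiency at two points of the rate range, with a weighting strategy that can favor either end. The record does not give the thesis's actual cost function, so the following is only a minimal sketch under stated assumptions: each candidate mode has a Lagrangian cost D + λR at a low-rate and a high-rate point, and the two costs are combined with a single weight `w_low`. The names `ModeCost`, `two_point_cost`, `choose_mode`, the mode labels, and all numbers are hypothetical, and the EOD term for inter-frame dependency is deliberately left out.

```python
from dataclasses import dataclass


@dataclass
class ModeCost:
    """Hypothetical measurements for one candidate mode at two rate points."""
    name: str
    d_low: float   # distortion when decoding at the low-rate point
    r_low: float   # bits spent at the low-rate point
    d_high: float  # distortion when decoding at the high-rate point
    r_high: float  # bits spent at the high-rate point


def two_point_cost(m: ModeCost, lam_low: float, lam_high: float, w_low: float) -> float:
    """Weighted sum of the Lagrangian costs at the two rate points (assumed form)."""
    j_low = m.d_low + lam_low * m.r_low
    j_high = m.d_high + lam_high * m.r_high
    return w_low * j_low + (1.0 - w_low) * j_high


def choose_mode(modes, lam_low: float, lam_high: float, w_low: float) -> ModeCost:
    """Pick the candidate with the smallest combined two-point cost."""
    return min(modes, key=lambda m: two_point_cost(m, lam_low, lam_high, w_low))


# Illustrative numbers only: one mode is better at low rate, the other at high rate.
modes = [
    ModeCost("base_ref", d_low=40.0, r_low=100.0, d_high=30.0, r_high=300.0),
    ModeCost("enh_ref", d_low=70.0, r_low=100.0, d_high=12.0, r_high=300.0),
]
low_biased = choose_mode(modes, lam_low=0.1, lam_high=0.02, w_low=0.9)
high_biased = choose_mode(modes, lam_low=0.1, lam_high=0.02, w_low=0.1)
print(low_biased.name, high_biased.name)
```

With these made-up inputs, a weight near 1 selects the mode that does better at the low-rate point and a weight near 0 selects the other, which is the kind of bias toward the low-rate or high-rate range that the abstract attributes to the weighting strategies.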
English abstract: With the rapid development of Internet and multimedia technologies, video streaming over the Internet has drawn considerable attention from researchers in digital video communication. Scalable video coding produces bitstreams decodable at different qualities, frame rates, resolutions, and even complexities, and is therefore regarded as a promising video coding scheme for the Internet. However, state-of-the-art scalable video coding suffers a large performance loss compared with non-scalable video coding and can be applied only to rectangular video formats, which restricts its wide application. This thesis is motivated by these two problems. Building on current advanced Fine Granularity Scalable (FGS) video coding, the thesis proposes a range-based rate-distortion optimized coder control algorithm together with a generalized trigonometric transform algorithm in the triangular domain. The former can significantly improve the performance of scalable video coding. The latter, as a fundamental algorithm in digital signal processing, provides an efficient candidate transform not only for scalable video coding on irregular domains but also for other applications on irregular domains, such as pattern recognition, image and graphics processing, and geometric modeling. The major contributions of this thesis are as follows. First, a two-point rate-distortion optimized coder control algorithm is proposed for advanced FGS video coding. The problem of range-based rate-distortion optimization (RDO) in scalable video coding is modeled and simplified into a two-point RDO problem for advanced FGS. By investigating frame dependency in advanced FGS, the thesis introduces the EOD function to approximate the influence of parameter selection in the current frame on the following frames. As an example, the EOD model for PFGS (Progressive FGS) is derived, and the two-point rate-distortion optimized (TP-RDO) coder control algorithm is constructed from this model. Second, a two-point rate-distortion optimized joint base-layer and enhancement-layer mode decision algorithm is proposed for PFGS. The TP-RDO coder control algorithm is applied to mode decision in PFGS, and three typical weighting strategies are discussed systematically. Experimental results show that the proposed algorithm can significantly improve coding efficiency over almost the entire rate range. By selecting different weighting strategies, the algorithm can be biased flexibly toward performance in the low-rate range or the high-rate range. Third, a set of generalized trigonometric functions in the triangular domain is constructed. The DCT is the core algorithm in many video coding schemes, but no corresponding transform algorithm exists for the triangular domain. By solving the Sturm-Liouville eigen-equation in barycentric coordinates, the generalized sine and cosine functions in the triangular domain are constructed. Properties of these functions are investigated systematically through visualization and theoretical derivation. Fourth, discrete generalized trigonometric transforms in the triangular domain and related fast algorithms are proposed. Based on the generalized sine and cosine functions in the triangular domain, the discrete generalized sine and cosine transforms are defined. Furthermore, related fast algorithms are constructed using auxiliary functions and transforms. Based on the proposed algorithms, a Matlab subroutine library is implemented. Experimental results show that the discrete generalized cosine transform is helpful for transform coding in the triangular domain.
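The claim that a cosine transform helps transform coding rests on energy compaction: for smooth data, most of the signal energy ends up in a few low-order coefficients. The record does not define the generalized cosine transform on the triangle, so the sketch below swaps in the ordinary one-dimensional orthonormal DCT-II on a line segment purely to illustrate that effect. The function names `dct2` and `energy_fraction` and the ramp signal are illustrative assumptions, not anything taken from the thesis or its Matlab library.

```python
import math


def dct2(x):
    """Orthonormal 1-D DCT-II, computed directly from its definition (O(n^2))."""
    n = len(x)
    out = []
    for k in range(n):
        s = sum(x[i] * math.cos(math.pi * (i + 0.5) * k / n) for i in range(n))
        scale = math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n)
        out.append(scale * s)
    return out


def energy_fraction(coeffs, keep):
    """Share of the total energy carried by the first `keep` coefficients."""
    total = sum(c * c for c in coeffs)
    return sum(c * c for c in coeffs[:keep]) / total


# A smooth ramp stands in for smooth sampled data (assumed input).
smooth = [10.0 + 0.5 * i for i in range(8)]
coeffs = dct2(smooth)
frac = energy_fraction(coeffs, 2)
print(f"energy in the first 2 of {len(coeffs)} coefficients: {frac:.4f}")
```

Because the transform is orthonormal, the coefficient energies sum to the signal energy, so the printed fraction directly measures how much of the signal a coder could retain by keeping only two coefficients.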
Language: Chinese
Content type: Dissertation
URI: http://ir.iscas.ac.cn/handle/311060/6676
Appears in Collections: 中科院软件所 (Institute of Software, Chinese Academy of Sciences)

Files in This Item:
File name / size: LW013928.pdf (2002KB)
Content type: not given
Version: not given
Access: restricted
License: not given
Contact the repository to obtain the full text.

Recommended Citation:
杨志杰. 可伸缩视频编码中的基础算法研究 (Research on Some Fundamental Algorithms in Scalable Video Coding) [D]. Institute of Software, Chinese Academy of Sciences. 2004-01-01.
Items in IR are protected by copyright, with all rights reserved, unless otherwise indicated.

 

 
