Institutional Repository of the Institute of Software, Chinese Academy of Sciences
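Title:
综合递归分块技术及其在LAPACK中的应用
Author: 蒋孟奇
Defense date: 2007-06-06
Degree-granting institution: Institute of Software, Chinese Academy of Sciences
Place of conferral: Institute of Software
Degree: Doctoral
Keywords: matrix computation; hierarchical memory; computation reordering; recursive blocking; nonlinear data structures; integrated recursive blocking technology
Alternative title: Integrated Recursive Blocking Technology and Its Application in LAPACK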
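Abstract (translated from the Chinese): Matrix computations are widely used in scientific computing and many other fields. LAPACK is a linear algebra library that implements most of the operations in linear algebra computation, covering matrix factorizations, the solution of linear systems, least-squares problems, and matrix eigenvalue problems; it is one of the most important libraries for matrix computation. As computer architectures have evolved, and especially since the advent of hierarchical memory, the classical matrix algorithms and traditional data structures in LAPACK no longer suit the needs of new hardware well. To address this, BLAS, on which the performance of LAPACK rests, gives prominence to fast memory such as the cache and the TLB, with GOTOBLAS as a representative implementation. Although the performance of high-speed BLAS libraries is satisfactory, they are usually optimized for specific platforms; the optimization techniques lack generality, and the optimized programs lack portability. This thesis first analyzes in depth the implementation mechanism of the GOTOBLAS library, especially its GEMM part, and uncovers the regularities and commonalities in it. Then, by analyzing the effect of the memory hierarchy on matrix computation and drawing on recent research results in China and abroad, it proposes a concept for matrix computation: the theory of computation reordering. On this basis, by comparing the various blocking algorithms and data structures for matrix computation, it proposes a memory-hierarchy-oriented method for solving matrix computation problems, the integrated recursive blocking method, and applies it to Cholesky factorization. Comparative experiments show that the method both improves execution efficiency and saves storage space. Finally, the thesis argues that the effect of the memory hierarchy on program performance needs to be raised to the level of the computational model.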
English abstract: Matrix computations are widely used in numerical computing and other areas. LAPACK is a numerical linear algebra library that solves the most commonly occurring problems in the field, such as matrix factorizations, linear systems, linear least-squares problems, and eigenvalue problems. With the development of computer architecture, especially the emergence of hierarchical memory, the classical algorithms and data structures in LAPACK have not adapted to these changes. To deal with this, BLAS, which is critical to the performance of LAPACK, takes the impact of cache and TLB into account; GOTOBLAS is a representative high-performance BLAS. This thesis analyzes the implementation mechanism of the GOTOBLAS library, especially its GEMM routines, and shows how it achieves high performance. It then analyzes the impact of hierarchical memory on the performance of matrix computations and proposes the concept of computation reordering together with a hierarchical-memory-oriented method, Integrated Recursive Blocking Technology, which is applied to Cholesky decomposition. The experimental results show that this method not only improves performance but also saves storage space. Both the theoretical analysis and the experimental results show that it is necessary to incorporate the impact of hierarchical memory into computational models.
Language: Chinese
Content type: Thesis
URI: http://ir.iscas.ac.cn/handle/311060/6086
Appears in Collections: Institute of Software, Chinese Academy of Sciences

Files in This Item:
File name: 10001_200428015029053蒋孟奇_paper.doc
File size: 1250KB
Content type: --
Version: --
Access: restricted (full text available on request)
License: --

Recommended Citation:
蒋孟奇. 综合递归分块技术及其在LAPACK中的应用 [D]. Institute of Software. Institute of Software, Chinese Academy of Sciences. 2007-06-06.
Items in IR are protected by copyright, with all rights reserved, unless otherwise indicated.

 

 

Copyright © 2007-2017 Institute of Software, Chinese Academy of Sciences