Institutional Repository of the Institute of Software, Chinese Academy of Sciences
Title:
面向字处理的多语言计算模型的研究 (Research on the Multilingual Computing Model for Text Processing)
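Author: 贾彦民 (Jia Yanmin)
Defense date: 2007-01-17
Degree-granting institution: Institute of Software, Chinese Academy of Sciences
Place of conferral: Institute of Software
Degree: Doctoral
Keywords: multilingual computing; text processing; internationalization; complex scripts; text layout direction; OpenOffice.org
Alternative title: Research on the Multilingual Computing Model for Text Processing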
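Abstract: Based on practical experience developing multilingual word-processing software, this dissertation studies the concepts, scope, models, and system architecture of multilingual computing, with a focus on two key parts: complex script processing and multilingual text layout. It reports five main results. First, the dissertation surveys the current state of multilingual support in computer software from five perspectives: character encoding, operating systems, word-processing applications, the World Wide Web, and programming languages. The main limitations of multilingual computing today are that the multilingual support provided by system software covers only the most basic text-processing functions, such as character input, display, and encoding conversion, and that word-processing applications mostly achieve multilingual processing through an internationalization model and lack a unified mechanism for multilingual support. Second, addressing the diversity and particularity of different languages in computer processing, the dissertation formally proposes an adaptive model of multilingual computing. The model reveals the essential characteristics of multilingual computing from both static and dynamic aspects, and the multilingual word-processing software architecture designed from it is adaptive, extensible, and configurable. Third, building on the existing document formatting model, the dissertation proposes a document processing model that supports multilingual text layout directions. The model encapsulates the handling of text layout direction in the document formatting module, reduces the problem of multiple text layout directions to the problem of formatting a document whose layout direction is horizontal left-to-right, and provides a recursive algorithm for formatting documents with multiple text layout directions. The model supports text layout for scripts with various writing directions, including China's minority scripts Mongolian, Uighur, and Tibetan. Fourth, complex scripts exhibit highly intricate linguistic features during display output. The dissertation therefore proposes a complex script processing model based on predicate rules. The model uses predicate rules to describe the glyph layout features of complex scripts formally. Following the stages of complex script processing, a software architecture implementing the model is designed that isolates the linguistic features of complex scripts from program control logic, which improves the flexibility of the system and makes it easier to add support for new complex scripts. Its use in developing office suites for Mongolian, Tibetan, and Uighur shows that the model is practical and effective. Fifth, based on these models, office suites for Mongolian, Tibetan, and Uighur were successfully developed on top of the open-source project OpenOffice.org. They correctly display text in these three scripts encoded with the Unicode small character set and support several features of the scripts, such as the vertical left-to-right writing direction of Mongolian, automatic line breaking in Tibetan, and bidirectional display of Uighur. Finally, the models proposed in the dissertation have been applied and validated in the development of word-processing software with multilingual support.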
English abstract: Drawing on practical experience in developing multilingual text processing software, this dissertation investigates the concepts, scope, and system framework of multilingual computing, with emphasis on two key aspects: complex script handling and multilingual text layout. Five principal results are obtained. First, the state of the art of multilingual support in computer software is surveyed from five perspectives: character encoding, operating systems, text processing, the World Wide Web, and programming languages. The main limitations identified are that the text processing features provided by system software are only the most rudimentary functions, such as character input, display, and transcoding, and that text processing programs rely on an I18N model to implement multilingual support because there is no unified multilingual support mechanism. Second, addressing the diversity and particularity of processing text in different languages, an adaptive, formalized multilingual computing model is put forward. The model illuminates the essential features of multilingual computing from static and dynamic views. The system structure of multilingual text processing software based on this model is adaptive, extensible, and configurable. Third, based on the existing document formatting model, a document processing model supporting multilingual text layout directions is put forward. In this model, the handling of text layout direction is encapsulated in the document formatting module, the problem of text layout in multiple directions is reduced to the problem of horizontal left-to-right text layout, and a recursive document formatting algorithm for multi-directional text layout is designed. The model supports the different text layout directions of various scripts, including Mongolian, Tibetan, and Uighur. Fourth, complex scripts exhibit very sophisticated language features when displayed and printed. A complex script processing model based on predicate rules is proposed, in which the glyph layout features of complex scripts are formalized by predicate rules. Following the steps of complex script processing, a software system framework implementing the model is designed. Separating the language features of complex scripts from the program control logic improves the flexibility of the system and makes it convenient to add support for new complex scripts. The development of an office suite for Mongolian, Tibetan, and Uighur has shown that the model is useful and effective. Fifth, based on the above models, an office suite supporting Tibetan, Mongolian, and Uighur based on OpenOffice.org has been developed successfully. Tibetan, Mongolian, and Uighur Unicode text is displayed correctly in the system. Many features of these three languages were implemented in OpenOffice.org 1.1.2, such as the special writing direction of Mongolian (vertical, left to right), the line breaking behavior of Tibetan, and the bidirectionality of Uighur. Finally, the models in this dissertation have been applied and validated in the practical development of multilingual software.
Language: Chinese
Content type: Dissertation
URI: http://ir.iscas.ac.cn/handle/311060/6148
Appears in Collections: Institute of Software, Chinese Academy of Sciences

Files in This Item:
File Name / File Size | Content Type | Version | Access | License
10001_200318015000997贾彦民_paper.pdf (2612KB) | -- | -- | Restricted access | --

Recommended Citation:
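贾彦民 (Jia Yanmin). 面向字处理的多语言计算模型的研究 (Research on the Multilingual Computing Model for Text Processing) [D]. Institute of Software. Institute of Software, Chinese Academy of Sciences. 2007-01-17.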