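Title: | 简繁体汉字自动转换系统的设计与实现 (Design and Implementation of an Automatic Conversion System between Simplified and Traditional Chinese Characters) |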
Author: | 辛春生
|
Issued Date: | 1997
|
Major: | Computer Software
|
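Degree Grantor: | Institute of Software, Chinese Academy of Sciences (中国科学院软件研究所)
|
Place of Degree Grantor: | Institute of Software, Chinese Academy of Sciences (中国科学院软件研究所)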
|
Degree Level: | Doctoral
|
Keyword: | Chinese characters
; conversion system
; system design
|
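Abstract: | This thesis first analyzes the design goals of a Simplified-Traditional Chinese conversion system, then briefly introduces the interchange code sets the system supports, GB 2312 and TCA-CNS 11643, together with their respective internal codes (the two-byte form with the high bit of each byte set to 1, and Big5). It next introduces the basic principles and methods of word segmentation, which the system relies on, and the ambiguity problems that accompany segmentation. With this background in place, it discusses the system's design and implementation in detail, along with some problems encountered during implementation. After preprocessing the input corpus, the system segments it into words and applies disambiguation rules; ambiguities the rules cannot resolve are handled through contextual syntactic and semantic analysis; ambiguities that survive both steps are handled using word-frequency statistics. This removes the vast majority of ambiguities and yields a single chain of segmented words, on which word-level conversion is performed. For words whose conversion is uncertain, the system uses contextual knowledge obtained from syntactic and semantic analysis to eliminate the uncertainty and convert correctly. In addition, the system automatically and continually adjusts its lexicon for different application domains, adding new words and updating word frequencies, to improve its adaptability. The system handles not only common words in the corpus but also general and domain-specific terminology, achieving correct conversion at the word level. |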
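The relationship between a GB 2312 interchange position and its internal code, mentioned in the abstract, can be illustrated with a short Python sketch. It is not taken from the thesis: the function name `gb2312_internal_code` is hypothetical, and the check relies on Python's built-in `gb2312` and `big5` codecs. The sketch maps a row/cell position (each in 1..94) to the two-byte internal form by setting the high bit of each byte.

```python
def gb2312_internal_code(row: int, col: int) -> bytes:
    """Map a GB 2312 row/cell position (each 1..94) to its two-byte internal code.

    Hypothetical helper for illustration only.
    """
    if not (1 <= row <= 94 and 1 <= col <= 94):
        raise ValueError("row and col must both be in 1..94")
    # The 7-bit interchange form adds 0x20 to each coordinate (0x21..0x7E);
    # the internal form sets the high bit of each byte (0xA1..0xFE).
    return bytes([(row + 0x20) | 0x80, (col + 0x20) | 0x80])


# Row 16, cell 1 is the first Chinese character in GB 2312.
code = gb2312_internal_code(16, 1)
print(code.hex())             # b0a1
print(code.decode("gb2312"))  # 啊

# Big5 is a separate two-byte internal code for Traditional Chinese.
# Its lead byte has the high bit set; its trail byte need not.
big5_bytes = "漢".encode("big5")
print(len(big5_bytes), bool(big5_bytes[0] & 0x80))
```

Because every byte of the GB 2312 internal form has its high bit set, such text can be mixed with 7-bit ASCII without ambiguity.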
English Abstract: | This dissertation first analyzes the design goals of an automatic converter between Traditional Chinese and Simplified Chinese and introduces the code standards the system supports, GB 2312 and TCA-CNS 11643, along with their internal codes. It then briefly introduces the principles of word segmentation and the problem of segmentation ambiguity. Next, it gives a detailed discussion of the design and implementation of the system and of some problems encountered during implementation. The system first pre-processes the input and segments it into words. It then resolves segmentation ambiguities in three steps: first with a rule-based approach, second with sentence context obtained from grammatical and semantic analysis, and third with a statistical approach. After that, the system converts the text word by word. For words whose conversion is uncertain, the system uses the sentence context to resolve the uncertainty. The system can convert not only general words but also general and domain-specific terms. Furthermore, it can adapt itself to domain-specific applications by modifying its own dictionary and rules. |
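The flow the abstract describes (segment, disambiguate, then convert at word level) can be sketched in Python as below. This is a minimal illustration under stated assumptions, not the thesis's implementation: the lexicon, frequencies, and character table are invented toy data, the function names are hypothetical, and forward/backward maximum matching with a frequency tie-break stands in for the rule-based, contextual, and statistical steps.

```python
# Hypothetical toy lexicon: simplified word -> (traditional form, relative frequency).
LEXICON = {
    "头发": ("頭髮", 50),
    "发展": ("發展", 90),
    "经济": ("經濟", 80),
}
# Hypothetical character-level fallback table (one default mapping per character).
CHAR_MAP = {"头": "頭", "发": "發", "经": "經", "济": "濟"}
MAX_WORD_LEN = max(len(w) for w in LEXICON)


def forward_max_match(text):
    """Greedy left-to-right segmentation, longest lexicon word first."""
    words, i = [], 0
    while i < len(text):
        for n in range(min(MAX_WORD_LEN, len(text) - i), 0, -1):
            piece = text[i:i + n]
            if n == 1 or piece in LEXICON:
                words.append(piece)
                i += n
                break
    return words


def backward_max_match(text):
    """Greedy right-to-left segmentation, longest lexicon word first."""
    words, j = [], len(text)
    while j > 0:
        for n in range(min(MAX_WORD_LEN, j), 0, -1):
            piece = text[j - n:j]
            if n == 1 or piece in LEXICON:
                words.append(piece)
                j -= n
                break
    return words[::-1]


def segment(text):
    """If the two passes disagree, keep the one with the higher total word frequency."""
    fwd, bwd = forward_max_match(text), backward_max_match(text)
    if fwd == bwd:
        return fwd
    score = lambda ws: sum(LEXICON[w][1] for w in ws if w in LEXICON)
    return max((fwd, bwd), key=score)


def convert(text):
    """Convert known words as units; fall back to per-character mapping otherwise."""
    out = []
    for word in segment(text):
        if word in LEXICON:
            out.append(LEXICON[word][0])
        else:
            out.append("".join(CHAR_MAP.get(ch, ch) for ch in word))
    return "".join(out)


print(convert("头发"))       # 頭髮
print(convert("经济发展"))   # 經濟發展
print(segment("头发展"))     # ['头', '发展']
```

The first example shows why conversion operates on words: the character 发 maps to 發 in 发展 but to 髮 in 头发, so a purely per-character table would yield 頭發.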
Language: | Chinese
|
Content Type: | Thesis
|
URI: | http://ir.iscas.ac.cn/handle/311060/6060
|
Appears in Collections: | Institute of Software, Chinese Academy of Sciences
|
File Name/ File Size | Content Type | Version | Access | License |
|
N98808.pdf (2135KB) | -- | -- | Restricted access | -- | Contact to obtain full text |
|
Recommended Citation: |
辛春生. 简繁体汉字自动转换系统的设计与实现[D]. 中国科学院软件研究所. 中国科学院软件研究所. 1997-01-01.
|