Institutional Repository of the Institute of Software, Chinese Academy of Sciences
Subject: Computer Science (provided by Thomson Reuters)
Title:
基于关联度的汉藏多词单元等价对抽取方法
Alternative Title: Collocation-Based Chinese-Tibetan Multi-Word Equivalent Pair Extraction
Author: 诺明花 ; 刘汇丹 ; 吴健 ; 丁治明
Keyword: Tibetan information processing ; multi-word units ; collocation
Source: 中文信息学报 (Journal of Chinese Information Processing)
Issued Date: 2012
Volume: 26, Issue: 3, Pages: 98-103
Department: Institute of Software, Chinese Academy of Sciences; Graduate University of Chinese Academy of Sciences
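Abstract: To build a Chinese-Tibetan multi-word unit translation dictionary for a Chinese-Tibetan machine-aided translation system, this paper proposes the CMWEPM model. The model first determines the boundaries of multi-word units in the Chinese corpus according to collocation (association) degree and binding degree. It then uses word alignment information to extract strict and constrained multi-word unit equivalent pairs separately, which together form the Chinese-Tibetan multi-word unit equivalent pairs. CMWEPM classifies multi-word units by length and frequency and sets a different threshold for each type. This improves the recall of Chinese-Tibetan multi-word unit equivalent pairs and can thereby indirectly improve the translation quality of the Chinese-Tibetan machine-aided translation system.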
English Abstract: This paper aims to construct a Chinese-Tibetan multi-word equivalence dictionary for a machine-aided translation system. It proposes the CMWEPM model, which extracts multi-word equivalences in two phases. First, CMWEPM defines the boundaries of Chinese multi-word units by collocation and binding degree. Then it extracts strict and constrained multi-word equivalences, respectively, based on word alignments. The CMWEPM model classifies multi-word units according to their length and frequency and sets different thresholds for different types. This strategy yields higher recall of multi-word equivalent pairs, which play a significant role in a Chinese-Tibetan machine-aided translation system, and can thereby improve translation quality.
Language: Chinese
Content Type: Journal article
URI: http://ir.iscas.ac.cn/handle/311060/14671
Appears in Collections: National Engineering Research Center of Fundamental Software_Journal Articles

Files in This Item:
File Name/ File Size Content Type Version Access License
基于关联度的汉藏多词单元等价对抽取方法.pdf (366KB) -- -- Restricted access (contact to obtain full text)

Recommended Citation:
诺明花,刘汇丹,吴健,等. 基于关联度的汉藏多词单元等价对抽取方法[J]. 中文信息学报,2012-01-01,26(3):98-103.

Items in IR are protected by copyright, with all rights reserved, unless otherwise indicated.

 

 

Copyright © 2007-2019 Institute of Software, Chinese Academy of Sciences