ISCAS OpenIR
Automatic Acquisition of Chinese-Tibetan Multi-word Equivalent Pair from Bilingual Corpora
Nuo Minghua; Liu Huidan; Ma Longlong; Wu Jian; Ding Zhiming
2011
Conference name: 2011 International Conference on Asian Language Processing, IALP 2011
Proceedings title: Proceedings - 2011 International Conference on Asian Language Processing, IALP 2011
Pages: 177-180
Conference date: November 1
Conference location: Penang, Malaysia
Indexed by: EI
ISBN: 9780769545547
Affiliation: (1) Institute of Software, Chinese Academy of Sciences, Beijing, China
Abstract: This paper aims to construct a Chinese-Tibetan multi-word equivalent pair dictionary for a Chinese-Tibetan computer-aided translation system. Since Tibetan is a morphologically rich language, we propose a two-phase framework to automatically extract multi-word equivalent pairs. The first phase extracts Chinese Multi-word Units (MWUs): we propose the CBEM model, which partitions a Chinese sentence into MWUs using two measures, collocation and binding degree. The second phase obtains Tibetan translations of the extracted Chinese MWUs: we propose the TSIM model, which focuses on extracting 1-to-n bilingual MWUs. Preliminary experimental results show that the mixed method combining the CBEM model with the TSIM model is effective. © 2011 IEEE.
Keywords: Natural Language Processing Systems
Language: English
Content type: Conference paper
URI: http://ir.iscas.ac.cn/handle/311060/16257
Collection: Institute of Software, Chinese Academy of Sciences
Recommended citation
GB/T 7714
Nuo Minghua, Liu Huidan, Ma Longlong, et al. Automatic acquisition of Chinese-Tibetan multi-word equivalent pair from bilingual corpora[C], 2011: 177-180.