ISCAS OpenIR
Automatic Acquisition of Chinese-Tibetan Multi-word Equivalent Pair from Bilingual Corpora
Nuo Minghua; Liu Huidan; Ma Longlong; Wu Jian; Ding Zhiming
2011
Conference Name: 2011 International Conference on Asian Language Processing, IALP 2011
Source: Proceedings - 2011 International Conference on Asian Language Processing, IALP 2011
Pages: 177-180
Conference Date: November 1
Conference Place: Penang, Malaysia
Indexed Type: EI
ISBN: 9780769545547
Department: (1) Institute of Software, Chinese Academy of Sciences, Beijing, China
English Abstract: This paper aims to construct a Chinese-Tibetan multi-word equivalent pair dictionary for a Chinese-Tibetan computer-aided translation system. Since Tibetan is a morphologically rich language, we propose a two-phase framework to automatically extract multi-word equivalent pairs. The first phase extracts Chinese Multi-word Units (MWUs): we propose the CBEM model, which partitions a Chinese sentence into MWUs using two measures, collocation and binding degree. The second phase obtains Tibetan translations of the extracted Chinese MWUs: we propose the TSIM model, which focuses on extracting 1-to-n bilingual MWUs. Preliminary experimental results show that the mixed method combining the CBEM model with the TSIM model is effective. © 2011 IEEE.
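Illustrative note: the abstract does not define the CBEM model or its two measures, so the following is only a minimal sketch of the general idea behind the first phase, partitioning a pre-segmented sentence into multi-word units with an association score. It assumes pointwise mutual information (PMI) as a stand-in for the paper's measures; the names `pmi_table`, `partition`, and `threshold` are hypothetical and do not come from the paper.

```python
import math
from collections import Counter


def pmi_table(sentences):
    """Estimate PMI for adjacent word pairs from pre-segmented sentences.

    PMI is an assumed stand-in for the paper's collocation and binding
    degree measures, which the abstract does not define.
    """
    unigrams, bigrams = Counter(), Counter()
    for words in sentences:
        unigrams.update(words)
        bigrams.update(zip(words, words[1:]))
    n_uni = sum(unigrams.values())
    n_bi = sum(bigrams.values())
    pmi = {}
    for (a, b), count in bigrams.items():
        p_ab = count / n_bi
        p_a = unigrams[a] / n_uni
        p_b = unigrams[b] / n_uni
        pmi[(a, b)] = math.log(p_ab / (p_a * p_b))
    return pmi


def partition(words, pmi, threshold):
    """Merge adjacent words whose PMI exceeds the (hypothetical) threshold."""
    if not words:
        return []
    units = [[words[0]]]
    for prev, cur in zip(words, words[1:]):
        # Unseen pairs score -inf, so they always start a new unit.
        if pmi.get((prev, cur), float("-inf")) > threshold:
            units[-1].append(cur)
        else:
            units.append([cur])
    return units
```

In this sketch, a higher `threshold` yields shorter units, and word pairs never observed in the corpus are never merged. The second phase (finding Tibetan translations for each unit) is not sketched here because the abstract gives no detail on how TSIM scores candidates.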
Keyword: Natural Language Processing Systems
Language: English
Content Type: Conference paper
URI: http://ir.iscas.ac.cn/handle/311060/16257
Collection: Institute of Software, Chinese Academy of Sciences
Recommended Citation
GB/T 7714
Nuo Minghua, Liu Huidan, Ma Longlong, et al. Automatic acquisition of Chinese-Tibetan multi-word equivalent pair from bilingual corpora[C], 2011: 177-180.
Items in the repository are protected by copyright, with all rights reserved, unless otherwise indicated.