Research of Chinese Text Classification Methods Based on Semantic Vector and Semantic Similarity
Song Xin; Huang Jia; Zhou Jing-Min; Chen Xi
2009
Conference Name: 2009 International Forum on Computer Science-Technology and Applications, IFCSTA 2009
Source: IFCSTA 2009 Proceedings - 2009 International Forum on Computer Science-Technology and Applications
Pages: 187-190
Conference Date: 40879
Conference Place: Chongqing, China
Indexed Type: EI
Publish Place: United States
ISBN: 9780769539300
Department: (1) State Key Laboratory of Software Development Environment, Beihang University, 100191, Beijing, China; (2) Institute of Software, Chinese Academy of Sciences, 100190, Beijing, China
English Abstract: To overcome the limitations of traditional text classification approaches based on the bag-of-words representation, and to incorporate linguistic knowledge and conceptual indexing into the text vector space model, we draw on two thesauri, HowNet and Tongyici Cilin (hereinafter Cilin), and describe a document with a semantic vector instead of a traditional keyword vector. The semantic vector is built by merging words with high similarity and using a single concept, rather than a series of words, to describe each semantic feature. This not only reduces the feature dimension but also adds semantic information to the vector. We also classify text using sentence (document) similarity based on simple vector distance, and carry out three groups of experiments. The results show that accuracy generally improves with semantic treatment, which indicates that semantic mining is important and necessary for text classification. © 2009 IEEE.
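The idea summarized in the abstract can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and is not taken from the paper: the `CONCEPT_OF` table is a hypothetical stand-in for a HowNet or Cilin lookup, the tokens are English toy words rather than segmented Chinese text, cosine similarity stands in for the "simple vector distance" the abstract mentions, and nearest-neighbor labeling is only one possible way to classify by similarity.

```python
from collections import Counter
from math import sqrt

# Hypothetical word-to-concept table; a real system would derive this by
# merging words whose thesaurus-based similarity exceeds some threshold.
CONCEPT_OF = {
    "car": "vehicle", "automobile": "vehicle", "truck": "vehicle",
    "doctor": "medicine", "physician": "medicine", "hospital": "medicine",
}

def semantic_vector(tokens):
    """Count concepts instead of surface words; unmapped words stay as-is."""
    return Counter(CONCEPT_OF.get(t, t) for t in tokens)

def cosine(u, v):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(u[k] * v[k] for k in u if k in v)
    norm_u = sqrt(sum(x * x for x in u.values()))
    norm_v = sqrt(sum(x * x for x in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0

def classify(tokens, labeled_docs):
    """Return the label of the most similar labeled document."""
    vec = semantic_vector(tokens)
    best = max(labeled_docs, key=lambda d: cosine(vec, semantic_vector(d[0])))
    return best[1]

train = [
    (["car", "truck", "road"], "transport"),
    (["doctor", "hospital", "patient"], "health"),
]

# "automobile" never appears in the training text, but it shares a concept
# with "car" and "truck", so the semantic vectors still overlap.
print(classify(["automobile", "road"], train))
print(classify(["physician", "patient"], train))
```

In this toy setup three surface words collapse into one `vehicle` dimension, which mirrors the abstract's claim that merging similar words reduces the feature dimension while adding semantic information.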
Keyword: Computer Science; Information Retrieval Systems; Knowledge Representation; Semantics; Vector Spaces; Vectors
Sponsorship: IITAA - International Information Technology and Applications Association
Language: English
Content Type: Conference paper
URI: http://ir.iscas.ac.cn/handle/311060/8434
Collection: 2009 Journal/Conference Papers
Recommended Citation
GB/T 7714
Song Xin, Huang Jia, Zhou Jing-Min, et al. Research of Chinese text classification methods based on semantic vector and semantic similarity[C]. United States, 2009: 187-190.