ISCAS OpenIR
A novel kernel for text categorization
Zhang Lujiang; Hu Xiaohui
2012
Conference Name: 2012 IEEE International Conference on Computer Science and Automation Engineering, CSAE 2012
Source: CSAE 2012 - Proceedings, 2012 IEEE International Conference on Computer Science and Automation Engineering
Pages: 186-190
Conference Date: May 25, 2012 - May 27, 2012
Conference Place: Zhangjiajie, China
Indexed Type: EI
ISBN: 9781467300865
Department: (1) School of Automation Science and Electrical Engineering, Beijing University of Aeronautics and Astronautics, Beijing 100191, China; (2) Institute of Software, Chinese Academy of Sciences, Beijing 100190, China
English Abstract: In this paper we propose a novel kernel for text categorization. The kernel is an inner product in the feature space generated by all word combinations of a specified length. A word combination is a collection of different words co-occurring in the same sentence. A word combination of length k is weighted by the k-th root of the product of the inverse document frequencies (IDF) of its words. We propose a computationally simple and efficient algorithm to calculate this kernel. In experiments on the 20 Newsgroups dataset, the kernel achieves better performance than the classical word kernel and the word-sequence kernel. We also assess the impact of word combination length on performance. © 2012 IEEE.
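The kernel described in the abstract can be illustrated with a brute-force sketch. This is not the authors' efficient algorithm, which the record does not reproduce; it simply enumerates every k-word combination per sentence. Several details are assumptions made for illustration only: the function names, the naive sentence and word splitting, the unsmoothed IDF log(N / df), and the choice to use the number of sentences containing a combination as its raw feature value.

```python
from collections import Counter
from itertools import combinations
from math import log, prod
import re


def sentence_word_sets(doc):
    """Return one set of distinct lowercase words per sentence.

    Assumption: sentences end at '.', '!' or '?'; words are alphanumeric runs.
    """
    sentences = re.split(r"[.!?]+", doc.lower())
    word_sets = [set(re.findall(r"[a-z0-9]+", s)) for s in sentences]
    return [ws for ws in word_sets if ws]


def idf_table(docs):
    """Assumed IDF: log(N / df(w)), with df(w) the number of documents containing w."""
    df = Counter()
    for doc in docs:
        vocab = set()
        for ws in sentence_word_sets(doc):
            vocab |= ws
        df.update(vocab)
    n = len(docs)
    return {w: log(n / count) for w, count in df.items()}


def feature_map(doc, idf, k):
    """Map each k-word combination to its weighted feature value.

    Weight: k-th root of the product of the words' IDFs (as in the abstract).
    Raw value: number of sentences containing the combination (assumption).
    """
    counts = Counter()
    for ws in sentence_word_sets(doc):
        known = sorted(w for w in ws if w in idf)  # sorted so tuples are canonical
        counts.update(combinations(known, k))
    return {
        combo: n * prod(idf[w] for w in combo) ** (1.0 / k)
        for combo, n in counts.items()
    }


def combination_kernel(doc_a, doc_b, idf, k):
    """Inner product of the two documents' word-combination feature vectors."""
    fa = feature_map(doc_a, idf, k)
    fb = feature_map(doc_b, idf, k)
    if len(fb) < len(fa):
        fa, fb = fb, fa  # iterate over the smaller map
    return sum(value * fb[combo] for combo, value in fa.items() if combo in fb)


docs = ["the cat sat. the dog ran.", "the cat ran.", "a dog sat."]
idf = idf_table(docs)
print(combination_kernel(docs[0], docs[1], idf, 2))
print(combination_kernel(docs[0], docs[2], idf, 2))
```

In this toy corpus the first and third documents share the words "dog" and "sat", but never within the same sentence of the first document, so they share no length-2 combination and the kernel value is zero. The enumeration cost grows combinatorially with sentence length and k, which is presumably what the paper's efficient algorithm is meant to avoid.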
Keyword: Algorithms; Computer Science; Support Vector Machines
Sponsorship: IEEE Beijing Section; Hunan University of Humanities, Science and Technology; Tongji University; Xiamen University; Central South University
Language: English
Content Type: Conference paper
URI: http://ir.iscas.ac.cn/handle/311060/15762
Collection: Institute of Software, Chinese Academy of Sciences
Recommended Citation
GB/T 7714
Zhang Lujiang, Hu Xiaohui. A novel kernel for text categorization[C], 2012: 186-190.
Files in This Item:
There are no files associated with this item.
Items in the repository are protected by copyright, with all rights reserved, unless otherwise indicated.