ISCAS OpenIR
A novel kernel for text categorization
Zhang Lujiang; Hu Xiaohui
2012
Conference name: 2012 IEEE International Conference on Computer Science and Automation Engineering, CSAE 2012
Proceedings title: CSAE 2012 - Proceedings, 2012 IEEE International Conference on Computer Science and Automation Engineering
Pages: 186-190
Conference date: May 25, 2012 - May 27, 2012
Conference location: Zhangjiajie, China
Indexed by: EI
ISBN: 9781467300865
Affiliation: (1) School of Automation Science and Electrical Engineering, Beijing University of Aeronautics and Astronautics, Beijing 100191, China; (2) Institute of Software, Chinese Academy of Sciences, Beijing 100190, China
Abstract: In this paper we propose a novel kernel for text categorization. This kernel is an inner product in the feature space generated by all word combinations of a specified length. A word combination is a collection of different words co-occurring in the same sentence. A word combination of length k is weighted by the k-th root of the product of the inverse document frequencies (IDF) of its words. A computationally simple and efficient algorithm is proposed to calculate this kernel. We conducted experiments on the 20 Newsgroups dataset. This kernel achieves better performance than the classical word kernel and word-sequence kernel. We also assessed the impact of word combination length on performance. © 2012 IEEE.
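The kernel described in the abstract can be illustrated with a small brute-force sketch. It is not the efficient algorithm from the paper, which this record does not describe; it simply enumerates every word combination explicitly. Several details are assumptions made for illustration: the function names (`idf_table`, `feature_map`, `kernel`), the IDF form log(N/df), and the choice to let each sentence containing a combination add that combination's weight to the feature value.

```python
import math
from collections import Counter
from itertools import combinations


def idf_table(docs):
    """Compute IDF per word. Each document is a list of sentences,
    each sentence a list of words. Assumed form: log(N / df)."""
    n_docs = len(docs)
    doc_freq = Counter()
    for doc in docs:
        # Count each word once per document.
        doc_freq.update({w for sent in doc for w in sent})
    return {w: math.log(n_docs / df) for w, df in doc_freq.items()}


def feature_map(doc, idf, k):
    """Map a document to weighted word-combination features of length k."""
    feats = Counter()
    for sent in doc:
        # A combination is a set of *different* words from one sentence;
        # sorting gives each combination a canonical tuple key.
        words = sorted(set(sent))
        for combo in combinations(words, k):
            # Weight: k-th root of the product of the words' IDF values.
            weight = math.prod(idf.get(w, 0.0) for w in combo) ** (1.0 / k)
            # Assumption: every sentence containing the combination
            # contributes its weight once.
            feats[combo] += weight
    return feats


def kernel(doc_a, doc_b, idf, k):
    """Inner product of the two documents' feature vectors."""
    fa = feature_map(doc_a, idf, k)
    fb = feature_map(doc_b, idf, k)
    if len(fa) > len(fb):
        fa, fb = fb, fa  # iterate over the smaller feature set
    return sum(v * fb[c] for c, v in fa.items() if c in fb)
```

Explicit enumeration grows combinatorially with sentence length and k, so this sketch is only practical for short sentences and small k; it is meant to make the feature space concrete, not to be used at scale.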
Keywords: Algorithms; Computer Science; Support Vector Machines
Sponsors: IEEE Beijing Section; Hunan University of Humanities, Science and Technology; Tongji University; Xiamen University; Central South University
Language: English
Content type: Conference paper
URI: http://ir.iscas.ac.cn/handle/311060/15762
Collection: Institute of Software, Chinese Academy of Sciences
Recommended citation
GB/T 7714
Zhang Lujiang, Hu Xiaohui. A novel kernel for text categorization[C], 2012: 186-190.
Files in this item
There are no files associated with this item.