Title: | TC-DCA: a system for text classification based on document's content allocation |
Author: | Li Wenbo
; Sun Le
; Zhang Zhenzhong
; Jiang Xue
; Zhang Weiru
|
Source: | International Conference on Information and Knowledge Management, Proceedings
|
Conference Name: | 19th International Conference on Information and Knowledge Management and Co-located Workshops, CIKM'10
|
Conference Date: | October 26, 2010
|
Issued Date: | 2010
|
Conference Place: | Toronto, ON, Canada
|
Keyword: | Knowledge management
; Learning algorithms
; Text processing
; Visualization
|
Publish Place: | United States
|
Indexed Type: | EI
|
ISBN: | 9781450000000
|
Department: | (1) Institute of Software, Chinese Academy of Sciences, 4# South Fourth Street, Zhong Guan Cun, Beijing, China
|
Sponsorship: | ACM SIGIR; ACM SIGWEB; ACM SIGKDD
|
English Abstract: | Text classification methods depend heavily on machine learning algorithms with abstract mathematical metrics, which obstruct direct observation and intuitive understanding of text-specific classification. In this paper, we model a document as a top-down Document-Classes-Topics hierarchical structure. By running the document generation procedure, we obtain each class's content share, which can be used not only to make the classification decision but also to provide a natural visualization approach for text classification. We implement this idea in a new tool named TC-DCA, which visualizes the text classification result by expressing the target document graphically as the allocation of its content across the classes. TC-DCA also supports a drill-down operation that reveals the classification effect of each word in the document. |
Content Type: | Conference paper
|
URI: | http://ir.iscas.ac.cn/handle/311060/8928
|
Appears in Collections: | National Engineering Research Center of Fundamental Software_Conference Papers
|
File Name/ File Size | Content Type | Version | Access | License |
|
p1937-li.pdf(636KB) | -- | -- | Restricted access | -- |
|
Recommended Citation: |
Li Wenbo, Sun Le, Zhang Zhenzhong, et al. TC-DCA: a system for text classification based on document's content allocation[C]. In: 19th International Conference on Information and Knowledge Management and Co-located Workshops, CIKM'10. Toronto, ON, Canada. 2010.
|