Institutional Repository of the Institute of Software, Chinese Academy of Sciences
Subject: Solid Mechanics
Title: Smoothing LDA Model for Text Categorization
Authors: Li Wenbo ; Le Sun ; Yuanyong Feng ; Dakun Zhang
Proceedings: Lecture Notes in Computer Science
Conference name: TBD
Conference date: 39766
Publication date: 2008
Conference location: Harbin, China
Keywords: Text Categorization ; Latent Dirichlet Allocation ; Smoothing ; Graphical Model
Publisher: Science Press
Place of publication: Beijing
Indexed by: EI, ISTP
ISSN: 1234-5678
Abstract: Latent Dirichlet Allocation (LDA) is a document-level language model. In general, LDA employs a symmetric Dirichlet distribution as the prior of the topic-word distributions to implement model smoothing. In this paper, we propose a data-driven smoothing strategy in which probability mass is allocated from smoothing data to latent variables by the intrinsic inference procedure of LDA. In this way, the arbitrariness of choosing the latent variables' priors for the multi-level graphical model is overcome. Following this data-driven strategy, two concrete methods, Laplacian smoothing and Jelinek-Mercer smoothing, are applied to the LDA model. Evaluations on different text categorization collections show that data-driven smoothing can significantly improve performance on both balanced and unbalanced corpora.
Language: English
Content type: Conference paper
URI: http://ir.iscas.ac.cn/handle/311060/808
Appears in Collections: National Engineering Research Center of Fundamental Software_Conference Papers
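
The abstract above names two classical smoothing schemes, Laplacian and Jelinek-Mercer. As a rough illustration only, the sketch below applies the textbook form of each to a plain unigram word distribution. It is not the paper's data-driven LDA procedure, which this record does not describe in detail. The function names, the `alpha` and `lam` parameters, and the toy data are all assumptions introduced here.

```python
from collections import Counter


def laplace_smoothed(counts, vocab, alpha=1.0):
    """Add-alpha (Laplacian) smoothing: every vocabulary word gets alpha pseudo-counts."""
    total = sum(counts.get(w, 0) for w in vocab)
    denom = total + alpha * len(vocab)
    return {w: (counts.get(w, 0) + alpha) / denom for w in vocab}


def jelinek_mercer_smoothed(doc_counts, background, lam=0.5):
    """Jelinek-Mercer smoothing: linear interpolation of the document's
    maximum-likelihood estimate with a background distribution."""
    doc_total = sum(doc_counts.values())
    smoothed = {}
    for w, p_bg in background.items():
        p_ml = doc_counts.get(w, 0) / doc_total if doc_total else 0.0
        smoothed[w] = (1.0 - lam) * p_ml + lam * p_bg
    return smoothed


# Toy data (hypothetical): a three-word vocabulary and one short document.
vocab = ["a", "b", "c"]
doc = Counter(["a", "a", "b"])
background = {w: 1.0 / len(vocab) for w in vocab}  # uniform background model

p_laplace = laplace_smoothed(doc, vocab)
p_jm = jelinek_mercer_smoothed(doc, background, lam=0.5)
```

In both cases the unseen word "c" receives non-zero probability while the distribution still sums to one, which is the basic effect that any smoothing scheme aims for.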

Files in This Item:
File Name / File Size | Content Type | Version | Access | License
lwb-conf-01.pdf (389KB) | -- | -- | Restricted access (contact to request full text) | --

Recommended Citation:
Li Wenbo, Le Sun, Yuanyong Feng, et al. Smoothing LDA Model for Text Categorization[C]. In: TBD. Harbin, China. 39766.
Items in IR are protected by copyright, with all rights reserved, unless otherwise indicated.