Institutional Repository of the Institute of Software, Chinese Academy of Sciences
Subject: Solid Mechanics
Title: Smoothing LDA Model for Text Categorization
Author: Li Wenbo ; Le Sun ; Yuanyong Feng ; Dakun Zhang
Source: Lecture Notes in Computer Science
Conference Name: To be determined
Conference Date: 39766
Issued Date: 2008
Conference Place: Harbin, China
Keyword: Text Categorization ; Latent Dirichlet Allocation ; Smoothing ; Graphical Model
Publisher: Science Press
Publish Place: Beijing
Indexed Type: EI,ISTP
ISSN: 1234-5678
Abstract: Latent Dirichlet Allocation (LDA) is a document-level language model. In general, LDA employs a symmetric Dirichlet distribution as the prior of the topic-word distributions to implement model smoothing. In this paper, we propose a data-driven smoothing strategy in which probability mass is allocated from smoothing data to latent variables by the intrinsic inference procedure of LDA. In this way, the arbitrariness of choosing latent variables' priors for the multi-level graphical model is overcome. Following this data-driven strategy, two concrete methods, Laplacian smoothing and Jelinek-Mercer smoothing, are applied to the LDA model. Evaluations on different text categorization collections show that data-driven smoothing can significantly improve performance on both balanced and unbalanced corpora.
Language: English
Content Type: Conference paper
URI: http://ir.iscas.ac.cn/handle/311060/808
Appears in Collections: National Engineering Research Center of Fundamental Software_Conference Papers
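
For readers unfamiliar with the two smoothing methods named in the abstract, the following is a minimal sketch of their textbook forms applied to a plain unigram word distribution. It is not the paper's data-driven LDA procedure, which this record does not describe in detail. The function names, the toy document, the uniform background model, and the parameter values `alpha` and `lam` are all assumptions chosen for illustration.

```python
from collections import Counter


def laplace_smoothing(counts, vocab, alpha=1.0):
    """Add-alpha (Laplace) estimate of p(w) over a fixed vocabulary.

    Hypothetical helper: every word in `vocab` receives `alpha` pseudo-counts.
    """
    total = sum(counts.get(w, 0) for w in vocab)
    denom = total + alpha * len(vocab)
    return {w: (counts.get(w, 0) + alpha) / denom for w in vocab}


def jelinek_mercer(counts, background, lam=0.5):
    """Jelinek-Mercer interpolation with a background model.

    Hypothetical helper: `lam` is the weight on the background model here;
    some texts put the weight on the maximum-likelihood estimate instead.
    """
    total = sum(counts.values())
    smoothed = {}
    for w, p_bg in background.items():
        p_ml = counts.get(w, 0) / total if total else 0.0
        smoothed[w] = (1 - lam) * p_ml + lam * p_bg
    return smoothed


# Toy data (assumed): "prior" is in the vocabulary but absent from the document.
doc = Counter("topic model topic word".split())
vocab = ["topic", "model", "word", "prior"]
background = {w: 1 / len(vocab) for w in vocab}

p_laplace = laplace_smoothing(doc, vocab)
p_jm = jelinek_mercer(doc, background, lam=0.2)
```

In both cases the unseen word `prior` receives non-zero probability, which is the purpose of smoothing; the paper's contribution, according to the abstract, is to route such probability mass to latent variables through LDA inference rather than through a fixed Dirichlet prior.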

Files in This Item:
File Name/ File Size Content Type Version Access License
lwb-conf-01.pdf (389KB) -- -- Restricted access -- Contact to obtain full text

Recommended Citation:
Li Wenbo, Le Sun, Yuanyong Feng, et al. Smoothing LDA Model for Text Categorization[C]. In: To be determined. Harbin, China. 39766.

Items in IR are protected by copyright, with all rights reserved, unless otherwise indicated.

 

 
