Institutional Repository of the Institute of Software, Chinese Academy of Sciences
Title:
BibClus: A Clustering Algorithm of Bibliographic Networks by Message Passing on Center Linkage Structure
Author: Xu Xiaoran ; Deng Zhi-Hong
Source: Proceedings - IEEE International Conference on Data Mining, ICDM
Conference Name: 11th IEEE International Conference on Data Mining, ICDM 2011
Conference Date: December 11, 2011 - December 14, 2011
Issued Date: 2011
Conference Place: Vancouver, BC, Canada
Keyword: Clustering algorithms ; Data mining ; Digital libraries ; Graph theory ; Inference engines ; Message passing ; Stars
Indexed Type: EI
ISSN: 1550-4786
ISBN: 9780769544083
Department: (1) Key Laboratory of Machine Perception (Ministry of Education) School of Electronics Engineering and Computer Science Peking University Beijing 100871 China; (2) State Key Lab of Computer Science Institute of Software Chinese Academy of Sciences Beijing 100190 China
Sponsorship: National Science Foundation (NSF); University of Technology Sydney; Google; Alberta Ingenuity Centre for Machine Learning; IBM Research
Abstract: Multi-type objects with multi-type relations are ubiquitous in real-world networks such as bibliographic networks. Such networks are also called heterogeneous information networks. However, little research has addressed clustering for heterogeneous information networks. One algorithm, NetClus, has been proposed in the past two years. NetClus is applied to a heterogeneous information network with a star network schema and considers the relations between center objects and all attribute objects linked to them, but it ignores the relations among center objects, such as citation relations, which also contain rich information. Hence, we argue that the star network schema cannot characterize all possible relations without integrating the linkage structure among center objects, which we call the Center Linkage Structure, and no sufficiently practical method has yet addressed this problem. In this paper, we present a novel algorithm, BibClus, for clustering heterogeneous objects with center linkage structure, taking a bibliographic information network as an example. In BibClus, we build a probabilistic model, a pairwise hidden Markov random field (P-HMRF), to characterize the center linkage structure, and convert it to a factor graph. We further combine the EM algorithm with factor graph theory and design an efficient procedure, based on the message passing algorithm, to infer marginal probabilities and estimate parameters at each iteration of EM. We also study how factor functions with different function forms and constraints affect clustering performance. To evaluate the proposed method, we conducted thorough experiments on a real dataset crawled from the ACM Digital Library. The experimental results show that BibClus is effective and achieves much higher recall and precision than the recently proposed algorithm NetClus. © 2011 IEEE.
Language: English
Content Type: Conference paper
URI: http://ir.iscas.ac.cn/handle/311060/16283
Appears in Collections: Institute of Software Library_Conference Papers

Files in This Item:

There are no files associated with this item.


Recommended Citation:
Xu Xiaoran, Deng Zhi-Hong. BibClus: a clustering algorithm of bibliographic networks by message passing on center linkage structure[C]. In: 11th IEEE International Conference on Data Mining, ICDM 2011. Vancouver, BC, Canada. December 11, 2011 - December 14, 2011.
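
Illustrative note: the abstract describes inferring marginal probabilities by message passing over a pairwise model of the links among center objects (papers). The sketch below is not the authors' implementation. It is a generic sum-product (loopy belief propagation) routine on a pairwise Markov random field, shown only to make the idea concrete. The function name `loopy_bp`, the Potts-style pairwise factor (`same`/`diff`), and the toy unary potentials are all assumptions made for this example. The record does not specify the paper's factor functions or parameter-estimation step, so neither is reproduced here.

```python
from math import prod


def loopy_bp(unary, edges, same=2.0, diff=1.0, iters=20):
    """Approximate per-node cluster marginals via sum-product message passing.

    unary: dict node -> list of K nonnegative potentials (hypothetical
           stand-in for evidence from a paper's attribute objects)
    edges: list of undirected (u, v) links between center objects
    same/diff: assumed Potts-style pairwise factor values for linked nodes
               with equal / different cluster labels
    """
    k = len(next(iter(unary.values())))
    neighbors = {n: [] for n in unary}
    for u, v in edges:
        neighbors[u].append(v)
        neighbors[v].append(u)

    # msgs[(u, v)][x] is the message from u to v about v taking label x.
    msgs = {}
    for u, v in edges:
        msgs[(u, v)] = [1.0 / k] * k
        msgs[(v, u)] = [1.0 / k] * k

    for _ in range(iters):
        new = {}
        for (u, v) in msgs:
            # Combine u's own evidence with messages from all neighbors but v.
            partial = [
                unary[u][x]
                * prod(msgs[(w, u)][x] for w in neighbors[u] if w != v)
                for x in range(k)
            ]
            # Sum out u's label through the pairwise factor.
            out = [
                sum(partial[xu] * (same if xu == xv else diff) for xu in range(k))
                for xv in range(k)
            ]
            z = sum(out)
            new[(u, v)] = [o / z for o in out]
        msgs = new

    marginals = {}
    for n in unary:
        belief = [
            unary[n][x] * prod(msgs[(w, n)][x] for w in neighbors[n])
            for x in range(k)
        ]
        z = sum(belief)
        marginals[n] = [b / z for b in belief]
    return marginals


# Toy chain of three papers linked by citations: p1 - p2 - p3.
# Only p1 carries evidence; the links pull p2 and p3 toward p1's cluster.
toy_unary = {"p1": [0.9, 0.1], "p2": [0.5, 0.5], "p3": [0.5, 0.5]}
toy_edges = [("p1", "p2"), ("p2", "p3")]
for node, dist in loopy_bp(toy_unary, toy_edges).items():
    print(node, [round(p, 3) for p in dist])
```

On a tree-shaped graph such as this toy chain the computed marginals are exact; on graphs with cycles, as citation networks usually are, they are approximations. In an EM-style loop of the kind the abstract mentions, marginals like these would feed the parameter-update step, which is omitted here.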