Institutional Repository of the Institute of Software, Chinese Academy of Sciences
Title:
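Research on a Personalized Micro-blog Information Recommendation System
Author: Peng Zehuan (彭泽环)
Issued Date: 2013-05-28
Supervisor: Sun Le (孙乐)
Major: Computer Software and Theory
Degree Grantor: Graduate University of Chinese Academy of Sciences
Place of Degree Grantor: Beijing
Degree Level: Master's
Keyword: personalization; micro-blog; information recommendation; latent factor model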
Abstract:

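Micro-blogging is currently one of the most popular Internet applications and has attracted hundreds of millions of users. Through a micro-blog system, users can freely follow people they are interested in and post, share, and comment on information that interests them. Micro-blog users now generate more than 100 million statuses per day, which causes severe social information overload. Recommender systems have long been a focus of social network research, and many results have already been applied to micro-blog data, such as friend recommendation, micro-blog tag recommendation, and news topic recommendation, which address social information overload to some extent. This thesis attempts to recommend hot statuses to users from the perspective of their interest circles. The main work is as follows: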

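•   We propose WGN, an improved GN algorithm that incorporates edge weights, to detect the interest communities of micro-blog users. WGN progressively removes cut edges until the graph is split into isolated vertices, and then determines the final community partition according to the modularity measure Q. Experiments on computer-generated graphs, micro-blog user social graphs, and the WebKB co-citation graph show that (1) WGN can effectively detect users' interest communities, and (2) incorporating edge weights improves community detection.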

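•   We propose CWR, a community hot-status recommendation algorithm based on the latent factor model (LFM) that combines explicit and latent features. The algorithm first uses stochastic gradient descent to learn a model of users' ratings of statuses from the training set, then applies the model to compute each user's rating of each status in the test data, and finally selects the statuses with the highest average rating within a community as community hot statuses to recommend to users. Experimental results show that (1) combining the two kinds of features gives better recommendations than using either kind alone; (2) CWR outperforms WRR, the baseline based on repost counts; and (3) an analysis of the recommended content shows that CWR tends to recommend statuses related to the user's interest community, whereas WRR tends to recommend statuses that are hot among the general public.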

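•   Based on these two algorithms, we build SIRMIR, a real-time personalized micro-blog information recommendation system. After a user logs in, the system fetches the user's micro-blog social graph in real time, detects the user's interest communities, and then recommends the day's hot statuses based on those communities.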
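
The edge-removal and modularity steps described for WGN can be illustrated with a minimal sketch. The abstract does not say how edge weights enter the computation, so the rule used here (unweighted edge betweenness divided by edge weight, so that strong ties are removed later) is an assumption, as are the function names and the toy graph; this is not the thesis implementation.

```python
from collections import defaultdict, deque

def build_graph(edges):
    """Undirected weighted graph stored as {node: {neighbor: weight}}."""
    adj = defaultdict(dict)
    for u, v, w in edges:
        adj[u][v] = w
        adj[v][u] = w
    return dict(adj)

def edge_betweenness(adj):
    """Unweighted edge betweenness via BFS (Brandes-style accumulation)."""
    bet = defaultdict(float)
    for s in adj:
        dist, order, queue = {s: 0}, [], deque([s])
        sigma, preds = defaultdict(float), defaultdict(list)
        sigma[s] = 1.0
        while queue:
            v = queue.popleft()
            order.append(v)
            for w in adj[v]:
                if w not in dist:
                    dist[w] = dist[v] + 1
                    queue.append(w)
                if dist[w] == dist[v] + 1:  # v lies on a shortest path to w
                    sigma[w] += sigma[v]
                    preds[w].append(v)
        delta = defaultdict(float)
        for w in reversed(order):
            for v in preds[w]:
                c = sigma[v] / sigma[w] * (1.0 + delta[w])
                bet[(v, w) if v < w else (w, v)] += c
                delta[v] += c
    return bet  # every pair is counted twice, which does not affect ranking

def components(adj):
    """Connected components of the current graph."""
    seen, comps = set(), []
    for s in adj:
        if s in seen:
            continue
        comp, stack = set(), [s]
        seen.add(s)
        while stack:
            v = stack.pop()
            comp.add(v)
            for w in adj[v]:
                if w not in seen:
                    seen.add(w)
                    stack.append(w)
        comps.append(comp)
    return comps

def modularity(graph, comps):
    """Weighted modularity Q of a partition, measured on the original graph."""
    total = sum(w for v in graph for w in graph[v].values()) / 2.0
    if total == 0:
        return 0.0
    q = 0.0
    for c in comps:
        inside = sum(w for v in c for u, w in graph[v].items() if u in c) / 2.0
        strength = sum(w for v in c for w in graph[v].values())
        q += inside / total - (strength / (2.0 * total)) ** 2
    return q

def wgn(graph):
    """Remove edges one by one; return the partition with the highest Q."""
    adj = {v: dict(nb) for v, nb in graph.items()}
    best = components(adj)
    best_q = modularity(graph, best)
    while any(adj.values()):
        bet = edge_betweenness(adj)
        # Assumption: score = betweenness / weight, so weak ties go first.
        u, v = max(bet, key=lambda e: bet[e] / adj[e[0]][e[1]])
        del adj[u][v]
        del adj[v][u]
        parts = components(adj)
        q = modularity(graph, parts)
        if q > best_q:
            best, best_q = parts, q
    return best, best_q

# Toy graph: two strongly tied triangles joined by one weak edge.
edges = [(0, 1, 3.0), (1, 2, 3.0), (0, 2, 3.0),
         (3, 4, 3.0), (4, 5, 3.0), (3, 5, 3.0), (2, 3, 1.0)]
communities, q = wgn(build_graph(edges))
print(communities, round(q, 3))
```

On this toy graph the weak bridge between the triangles is removed first, and the partition with the highest Q is the two triangles. Recomputing betweenness after every removal is expensive, so a sketch like this only suits small graphs.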

English Abstract:

As one of the most popular web applications, micro-blogging has attracted hundreds of millions of users. Micro-blog users can follow people they are interested in, post and repost statuses, share useful information with others, and so on. Every day, micro-blog users generate over 100 million statuses, which leads to a serious social information overload problem. Recommendation in micro-blogs is an active research topic in social networks, and work on user recommendation, tag recommendation, and news recommendation has resolved the information overload problem to some extent. In this thesis, we try to recommend useful statuses to users based on their interest communities (circles). The main contributions are as follows:

•  Building on GN, we propose WGN, a revised community detection algorithm for social graphs that takes edge weights into consideration. WGN is an iterative graph partitioning algorithm that deletes one cut edge at a time until each vertex forms its own community, and it uses the modularity measure Q to evaluate community detection quality. Experiments on computer-generated graphs, user social graphs, and the WebKB co-citation graph show that (1) WGN can accurately discover users' interest communities, and (2) taking edge weights into consideration improves community detection performance.

•  We propose CWR, a community hot-status detection algorithm that integrates explicit features and latent features. Concretely, the algorithm first trains latent factor vectors for users and terms, then combines the latent factor score (the inner product of the vectors) with the explicit feature score to obtain the final interest score, and finally averages each status's score over the community to discover hot statuses with high scores. Experimental results show that (1) a recommender using both kinds of features performs better than one using only a single kind; (2) compared with the WRR system, which recommends based only on repost counts, CWR recommends more useful statuses to users in the community; and (3) WRR tends to recommend statuses that are hot across the whole micro-blog system, which are usually not community-related, whereas CWR recommends community-related statuses.

•  Based on the above two algorithms, we build SIRMIR, a personalized micro-blog information recommendation system. The system recommends hot statuses to users based on their interest communities.
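
The CWR scoring pipeline described above (latent vectors for users and terms, an inner-product score combined with an explicit feature score, then a community average) can be sketched roughly as follows. The abstract does not give the model form, so several choices here are assumptions: a status is represented by the mean of its term vectors, the explicit features are a bias term plus a normalized repost count, the loss is squared error with L2 regularization trained by stochastic gradient descent, and all names, hyperparameters, and toy data are hypothetical.

```python
import random

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def status_vector(terms, Q):
    """Assumed status representation: mean of its term latent vectors."""
    vecs = [Q[t] for t in terms]
    return [sum(col) / len(vecs) for col in zip(*vecs)]

def predict(u, terms, x, P, Q, w):
    """Interest score = explicit feature score + latent inner product."""
    return dot(w, x) + dot(P[u], status_vector(terms, Q))

def train(samples, k=4, lr=0.05, reg=0.01, epochs=200, seed=0):
    """SGD on squared error; samples are (user, terms, features, rating)."""
    rng = random.Random(seed)
    users = sorted({u for u, _, _, _ in samples})
    vocab = sorted({t for _, terms, _, _ in samples for t in terms})
    P = {u: [rng.gauss(0, 0.1) for _ in range(k)] for u in users}
    Q = {t: [rng.gauss(0, 0.1) for _ in range(k)] for t in vocab}
    w = [0.0] * len(samples[0][2])
    data = list(samples)  # copy so the caller's list is not shuffled
    for _ in range(epochs):
        rng.shuffle(data)
        for u, terms, x, r in data:
            s = status_vector(terms, Q)
            err = r - (dot(w, x) + dot(P[u], s))
            pu = P[u][:]  # keep old user vector for the term updates
            for i in range(k):
                P[u][i] += lr * (err * s[i] - reg * pu[i])
                for t in terms:
                    Q[t][i] += lr * (err * pu[i] / len(terms) - reg * Q[t][i])
            for j in range(len(w)):
                w[j] += lr * (err * x[j] - reg * w[j])
    return P, Q, w

def community_hot(statuses, community, P, Q, w, top_n=1):
    """Rank statuses by their average predicted score over the community."""
    scored = []
    for sid, (terms, x) in statuses.items():
        avg = sum(predict(u, terms, x, P, Q, w) for u in community) / len(community)
        scored.append((avg, sid))
    return [sid for _, sid in sorted(scored, reverse=True)[:top_n]]

# Hypothetical toy data: features are [bias, normalized repost count].
samples = [
    ("u1", ["football", "match"], [1.0, 0.9], 1.0),
    ("u1", ["concert", "guitar"], [1.0, 0.1], 0.0),
    ("u2", ["football", "goal"], [1.0, 0.9], 1.0),
    ("u2", ["concert", "piano"], [1.0, 0.1], 0.0),
    ("u3", ["match", "goal"], [1.0, 0.9], 1.0),
    ("u3", ["guitar", "piano"], [1.0, 0.1], 0.0),
]
P, Q, w = train(samples)
statuses = {
    "s1": (["football", "goal"], [1.0, 0.9]),
    "s2": (["concert", "guitar"], [1.0, 0.1]),
}
top = community_hot(statuses, ["u1", "u2"], P, Q, w, top_n=1)
print(top)
```

Averaging over community members, rather than ranking by raw repost counts as in the WRR baseline, is what ties the recommendation to the community's interests. The toy data is too small to say anything about recommendation quality.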

Content Type: Thesis
URI: http://ir.iscas.ac.cn/handle/311060/14812
Appears in Collections: National Engineering Research Center of Fundamental Software_Theses

Files in This Item:
File Name / File Size / Access
硕士学位论文-2.4.pdf (2460KB), restricted access; contact the repository to obtain the full text

Recommended Citation:
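Peng Zehuan. Research on a Personalized Micro-blog Information Recommendation System [D]. Beijing. Graduate University of Chinese Academy of Sciences. 2013-05-28.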
Items in IR are protected by copyright, with all rights reserved, unless otherwise indicated.

 

 
