Institutional Repository of the Institute of Software, Chinese Academy of Sciences
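Title:
基于CoP建模的信息过滤技术研究 (Research on Technique of Information Filtering Based on CoP Modeling)
Author: 陈晋川 (CHEN Jinchuan)
Defense date: 2004
Major: Computer Software and Theory
Degree-granting institution: Institute of Software, Chinese Academy of Sciences
Place of conferral: Institute of Software, Chinese Academy of Sciences
Degree: Doctoral
Keywords: information filtering; vector space model; star clustering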
Alternative title: Research on Technique of Information Filtering Based on CoP Modeling. CHEN Jinchuan (Computer Software and Theory)
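Abstract: A growing number of enterprises adopt information management or knowledge management systems to improve efficiency. In such systems, employees effectively work in a virtual collaborative environment and need timely, effective information support related to their tasks. Traditional information filtering techniques filter and recommend information based only on user interests, and so struggle to meet this need. This thesis therefore proposes an information filtering method based on CoP (Communities of Practice) modeling and studies its key techniques. The specific work is as follows. To address users' information needs in an enterprise collaborative environment, an information filtering method based on CoP modeling is proposed. In such an environment, employees often face entirely new jobs and tasks; because they do not yet understand the work, they cannot formulate relevant information needs, so traditional information filtering methods cannot give them timely and effective recommendations. A CoP is a group established so that its members can share knowledge and learn from one another at work, and its interests reflect its members' tasks. Building on existing information filtering research and evidence theory, this thesis models the interests of a CoP to obtain its interest profile, and on that basis studies and implements a CoP-oriented information filtering technique. A domain-based vector space model is proposed. Within a single information filtering system, information, user interests, and CoP interests should share a consistent representation model. The widely used vector space model is intuitive, concise, and easy to implement, but it can only express the keywords a user is interested in and cannot distinguish well between different user interests; moreover, an excessive number of keywords reduces algorithm efficiency. To address this problem, the thesis proposes a domain-based vector space model, builds a domain classification model, and gives a method for computing the weight of a piece of information in each domain. The model greatly reduces dimensionality, reflects the diversity of user interests well, and can still use mature vector space model techniques such as similarity measure formulas. Existing algorithms are improved, and a weighted star clustering algorithm is proposed for learning user interest profiles. A CoP's interest profile is fused from the interest profiles of its member users, so obtaining user interest profiles is the foundation of CoP modeling. Algorithms for learning user interest profiles are a current focus of information filtering research. Clustering algorithms combine the advantages of the two currently popular classes of algorithms, Rocchio and kNN, but traditional clustering algorithms cannot reflect a user's differing degrees of interest in documents. The weighted star clustering algorithm proposed here tends to build document clusters around the documents in which the user's interest is high, and so better reflects the user's information needs. The results of this thesis can serve as a theoretical and practical reference for information filtering in enterprise collaboration scenarios.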
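The abstract does not give the thesis's domain classification model or its weighting formula, so the following is only a minimal sketch of the general idea of a domain-based vector space model. The keyword-to-domain lexicon, the normalized-count weighting, and the names `DOMAIN_LEXICON`, `domain_vector`, and `cosine` are all assumptions made for illustration. It shows how texts with many keywords can be projected onto a small number of domain dimensions while still being compared with an ordinary cosine similarity.

```python
import math
from collections import Counter

# Hypothetical keyword-to-domain lexicon (an assumption for illustration;
# the thesis's actual domain classification model is not described here).
DOMAIN_LEXICON = {
    "database": "data_management",
    "index": "data_management",
    "query": "data_management",
    "router": "networking",
    "protocol": "networking",
    "packet": "networking",
}
DOMAINS = sorted(set(DOMAIN_LEXICON.values()))


def domain_vector(text):
    """Map a text to one weight per domain (assumed: normalized keyword counts)."""
    counts = Counter()
    for token in text.lower().split():
        domain = DOMAIN_LEXICON.get(token)
        if domain is not None:
            counts[domain] += 1
    total = sum(counts.values())
    if total == 0:
        return [0.0] * len(DOMAINS)
    return [counts[d] / total for d in DOMAINS]


def cosine(u, v):
    """Standard cosine similarity; returns 0.0 if either vector is all zeros."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


# A document and a user profile, both expressed over the same two domains.
doc = domain_vector("query the database index")
profile = domain_vector("database query protocol")
score = cosine(doc, profile)
```

In this sketch the vector length equals the number of domains, not the number of keywords, which is the dimensionality reduction the abstract refers to. A real system would presumably replace the hand-written lexicon with a trained classifier.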
English abstract: Today, more and more enterprises adopt information management or knowledge management systems to enhance efficiency. Employees using such systems are in effect working in a virtual collaborative environment, and they need timely and valid information support related to their tasks. Traditional information filtering techniques, which take only the interests of users into account, fail to satisfy this requirement. In this thesis, we introduce an information filtering approach based on CoP (Communities of Practice) modeling and investigate its key techniques. The research comprises the following work. Aiming at users' information requirements in an enterprise collaborative environment, this research offers an information filtering approach based on CoP modeling. Enterprise employees in a collaborative environment often face new jobs or tasks. Because they are unfamiliar with the new tasks, they cannot formulate the related information requirements. Thus, traditional information filtering techniques are unable to provide timely and valid information support to these employees. A CoP is a team set up to share knowledge and to help its members learn from each other during work. Its profile reflects the tasks of its members. Based on earlier information filtering approaches and D-S (Dempster-Shafer) theory, this thesis obtains the CoP's profile by modeling the interests of the CoP, and studies and implements a CoP-oriented information filtering technique. A domain-based vector space model is proposed. The representation models of information, user interests, and CoP interests in an information filtering system should be consistent. The most popular vector space model is direct, concise, and easy to implement. However, it can only express the keywords a user is interested in and cannot distinguish well between the different interests of users. Furthermore, the large number of keywords reduces the algorithm's efficiency. To solve this problem, this thesis provides a domain-based vector space model by building a domain classification model and offering a method to calculate a document's weight in each domain. This model greatly reduces the number of dimensions, reflects the variety of users' interests well, and can still use many mature vector space model techniques, such as similarity measure formulas. Based on existing algorithms, this thesis provides a weighted star clustering algorithm to learn user profiles. The profile of a CoP is fused from the profiles of its members, so obtaining the user profiles is the basis of CoP modeling. Learning algorithms for building user profiles are a focus of current research on information filtering. Clustering algorithms integrate the advantages of Rocchio and kNN, but conventional clustering algorithms cannot capture a user's differing degrees of interest in documents. This thesis puts forward a weighted star clustering algorithm that builds clusters around the documents in which users are most interested, and thereby reflects the users' information requirements better. Our investigation provides a valuable reference for future work on information filtering in collaborative environments.
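The abstract states only that the weighted star clustering algorithm tends to build clusters around documents the user rates highly; it does not give the selection rule. The sketch below is therefore an assumption-laden illustration, not the thesis's algorithm: it follows the usual greedy star-cover scheme over a thresholded similarity graph, but picks star centers by a per-document interest score first and by graph degree second. The function names, the `interest` scores, and the `threshold` parameter are hypothetical.

```python
import math


def cosine(u, v):
    """Standard cosine similarity; returns 0.0 if either vector is all zeros."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


def weighted_star_clusters(vectors, interest, threshold):
    """Greedy star cover whose centers are chosen by interest, then degree.

    vectors:   list of document vectors
    interest:  per-document user interest scores (hypothetical input)
    threshold: minimum cosine similarity for two documents to be linked
    Returns a list of (center_index, sorted_member_indices) pairs.
    Ranking by interest first is an assumption, not the thesis's stated rule.
    """
    n = len(vectors)
    # Thresholded similarity graph: neighbors[i] holds documents linked to i.
    neighbors = [
        {j for j in range(n)
         if j != i and cosine(vectors[i], vectors[j]) >= threshold}
        for i in range(n)
    ]
    # Visit documents from highest interest to lowest; break ties by degree.
    order = sorted(range(n), key=lambda i: (-interest[i], -len(neighbors[i]), i))
    covered = set()
    clusters = []
    for i in order:
        if i in covered:
            continue  # already a center or satellite of an earlier star
        members = {i} | neighbors[i]  # stars may overlap on satellites
        clusters.append((i, sorted(members)))
        covered |= members
    return clusters


vectors = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.1, 0.9]]
interest = [1, 5, 3, 2]
clusters = weighted_star_clusters(vectors, interest, threshold=0.8)
```

With these inputs, documents 1 and 2 carry the highest interest within their respective groups, so they become the star centers even though every document has the same degree in the graph.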
Language: Chinese
Content type: Dissertation
URI: http://ir.iscas.ac.cn/handle/311060/6330
Appears in Collections: Institute of Software, Chinese Academy of Sciences

Files in This Item:
File Name / File Size | Content Type | Version | Access | License
LW014092.pdf (2694KB) | -- | -- | Restricted access | --

Recommended Citation:
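CHEN Jinchuan (陈晋川). Research on Technique of Information Filtering Based on CoP Modeling (基于CoP建模的信息过滤技术研究) [D]. Institute of Software, Chinese Academy of Sciences. 2004-01-01.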

Items in IR are protected by copyright, with all rights reserved, unless otherwise indicated.

 

 

Copyright © 2007-2017 Institute of Software, Chinese Academy of Sciences