Institutional Repository of the Institute of Software, Chinese Academy of Sciences (ISCAS OpenIR)
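Title:
A Study of Personalized Information Retrieval Based on Implicit Feedback (基于隐式反馈的个性化信息检索技术研究)
Author: 吕元华
Defense date: 2007-06-07
Degree-granting institution: Institute of Software, Chinese Academy of Sciences
Place of conferral: Institute of Software
Degree: Doctorate
Keywords: personalized information retrieval; implicit feedback; HITS algorithm; query expansion; re-ranking
Alternative title: A Study of Personalized Information Retrieval based on Implicit Feedback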
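Abstract: Search engines are the information-seeking tool that Internet users rely on most. Current mainstream search engines do not clearly distinguish between the query intents of different users, yet users who enter the same query terms may still have different information needs. Personalized information retrieval is one of the key technologies for addressing this problem. Building on an analysis and survey of existing research on personalized information retrieval in China and abroad, this thesis proposes an iterative personalized retrieval algorithm based on implicit feedback and implements a client-side personalized retrieval tool. The main contributions are as follows.
First, we present a fairly comprehensive and in-depth survey of current personalized information retrieval techniques. We classify and discuss prior work according to the personalization information it uses and the way personalized retrieval is realized, and we introduce and analyze some representative studies.
Second, we propose an iterative personalized retrieval algorithm based on implicit feedback. Terms and documents reinforce each other: relevant documents contain many query-relevant terms, and relevant terms occur in many relevant documents. Based on this relationship, we propose a HITS-like iterative algorithm that computes weights for terms and documents, performing query expansion according to the term weights and re-ranking according to the document weights. On top of the iterative algorithm, query expansion is used to enrich the set of result documents, and re-ranking is then used to recommend documents to the user. Experimental results show that the proposed algorithm effectively improves retrieval precision.
Third, based on this algorithm, we designed and implemented PAIR, a client-side personalized retrieval tool. PAIR automatically records users' implicit feedback, analyzes and infers their interests and needs, and, building on query results from Google and Baidu, provides personalized retrieval services in Chinese and English in the form of an Internet Explorer toolbar.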
English abstract: Analysis suggests that, while search engines do a good job of retrieving results that satisfy the range of intentions people may associate with a query, they do not do a very good job of discerning an individual's unique search goal. To overcome this problem, there have been many attempts to improve retrieval accuracy with personalized information retrieval technology. We begin this thesis with an in-depth survey of existing studies on personalized information retrieval. We then propose an iterative implicit feedback approach to personalized search and implement a client-side personalized search agent. The main work of the thesis is as follows.
First, we present a comprehensive survey of personalized information retrieval technology. The related approaches are classified according to several features, such as the kind of user information that is used and the algorithm used to incorporate that information into information retrieval techniques. Some representative studies are also introduced and discussed.
Second, we propose an iterative implicit feedback approach to personalized search. There is a mutual reinforcement principle between documents and terms, which states that relevant documents contain many relevant terms and that relevant terms occur in many relevant documents. Based on this principle, we design a HITS-like iterative algorithm that computes a weight for each search result and for each term extracted from the implicit feedback information. As a result, query expansion and result re-ranking can be conducted simultaneously according to these weights. In addition, we change the usual role of query expansion: here it only supplies more diverse search results, which must then rely on re-ranking to be moved forward and recommended to the user. Experiments on web search show that the proposed approach improves search accuracy effectively and efficiently.
Third, we implement a client-side personalized search agent, PAIR, to carry out and evaluate the proposed approach. PAIR automatically captures users' implicit feedback information, uses it to infer their search goals, and provides personalized search services in both Chinese and English as an agent for the popular search engines Google and Baidu.
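The mutual reinforcement principle described in the abstract can be illustrated with a minimal Python sketch. This is not the thesis's algorithm: the abstract gives no formulas, so the binary term-occurrence model, the uniform initial weights, the Euclidean normalization, the fixed iteration count, and the function names `mutual_reinforcement` and `expand_and_rerank` are all assumptions made for illustration. In particular, the sketch does not model how implicit feedback (for example, clicked results) would seed or bias the weights.

```python
from math import sqrt


def normalize(weights):
    """Scale a weight vector to unit Euclidean length (all zeros stay as-is)."""
    norm = sqrt(sum(w * w for w in weights))
    return [w / norm for w in weights] if norm > 0 else list(weights)


def mutual_reinforcement(docs, iterations=50):
    """HITS-like iteration on a bipartite document-term graph.

    docs: list of token lists, a hypothetical stand-in for search results
    and text gathered from implicit feedback.
    Returns (doc_weights, term_weights); term_weights is a dict.
    """
    vocab = sorted({t for d in docs for t in d})
    doc_terms = [set(d) for d in docs]  # assumption: binary occurrence
    term_w = {t: 1.0 for t in vocab}    # assumption: uniform start
    doc_w = [1.0] * len(docs)
    for _ in range(iterations):
        # A document scores highly if it contains highly weighted terms.
        doc_w = normalize([sum(term_w[t] for t in ts) for ts in doc_terms])
        # A term scores highly if it occurs in highly weighted documents.
        raw = [sum(w for w, ts in zip(doc_w, doc_terms) if t in ts)
               for t in vocab]
        term_w = dict(zip(vocab, normalize(raw)))
    return doc_w, term_w


def expand_and_rerank(query, docs, k=2):
    """Use term weights for query expansion and document weights for re-ranking."""
    doc_w, term_w = mutual_reinforcement(docs)
    by_weight = sorted(term_w, key=term_w.get, reverse=True)
    new_terms = [t for t in by_weight if t not in query][:k]
    ranking = sorted(range(len(docs)), key=lambda i: doc_w[i], reverse=True)
    return query + new_terms, ranking
```

With three toy documents, `["jaguar", "car", "engine"]`, `["jaguar", "car", "speed"]`, and `["jaguar", "cat", "zoo"]`, and the query `["jaguar"]`, the first expansion term chosen is `car`, and the third document is ranked last, because the two car-related documents reinforce each other through their shared terms.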
Language: Chinese
Content type: Dissertation
URI: http://ir.iscas.ac.cn/handle/311060/7634
Appears in Collections: Institute of Software, Chinese Academy of Sciences

Files in This Item:
File Name/ File Size Content Type Version Access License
10001_200428015029019吕元华_paper.pdf (1133KB) -- -- Restricted access -- Contact to request full text

Recommended Citation:
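吕元华. A Study of Personalized Information Retrieval Based on Implicit Feedback (基于隐式反馈的个性化信息检索技术研究) [D]. Institute of Software. Institute of Software, Chinese Academy of Sciences. 2007-06-07.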

Items in IR are protected by copyright, with all rights reserved, unless otherwise indicated.

 

 
