A Study of Personalized Information Retrieval Based on Implicit Feedback
Original title: 基于隐式反馈的个性化信息检索技术研究
吕元华 (Lü Yuanhua)
2007-06-07
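Degree-granting institution: Institute of Software, Chinese Academy of Sciences
Degree: Doctoral
Place of conferral: Institute of Software
Keywords: personalized information retrieval; implicit feedback; HITS algorithm; query expansion; re-ranking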
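Abstract: Search engines are the information-seeking tools that Internet users rely on most. Current mainstream search engines do not explicitly distinguish the query intents of different users, yet different users have different information needs even when they enter the same query terms. Personalized information retrieval is one of the key technologies for addressing this problem. Building on an analysis and survey of the state of research on personalized information retrieval in China and abroad, this thesis proposes an iterative personalized retrieval algorithm based on implicit feedback information and implements a client-side personalized retrieval tool. The main contributions are as follows.
First, it gives a fairly comprehensive and in-depth survey of current personalized information retrieval technology. Research on personalized information retrieval is classified and discussed according to the personalization information used and the way personalized retrieval is implemented, and several representative studies are introduced and analyzed.
Second, it proposes an iterative personalized retrieval algorithm based on implicit feedback information. Exploiting the mutual reinforcement relationship between terms and documents (relevant documents always contain many query-relevant terms, and relevant terms always occur in many relevant documents), the thesis proposes a HITS-like iterative algorithm that computes weights for terms and documents, performs query expansion according to the term weights, and re-ranks according to the document weights. On top of the iterative algorithm, query expansion is used to enrich the result documents, and re-ranking is then used to recommend documents to the user. Experimental results show that the proposed personalized retrieval algorithm effectively improves retrieval precision.
Third, based on this algorithm, we designed and implemented PAIR, a client-side personalized retrieval tool. PAIR automatically records users' implicit feedback, analyzes and infers their interests and needs, and, building on query results from Google and Baidu, provides personalized retrieval services in Chinese and English in the form of an Internet Explorer toolbar.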
Alternative abstract: Analysis suggests that, while search engines do a good job of retrieving results that satisfy the range of intentions people may associate with a query, they do not do a very good job of discerning an individual's unique search goal. To overcome this problem, there have been many attempts to improve retrieval accuracy through personalized information retrieval technology. We begin this thesis with an in-depth survey of existing studies on personalized information retrieval. We then propose an iterative implicit feedback approach to personalized search and implement a client-side personalized search agent. The main work of the thesis is as follows.
First, we make a comprehensive survey of personalized information retrieval technology. The related approaches are classified according to several features, such as the kind of user information that is used and the algorithm used to incorporate that information into information retrieval techniques. Moreover, some representative studies are introduced and discussed.
Second, we propose an iterative implicit feedback approach to personalized search. There is a mutual reinforcement principle between documents and terms, which states that relevant documents contain many relevant terms and that relevant terms occur in many relevant documents. Based on this principle, we design a HITS-like iterative algorithm to compute a weight for each search result and for each term extracted from the implicit feedback information. As a result, query expansion and result re-ranking can be conducted simultaneously according to these weights. Furthermore, we change the role of regular query expansion: it only supplies diversified search results, which must then rely on re-ranking to be moved forward and recommended to the user. Experiments on web search show that the proposed approach improves search accuracy effectively and efficiently.
Third, we implement a client-side personalized search agent, PAIR, to carry out and evaluate the proposed approach. PAIR automatically captures users' implicit feedback information, from which it infers users' search goals and provides personalized search services in both Chinese and English, acting as an agent of the popular search engines Google and Baidu.
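The mutual reinforcement principle described in the abstract can be illustrated with a minimal Python sketch. It is not the thesis's algorithm: the abstract does not give the update formulas, the way implicit feedback is weighted, or the stopping rule. The raw term counts, the L2 normalization, the fixed iteration count, and the names `mutual_reinforcement`, `rerank`, and `expansion_terms` are all assumptions made for illustration, and the toy documents are invented.

```python
from collections import Counter
import math


def _l2_normalize(values):
    """Scale a list of numbers to unit Euclidean length (no-op for all zeros)."""
    norm = math.sqrt(sum(v * v for v in values)) or 1.0
    return [v / norm for v in values]


def mutual_reinforcement(docs, iterations=20):
    """HITS-like iteration over a bipartite document/term graph (assumed form).

    docs: list of token lists, e.g. result snippets or clicked pages.
    Returns (doc_weights, term_weights).
    """
    counts = [Counter(tokens) for tokens in docs]
    vocab = sorted({t for c in counts for t in c})
    doc_w = [1.0] * len(docs)
    term_w = {t: 1.0 for t in vocab}
    for _ in range(iterations):
        # A term is weighted by the documents that contain it.
        raw = {t: 0.0 for t in vocab}
        for w, c in zip(doc_w, counts):
            for t, n in c.items():
                raw[t] += n * w
        term_w = dict(zip(vocab, _l2_normalize([raw[t] for t in vocab])))
        # A document is weighted by the terms it contains.
        doc_w = _l2_normalize(
            [sum(n * term_w[t] for t, n in c.items()) for c in counts]
        )
    return doc_w, term_w


def rerank(doc_w):
    """Document indices ordered by descending weight."""
    return sorted(range(len(doc_w)), key=lambda i: -doc_w[i])


def expansion_terms(term_w, query, k=2):
    """Top-k weighted terms not already in the query (ties broken by name)."""
    seen = set(query)
    candidates = [t for t in term_w if t not in seen]
    return sorted(candidates, key=lambda t: (-term_w[t], t))[:k]


if __name__ == "__main__":
    # Invented toy data standing in for search results.
    results = [
        ["jaguar", "car", "price"],
        ["jaguar", "cat", "habitat"],
        ["car", "price", "dealer"],
    ]
    doc_w, term_w = mutual_reinforcement(results)
    print(rerank(doc_w))
    print(expansion_terms(term_w, ["jaguar"]))
```

In this sketch, documents that share many highly weighted terms rise in the ranking, and the terms concentrated in those documents become expansion candidates. A plain iteration of this kind converges to the same weights regardless of the starting values, so a personalized variant would need to inject feedback some other way, for example through a prior on clicked documents; that choice is left open here because the abstract does not specify it.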
Pages: 89
Language: Chinese
Content type: Thesis
URI: http://ir.iscas.ac.cn/handle/311060/7634
Collection: Institute of Software, Chinese Academy of Sciences
Recommended citation
GB/T 7714
吕元华. 基于隐式反馈的个性化信息检索技术研究[D]. 软件研究所. 中国科学院软件研究所,2007.
Files in this item
File name/size  Document type  Version type  Access type  License
10001_20042801502901 (1133KB)  Restricted access