Title: | An On-line Learning Algorithm in Information Discovery and Retrieval Technology (信息发现与检索技术中的一种在线学习算法) |
Author: | 曾伟民
|
Issued Date: | 2000
|
Major: | Computer Application Technology
|
Degree Grantor: | Institute of Software, Chinese Academy of Sciences
|
Place of Degree Grantor: | Institute of Software, Chinese Academy of Sciences
|
Degree Level: | Doctoral
|
Keyword: | search engine
; document indexing
; retrieval model
; on-line learning
|
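Abstract: | This thesis describes how to apply information discovery and retrieval techniques to the WWW environment so that Web users can find the information they need quickly and accurately. A search engine consists of two main parts: document discovery and document retrieval. Document discovery comprises document gathering and document indexing, while the key to document retrieval is building a document retrieval model. We survey several document retrieval models, including the Boolean logic model, the vector space model, and the probabilistic model, and propose a new one: the document overlap model (文档覆盖模型). Based on these techniques, we designed and implemented a search engine prototype. Because the retrieval precision of existing search engine systems is not high, we designed and implemented an on-line learning algorithm that continuously learns from user feedback, progressively refines and narrows the result set, and thereby improves retrieval precision. We also apply this on-line learning algorithm to a personal browsing agent that automatically searches for the information a user likes most. |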
English Abstract: | This thesis introduces how to apply information retrieval technology to the World Wide Web (WWW) environment so that Web users can find the information they need quickly and accurately. A search engine has two main parts: document discovery and document search. The document discovery procedure covers document gathering and document indexing. The key to document search is the construction of a document retrieval model. We introduce several document retrieval models, such as the Boolean retrieval model, the vector space model, and the probabilistic model, and we present a new retrieval model, the document overlap model. Based on these technologies, we designed and implemented a search engine prototype. Because the precision of present search engines is not high, we designed and implemented an on-line learning algorithm. This algorithm continuously learns from users' relevance feedback, refines the results step by step, narrows the result set, and increases search precision. In addition, we apply this on-line learning algorithm to a personal browsing agent that automatically searches for a user's favorite information. |
Language: | Chinese
|
Content Type: | Dissertation
|
URI: | http://ir.iscas.ac.cn/handle/311060/6604
|
Appears in Collections: | Institute of Software, Chinese Academy of Sciences
|
File Name/ File Size | Content Type | Version | Access | License |
LW002129.pdf (1391KB) | -- | -- | Restricted access | -- |
|
Recommended Citation: |
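曾伟民. An On-line Learning Algorithm in Information Discovery and Retrieval Technology (信息发现与检索技术中的一种在线学习算法) [D]. Institute of Software, Chinese Academy of Sciences. 2000-01-01.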
|