Institutional Repository of the Institute of Software, Chinese Academy of Sciences
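Title: Research on Softbot-Based Web Information Retrieval Methods (基于Softbot的Web信息获取方法研究)
Author: Liu Ruihong (刘瑞虹)
Defense date: 1998
Major: Computer Software
Degree-granting institution: Institute of Software, Chinese Academy of Sciences
Place of conferral: Institute of Software, Chinese Academy of Sciences
Degree: Doctorate
Keywords: Web information retrieval; multi-Softbot architecture (MSA); Web information retrieval mode; communication primitives; Web information domain analysis; Web information subject; enterprise Intranet; group users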
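Abstract: Since the world's first Web software was born on Steve Jobs's NeXT computer system in December 1990, Web technology and its applications have spread worldwide at an astonishing pace and now reach into every area of work and daily life. With so many Web servers and such rich Web information resources on them, retrieving Web information effectively and quickly is becoming increasingly important. In particular, as Intranets are ever more widely adopted, user groups made up of enterprise employees have emerged, and on the Internet they exhibit collective behavior organized around the enterprise. Existing Web information retrieval tools cannot yet fully meet the needs of such enterprise user groups. To solve this problem at its root, a Web information retrieval mechanism aimed at enterprise Intranets must be developed. Effective Web information retrieval in the Internet/Intranet environment is an important research direction in current software technology. This thesis draws on two tasks undertaken by the author: the topic "Pre- and Post-Service Processing Technology for Network Information Retrieval" (96-743-01-01-05) under the national Ninth Five-Year Plan key science and technology program, and the development of the "Web Information Retrieval System" for the "Golden Bridge" project. Using Softbot (software robot) technology, it studies Web information retrieval methods based on enterprise Intranets. The main work of the thesis covers the following aspects:
1. It analyzes the technologies and service targets of existing Web information retrieval systems. Technically, Web information retrieval has evolved through directory-based methods, search engine methods, and meta-search engine methods; in terms of service targets, it has developed from individual users toward regional users and group users. From these two perspectives, technical method and service target, the thesis then discusses the new directions of Web information retrieval, namely intelligent Web information retrieval and group users within an Intranet.
2. It surveys the characteristics of Web information and of Web information retrieval, and on that basis gives a classification of the existing conceptual modes of Web information retrieval and a classification of users. The main existing Web information retrieval systems are oriented to individual users, and their content tends to be general-purpose. Because Web information grows exponentially, building a search engine that covers all Web content is nearly impossible. To overcome these difficulties, specialization by subject area is a trend in Web information retrieval, and the rapid growth of enterprise Intranets makes Intranet-oriented Web information retrieval increasingly important.
3. It gives a precise definition of Softbot. The composition, functions, and basic structure of a Softbot are studied; the Softbot structure is defined in detail using BNF; and the internal operating mechanism of a Softbot is analyzed and described in detail. On this basis, the characteristics, classification, and interaction of the multi-Softbot architecture (MSA) are studied, and concrete ways of exchanging information among Softbots using communication primitives (CP) and a blackboard structure are given.
4. It proposes an Intranet-based Web information retrieval mode and the Web information retrieval model IBMWIR. The core of the mode is to dynamically collect the Web information that the enterprise's business activities require onto servers within the Intranet. The IBMWIR model is designed and constructed using the multi-Softbot architecture (MSA). Within IBMWIR, the thesis examines an operating mechanism that updates the content of the Web information database in real time, based on the Web information subject word set, by means such as dispatching and stationing journalist-type Softbots.
5. It proposes the concept of Web information domain analysis, which is the process of obtaining an enterprise's Web information subjects (WIS) and plays a very important role in enterprise-level Web information retrieval. Three methods of enterprise Web information domain analysis are given: the business-role-based method, the business-process-based method, and the business-subject-based method. They describe the enterprise's demands on Web information sources during its business activities from the spatial, temporal, and subject perspectives respectively, and cover the characteristics of the enterprise's business activities fairly comprehensively. These methods tie theory closely to practice and are highly practicable. The management mechanism for WIS is also discussed in depth.
6. Based on IBMWIR, it designs and develops WebCapture, a prototype Web information retrieval system for enterprise Intranets. WebCapture consists of two main parts, client software and server software. With WebCapture, users within the Intranet can perform Web information retrieval. Adopting the Softbot structure improves the system's flexibility in processing and offers clear advantages for extending its functionality. Running examples of WebCapture show that an Intranet Web information retrieval system can meet the Web information needs of group users in a specific domain.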
English abstract: Since the birth of the first WWW software on the NeXT system in December 1990, WWW technologies and applications have grown tremendously worldwide and now affect almost every field of work and daily life. With so many WWW servers and so much Web information on the Internet, searching for information effectively is becoming more and more important. Following the development of the Internet and the WWW, Intranet deployment within enterprises is becoming popular. Web information searching based on an Intranet has some new characteristics: the actions are not individual behaviors but group behaviors, that is, interactions between a group of users in an enterprise and the Internet. The existing Web search tools cannot fully satisfy the requirements of these group users. To solve this problem completely, it is necessary to design a new Web information searching mechanism that can be used to set up a Web information search system for an Intranet. Developing effective Web search tools for the Internet/Intranet has recently been a research emphasis in software technology. As part of the National 9th Five-Year Plan key project "Pre- and Post-Processing Technology for Web Information Search", and of the task of developing the Web information search system for the "Golden Bridge" project, this thesis presents research on a Web information search methodology for Intranets using softbot (software robot) technology. The main contents of the thesis cover the following aspects.
1. The technology of existing Web search systems. From the technical point of view, existing Web search systems have gone through browser-directed surfing, Web search trees, Web search indexes, and meta-search engines. From the point of view of the users served, the focus has moved from individual users to geographically regional users and subject-specific users. The thesis argues that Web search tools will develop further toward greater intelligence and toward subject-specific group users.
2. Analysis of the status quo. Existing Web search systems are mainly oriented to individual users, and the content they supply is mostly general-purpose. Because of the sharp increase of Web information on the Internet, it is impossible to set up a Web search tool that stores all Web information. To overcome this problem, building subject-specific Web search systems is a trend. With the growing popularity of Intranets, developing Intranet-oriented Web search systems becomes more necessary.
3. Definition of softbot. The thesis studies in depth the softbot's components, functionality, and structure. The softbot's architecture is defined using BNF, and the running mechanism of a softbot is described. Based on these, a multi-softbot architecture (MSA) is presented, and the characteristics, classification, and interaction of MSA are analyzed. Communication primitives (CP) are used to exchange messages among the softbots.
4. Intranet-based Web information retrieval mode and Intranet-based model of Web information retrieval (IBMWIR). The Intranet-based Web information retrieval mode describes how Web information is retrieved and stored on an Intranet server to meet the requirements of the enterprise's activities. Based on this mode and using MSA, the Intranet-based model of Web information retrieval (IBMWIR) is designed. Within IBMWIR, the Web information subject word set is used to define the scope of the Web information search. Journalist softbots can be dispatched to or installed on remote servers in order to find Web information quickly and efficiently.
5. Web information domain analysis. The purpose of Web information domain analysis (WIDA) is to obtain the enterprise's Web information subjects. WIDA plays an important role in Intranet-oriented Web information search. Three WIDA methods are given: business-role-based WIDA, business-processing-based WIDA, and business-subject-based WIDA. These methods describe the Web information requirements of the business activities within the enterprise from the spatial, temporal, and subject views respectively.
6. WebCapture, an Intranet-oriented Web information search prototype system. Based on IBMWIR, WebCapture has been designed and developed. In WebCapture, users can search Web information quickly, because WebCapture has pre-fetched the Web information for them according to the Web information subject word set. WebCapture demonstrates the efficiency and convenience of such Web search actions, and it also verifies the validity of IBMWIR. The running instances show that an Intranet-oriented Web information search system can satisfy the requirements of the Intranet's group users well.
Language: Chinese
Content type: Dissertation
URI: http://ir.iscas.ac.cn/handle/311060/7532
Appears in Collections: Institute of Software, Chinese Academy of Sciences

Files in This Item:
File Name/ File Size Content Type Version Access License
LW002839.pdf (2724KB) -- -- Restricted access -- Contact to obtain full text

Recommended Citation:
Liu Ruihong. Research on Softbot-Based Web Information Retrieval Methods (基于Softbot的Web信息获取方法研究) [D]. Institute of Software, Chinese Academy of Sciences. 1998-01-01.

Items in IR are protected by copyright, with all rights reserved, unless otherwise indicated.
Copyright © 2007-2017 Institute of Software, Chinese Academy of Sciences