WebDMS: A Web-Based Document Management System

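Title	WebDMS: A Web-Based Document Management System
Author	Wen Fucai (温福才)
Major	Computer Software and Theory
Year	2000
Degree-granting institution	Institute of Software, Chinese Academy of Sciences
Degree	Doctorate
Place of conferral	Institute of Software, Chinese Academy of Sciences
Keywords	Web-based document management system; text-image mapping; information extraction; adaptive content delivery; Web technology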
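Abstract	As Web technology matures and grows in popularity, traditional document management systems (DMS) need architectural changes to follow this trend, which has given rise to the concept of the Web-based document management system. In addition, many of the documents managed by a traditional DMS are scanned images of paper documents; in a Web-based DMS, when an image document is displayed on the client, users must be able to "navigate" it as they would an HTML page. Meanwhile, as HTML documents proliferate, more and more of them are being archived. To integrate and query the irregular, dynamic information in these HTML documents in a database-like fashion, the information on each page must be extracted into an XML-like structure that supports efficient retrieval. Given the capabilities of client devices, network bandwidth, and user preferences, content must also be adapted to these conditions when multimedia document information is transmitted. Building on a summary and analysis of traditional document management systems, this thesis identifies the problems above and presents our solution: the prototype system WebDMS, a Web-based document management system. For image "navigation", we use a text-image mapping method. For extracting structural information from HTML documents, we use an extraction algorithm that combines heuristic rules with a data extraction format. For adapted content delivery, we use adaptive content delivery, whose framework comprises content adaptation algorithms, methods for discovering client capability and network bandwidth, and a decision engine. The three subsystems operate largely independently, and the system is portable and extensible.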
Alternative abstract	With the maturity and popularity of Web technology, the traditional Document Management System (DMS) must change its architecture to adapt to the trend, resulting in the concept of the Web-based Document Management System. In addition, much of what a traditional DMS manages consists of images scanned from paper media; in a Web-based DMS environment, an image document displayed on a client device should, like an HTML page, allow users to navigate it. At the same time, with the proliferation of documents in HTML format, many HTML documents are being archived. To integrate and query the irregular and dynamic information on these pages in a database-like fashion, their structural information needs to be extracted into an XML-like structure to improve query performance. Taking into account the capability of the client device, network bandwidth, and user preference, the content of a multimedia document must be adapted when it is delivered. Based on a summary and analysis of traditional DMSs, this thesis points out the above-mentioned problems and provides our solution to them, the prototype system WebDMS. We resolve the problem of image "navigation" by adopting the method of text-image mapping. For the problem of extracting structural information from HTML documents, we use an extraction algorithm that combines heuristic rules with a format description for data extraction. The problem of document content adaptation and delivery is resolved by adaptive content delivery, whose framework includes a content adaptation algorithm, client capability and network bandwidth discovery methods, and a Decision Engine for determining when and how to adapt content. These subsystems are independent, so the system is highly portable and extensible.
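Pages	51
Language	Chinese
Content type	Thesis
URI	http://ir.iscas.ac.cn/handle/311060/6696
Collection	Institute of Software, CAS_Institute of Software, CAS
Recommended citation (GB/T 7714)	Wen Fucai. WebDMS: A Web-Based Document Management System [D]. Institute of Software, Chinese Academy of Sciences. Institute of Software, Chinese Academy of Sciences, 2000.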

Files in this item
File name/size	Document type	Version type	Access type	License
LW002156.pdf (2345KB)			Restricted access	--