Institutional Repository
| Title | XML文档的有效性验证和查询实现 |
| Alternative Title | Research on the Implementation of XML Validation and Query |
| Author | 戴蓓洁 |
| Date | 2007-06-03 |
| Degree Grantor | Institute of Software, Chinese Academy of Sciences |
| Degree Level | Doctoral |
| Place of Degree Grantor | Institute of Software |
| Keyword | XML processor, validation, XPath, XML query, performance testing |
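| English Abstract | XML (eXtensible Markup Language) is a markup language defined by the W3C. It is now widely used in e-commerce, B2B communication, enterprise information integration, Web services, and other applications, and has become one of the basic means of organizing, storing, and exchanging information in networked environments. As the range of XML applications widens, the performance demands on XML parsing keep rising. Building on the existing ONCE XML Parser, this thesis studies the characteristics of DTD (Document Type Definition)-based validation and of XML query languages, and implements ONCE XML Processor 1.0, which supports DTD-based validation and document queries conforming to the XML Path Language 1.0 specification. In its design, ONCE XML Processor 1.0 adopts a lightweight system architecture together with effective, practical data structures and algorithms, giving the system good configurability and extensibility. ONCE XML Processor 1.0 is also optimized for performance at several levels, including system structure, implementation flow, and the programming language level; strategies based on statistical regularities, an optimized automaton implementation, and sensible resource allocation improve its performance. The validation in ONCE XML Processor 1.0 fully passes the XML/API conformance tests provided by the W3C: over more than two thousand XML test documents, our test program automatically checks whether the processor's validation behavior conforms to the XML specification. On the XML Test 1.1 suite provided by Sun, the validation performance of ONCE XML Processor 1.0 is on average about 40% higher than that of Xerces 2.9.0 and Woodstox 3.2.0. The document query implementation also passes the specification's functional correctness tests and is consistently more than twice as fast as Xalan-J 2.7.0. This shows that ONCE XML Processor 1.0 delivers efficient XML document processing while remaining functionally complete. |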
| Abstract | XML (eXtensible Markup Language) is a markup language recommended by the W3C (World Wide Web Consortium). It is widely used in e-business, B2B communication, enterprise information integration, Web services, and similar settings, and has become one of the fundamental methods of organizing, storing, and exchanging information in networked environments. As XML applications multiply, parsing performance has become a challenge for most XML processors. This thesis describes the design and implementation of ONCE XML Processor 1.0, which builds on our earlier ONCE XML Parser. After analyzing the characteristics of DTD-based validation and of XML query languages, we implement the validity constraints and provide query APIs conforming to XPath 1.0 (the XML Path Language 1.0 specification). ONCE XML Processor 1.0 adopts a lightweight system architecture and uses effective data structures and algorithms, which make the system configurable and extensible. We also optimize its performance through a series of strategies: statistics-based implementation, an optimized automaton implementation, reasonable allocation of resources, and improvements at the programming-language level. The validation module of ONCE XML Processor 1.0 passes all the conformance tests provided by the W3C. Our test suite automatically runs more than 2,000 conformance test cases, and the results show that the validation in ONCE XML Processor 1.0 conforms to the XML specification. We also use XML Test 1.1 from Sun to compare the performance of our validation module with that of two other popular XML processors, Xerces 2.9.0 and Woodstox 3.2.0. The results show that the validation performance of ONCE XML Processor 1.0 is on average about 40% higher than that of Xerces and Woodstox. We have also tested the XPath module of ONCE XML Processor 1.0: it passes all the functional tests, and its XML query performance is more than twice that of Xalan-J 2.7.0. ONCE XML Processor 1.0 is therefore efficient at processing XML documents while remaining functionally complete. |
| Pages | 72 |
| Language | Chinese |
| Content Type | Thesis |
| URI | http://ir.iscas.ac.cn/handle/311060/5908 |
| Collection | Institute of Software, Chinese Academy of Sciences |
| Recommended Citation GB/T 7714 | 戴蓓洁. XML文档的有效性验证和查询实现[D]. 软件研究所. 中国科学院软件研究所,2007. |
| Files in This Item: |
| File Name/Size | DocType | Version | Access | License |
| 10001_20042801502901 (1316KB) | | | Restricted access | -- |
Items in the repository are protected by copyright, with all rights reserved, unless otherwise indicated.