| XML (Extensible Markup Language) is a markup language widely used for data transfer on the Web, data integration, document storage, and similar tasks. DOM (Document Object Model) is an XML processing model defined by the W3C. Because of its simplicity, precise definition, and independence from any specific XML processor, the model has grown steadily more popular since its release. Quite a few XML parsers in industry now implement the DOM specification, and JDK (Java Development Kit) 1.4 provides classes that support DOM in JAXP (Java API for XML Processing). This thesis describes OnceDOMParser, a DOM-compatible XML parser that aims at high performance based on an analysis of the DOM model. Two main rules are followed to improve performance: reducing the number of small objects in the JVM (Java Virtual Machine) and loading data lazily. We designed and implemented a structure named "User Heap" that stores small objects, which decreases the overhead the JVM incurs in managing a large number of such objects. We adopted a "Compact Storage" mechanism that stores multiple collection objects in one array, which reduces both the number of arrays and the space wasted on unused collection capacity. Additionally, we apply a lazy-load policy to the frequently used search operations in DOM, which delays fetching data until the user explicitly queries it. To test the functionality of OnceDOMParser, this thesis presents an XML conformance test suite, which is built on the JUnit framework and tests the conformance of the parser to the XML specification by checking more than two thousand XML documents. 
This thesis also presents a DOM conformance test suite to test the conformance of the parser to the DOM specification. Finally, the thesis uses XML Test, a benchmark suite for XML processors provided by Sun Microsystems, to compare the performance of OnceDOMParser with that of Xerces, a well-known XML parser. The benchmark results show that OnceDOMParser performs about 10% better than Xerces, which demonstrates the effectiveness of OnceDOMParser's design and implementation. |
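
The abstract does not give the internal layout of the "Compact Storage" mechanism, so the following Java sketch is only one hypothetical way to realize the stated idea of keeping many collections in a single array. The class name `CompactChildStore`, its methods, and the use of integer node ids are all assumptions made for illustration and are not taken from OnceDOMParser.

```java
import java.util.Arrays;

/**
 * Hypothetical sketch: the child lists of many nodes share one int array
 * instead of each node owning its own collection object.
 */
final class CompactChildStore {
    private int[] slots = new int[16]; // shared backing array for every child list
    private int used = 0;              // number of slots handed out so far
    private final int[] start;         // per node: offset of its list inside slots
    private final int[] count;         // per node: number of children stored
    private final int[] capacity;      // per node: slots reserved for its list

    CompactChildStore(int nodeCount) {
        start = new int[nodeCount];
        count = new int[nodeCount];
        capacity = new int[nodeCount];
    }

    /** Appends a child id to the list of the given parent node. */
    void addChild(int parent, int child) {
        if (count[parent] == capacity[parent]) {
            relocate(parent);
        }
        slots[start[parent] + count[parent]++] = child;
    }

    /** Moves a full list to the end of the shared array with doubled capacity. */
    private void relocate(int parent) {
        int newCapacity = Math.max(2, capacity[parent] * 2);
        if (used + newCapacity > slots.length) {
            slots = Arrays.copyOf(slots, Math.max(slots.length * 2, used + newCapacity));
        }
        System.arraycopy(slots, start[parent], slots, used, count[parent]);
        start[parent] = used; // the old region is simply abandoned in this sketch
        capacity[parent] = newCapacity;
        used += newCapacity;
    }

    int childCount(int parent) {
        return count[parent];
    }

    int childAt(int parent, int index) {
        if (index < 0 || index >= count[parent]) {
            throw new IndexOutOfBoundsException("index " + index);
        }
        return slots[start[parent] + index];
    }

    public static void main(String[] args) {
        CompactChildStore store = new CompactChildStore(2);
        store.addChild(0, 1);
        store.addChild(1, 5);
        store.addChild(0, 2);
        System.out.println("node 0 has " + store.childCount(0) + " children");
    }
}
```

In this sketch the JVM manages four arrays in total regardless of how many nodes exist, rather than one collection object per node. A real implementation would also need to reclaim the regions abandoned by `relocate`, which is omitted here.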