Research of Data Mining Based on XML Database

Posted on: 2007-03-22
Degree: Master
Type: Thesis
Country: China
Candidate: Z Chen
GTID: 2178360242961842
Subject: Computer application technology
Abstract/Summary:
XML is quickly becoming the pervasive standard for exchanging and representing a wide variety of data on the Web and elsewhere. At present, most methods for mining XML data map the data to the relational data model and then apply techniques designed for relational databases. Because this mapping requires pre-processing or post-processing around the mining step, these methods are very complex. Hence there is increasing demand for efficient methods that extract rules and patterns directly from XML data.

To mine XML documents without pre-processing or post-processing, we consider storing the XML data in an XML database, that is, a database designed specifically to store and process XML data. Based on the XML database, we analyse and combine traditional data mining technologies with XML-related technologies, and put forward a method for mining association rules on an XML database. The method has four fundamental steps: data of various types are converted, using DOM and Schema technology, into XML documents that are well defined in structure, content, and semantics; any such XML document can be stored in the XML database; association rules are mined from the stored documents by implementing a well-known mining algorithm directly in XQuery, without any pre-processing or post-processing; and the extracted association rules are stored in the XML database and displayed using XSL.

The mining process is the core of the whole method. A group of XML documents cannot be mined by implementing the existing mining algorithms in XQuery. To solve this problem, the well-known partition mining algorithm, which is based on partition theory, is adopted. The local large itemsets are generated from every XML document; these large itemsets are then merged to form the global candidate itemsets; finally, the global large itemsets are extracted from the global candidate itemsets. In the end, we present an experiment demonstrating that extracting association rules from XML documents with this method is feasible.
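The two-phase partition scheme described above can be sketched as follows. This is only an illustrative sketch: the thesis implements its mining in XQuery against an XML database, whereas this example uses Python with the standard-library XML parser. The `<transactions>/<transaction>/<item>` document layout, the function names, and the sample data are all assumptions made for the example and do not come from the thesis.

```python
from itertools import combinations
import xml.etree.ElementTree as ET


def read_transactions(xml_text):
    """Parse one XML document into a list of transactions (sets of item names).

    The <transactions>/<transaction>/<item> layout is a hypothetical schema.
    """
    root = ET.fromstring(xml_text)
    return [
        frozenset((item.text or "").strip() for item in t.findall("item"))
        for t in root.findall("transaction")
    ]


def local_large_itemsets(transactions, min_support):
    """Level-wise (Apriori-style) search for itemsets frequent in one partition."""
    threshold = min_support * len(transactions)
    level = {frozenset([item]) for t in transactions for item in t}
    result = set()
    while level:
        frequent = {
            c for c in level
            if sum(1 for t in transactions if c <= t) >= threshold
        }
        result |= frequent
        # Join pairs of frequent k-itemsets into (k+1)-item candidates.
        level = {
            a | b for a, b in combinations(frequent, 2)
            if len(a | b) == len(a) + 1
        }
    return result


def partition_mine(documents, min_support):
    """Treat each XML document as one partition and return global large itemsets."""
    partitions = [read_transactions(doc) for doc in documents]

    # Phase 1: merge the local large itemsets into the global candidate set.
    candidates = set()
    for part in partitions:
        candidates |= local_large_itemsets(part, min_support)

    # Phase 2: count every candidate over all partitions and keep the large ones.
    total = sum(len(part) for part in partitions)
    counts = {
        c: sum(1 for part in partitions for t in part if c <= t)
        for c in candidates
    }
    return {c: n for c, n in counts.items() if n >= min_support * total}


# Hypothetical sample data: two small XML documents, two transactions each.
DOC_A = (
    "<transactions>"
    "<transaction><item>bread</item><item>milk</item></transaction>"
    "<transaction><item>bread</item><item>butter</item></transaction>"
    "</transactions>"
)
DOC_B = (
    "<transactions>"
    "<transaction><item>bread</item><item>milk</item></transaction>"
    "<transaction><item>milk</item></transaction>"
    "</transactions>"
)

if __name__ == "__main__":
    for itemset, count in partition_mine([DOC_A, DOC_B], 0.5).items():
        print(sorted(itemset), count)
```

The merge step is safe because any itemset that meets the minimum support over all documents must meet the same relative support in at least one document, so it always appears among the global candidates. The second pass then removes candidates that are large only locally.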
Keywords/Search Tags: XML Database, Data Mining, Association Rules