| With the rapid growth of the Internet, the Web has become the densest and most abundant source of information. Finding the information that users are interested in within such large volumes of data has therefore attracted increasing attention. Web mining is an effective technology for extracting useful patterns and information. XML is well suited to transporting structured data because it is extensible, structured, and effective, so combining XML with Web mining offers a solution to information extraction. First, we describe the research background of Web mining and introduce the relevant concepts of data mining and Web mining. We also show that XML is superior to HTML. Second, we explain in detail how to implement Web content mining and develop a Web mining technique based on HTML, a semi-structured data model, XML, and Java. We transform Web pages into XML documents and extract useful information from the XML by selecting a reliable data source and anchors. Finally, we study the problem of mining structured data. We use labeled, ordered trees as the data model and present a method for mining frequent induced subtrees from ordered trees to help people acquire useful information. |