Web Data Mining Based on BP Neural Network

Posted on: 2011-12-04 | Degree: Master | Type: Thesis
Country: China | Candidate: W H Gao | Full Text: PDF
GTID: 2178330338478366 | Subject: Computer application technology
Abstract/Summary:
Web text mining is one branch of Web data mining. Its most important task is classifying document content effectively: the higher the classification precision, the closer Web query results come to the ideal and the better they satisfy users in practice, so designing an efficient classification algorithm is very important. Among the many classification algorithms, the BP (backpropagation) neural network has been widely used for four reasons: it recasts the input/output problem for a set of samples as a nonlinear problem; it can realize a nonlinear input/output mapping; it is a universal approximation network; and it has generalization ability, among other characteristics.

This thesis introduces the development and application of Web data mining and discusses its content and the algorithms used. Building on an overview of previous work, it studies the algorithms related to Web document classification in Web content mining, discusses the BP neural network algorithm in depth, and on that basis proposes an improved BP neural network algorithm. The improved algorithm is applied to Web document classification in a multi-subnet parallel topology, and experimental results show that its performance improves significantly. The specific research content covers the following aspects:

(1) The research background and significance are described. The concept of data mining, its classification, the mining algorithms in use, and the data mining process are discussed in detail.

(2) The content, classification, and process of Web data mining are described, with particular attention to Web content mining, its process, and related algorithms. In Web content mining, the most important step is classifying document content.

(3) Neural network algorithms are described, especially the BP neural network algorithm. Its concept, principle, topological structure, strengths, weaknesses, and other characteristics are discussed in detail. To address the algorithm's slow convergence and large error, an improved BP neural network is proposed that increases the convergence rate and reduces error.

(4) The multi-subnet parallel neural network algorithm enhances BP performance by optimizing the topology of the ordinary three-layer BP neural network. The thesis combines the improved BP algorithm with a multi-subnet parallel topology, which greatly improves classification ability. Experimental results show that the multi-subnet parallel BP algorithm significantly improves Web document classification.
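
The ideas in items (3) and (4) can be illustrated with a minimal sketch. It is not the thesis's algorithm. The abstract does not say how the improved BP algorithm speeds up convergence or how the subnets are combined, so the sketch swaps in two common textbook choices as assumptions: a momentum term in the weight update, and one single-output subnet per class (one-vs-rest) whose scores are compared to pick the class. The class BPNet, the helper functions, the toy vocabulary, and all parameter values are hypothetical and chosen only for illustration.

```python
import math
import random


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


class BPNet:
    """Three-layer (input, hidden, output) network trained by backpropagation.

    Assumption: the momentum term stands in for the thesis's unspecified improvement.
    """

    def __init__(self, n_in, n_hid, n_out, lr=0.5, momentum=0.5, seed=0):
        rng = random.Random(seed)
        # Each row holds one unit's weights; the last entry is the bias.
        self.w1 = [[rng.uniform(-0.5, 0.5) for _ in range(n_in + 1)] for _ in range(n_hid)]
        self.w2 = [[rng.uniform(-0.5, 0.5) for _ in range(n_hid + 1)] for _ in range(n_out)]
        # Previous weight changes, reused by the momentum term.
        self.v1 = [[0.0] * (n_in + 1) for _ in range(n_hid)]
        self.v2 = [[0.0] * (n_hid + 1) for _ in range(n_out)]
        self.lr, self.momentum = lr, momentum

    def forward(self, x):
        xb = list(x) + [1.0]  # append constant bias input
        hid = [sigmoid(sum(w * v for w, v in zip(row, xb))) for row in self.w1]
        hb = hid + [1.0]
        out = [sigmoid(sum(w * v for w, v in zip(row, hb))) for row in self.w2]
        return xb, hb, out

    def predict(self, x):
        return self.forward(x)[2]

    def train_one(self, x, target):
        xb, hb, out = self.forward(x)
        # Output deltas: error times the sigmoid derivative.
        d_out = [(t - o) * o * (1.0 - o) for t, o in zip(target, out)]
        # Hidden deltas: output deltas propagated back through w2 (before w2 changes).
        d_hid = [hb[j] * (1.0 - hb[j]) * sum(d_out[k] * self.w2[k][j] for k in range(len(d_out)))
                 for j in range(len(hb) - 1)]
        self._update(self.w2, self.v2, d_out, hb)
        self._update(self.w1, self.v1, d_hid, xb)
        return 0.5 * sum((t - o) ** 2 for t, o in zip(target, out))

    def _update(self, w, v, deltas, inputs):
        for k, d in enumerate(deltas):
            for j, a in enumerate(inputs):
                v[k][j] = self.lr * d * a + self.momentum * v[k][j]
                w[k][j] += v[k][j]


def train(net, samples, epochs=1000):
    """Return the summed squared error recorded for each epoch."""
    return [sum(net.train_one(x, t) for x, t in samples) for _ in range(epochs)]


# Toy term-count vectors over a hypothetical vocabulary: goal, match, stock, market.
# Label 0 = sports, label 1 = finance.
DOCS = [([1, 1, 0, 0], 0), ([1, 0, 0, 0], 0), ([0, 1, 0, 0], 0),
        ([0, 0, 1, 1], 1), ([0, 0, 1, 0], 1), ([0, 0, 0, 1], 1)]
N_CLASSES = 2

# One single-output subnet per class. The subnets share no weights, so they
# could be trained in parallel; here they are trained one after another.
subnets, histories = [], []
for c in range(N_CLASSES):
    net = BPNet(n_in=4, n_hid=3, n_out=1, seed=c)
    samples = [(x, [1.0 if label == c else 0.0]) for x, label in DOCS]
    histories.append(train(net, samples))
    subnets.append(net)


def classify(x):
    """Pick the class whose subnet gives the highest score."""
    scores = [net.predict(x)[0] for net in subnets]
    return scores.index(max(scores))


predictions = [classify(x) for x, _ in DOCS]
print(predictions)
```

Each entry in histories is the per-epoch error of one subnet, so comparing its first and last values gives a rough view of convergence on this toy data. A real Web document classifier would need a proper feature extraction step and held-out test data, neither of which is shown here.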
Keywords/Search Tags: Data Mining, BP Neural Network, Web Content Mining, Web Document Classification, Topology Structure