| With the constant progress of network technology, weather forecasting occupies an increasingly important position in daily life, because the weather affects many everyday activities. People who work at sea and in coastal areas, in particular, can learn from the forecast whether a typhoon or heavy rain is coming and so reduce unnecessary losses. How to obtain and use weather data efficiently has therefore become a great challenge, and the web crawler emerged to meet this need. A web crawler is a software program that carries out a task on the user's behalf, without interruption and at a speed that humans cannot reach. First, this paper introduces the origin, development history, working principle, and application fields of web crawlers, and analyzes traditional mainstream web crawlers to show how they extract data from the network. Second, it implements a web crawler consisting of six modules: the initial seed collection module, the crawl module, the content analysis module, the data processing module, the data capture module, and the data storage module. Based on the traditional web crawler, it captures data through the open API of the China Weather Network, uses a depth-first search strategy to analyze and extract that site's weather data, and combines the web crawler with a backtracking algorithm, a combination that works well in practice. 
The backtracking algorithm provides exception handling for cases where the web crawler fails to fetch the weather data or fetches a null value. Using this algorithm not only greatly optimizes the crawler's functionality but also greatly improves the efficiency and accuracy of weather data capture. Finally, the web crawler application captures weather data from the open interface, parses and extracts it using network data extraction methods, and saves it to a MySQL database for future data mining and historical weather research. |
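
As an illustration only, the depth-first strategy with backtracking on failed or null fetches might be sketched as follows. This is one possible reading of the approach, not the paper's implementation: the region hierarchy `REGION_TREE`, the function `crawl_depth_first`, and the stand-in `fake_fetch` are all hypothetical names, and no real weather API is called.

```python
from typing import Callable, Dict, List, Optional, Tuple

# Hypothetical region hierarchy: each region code maps to its child codes.
REGION_TREE: Dict[str, List[str]] = {
    "root": ["north", "south"],
    "north": ["city_a", "city_b"],
    "south": ["city_c"],
    "city_a": [],
    "city_b": [],
    "city_c": [],
}


def crawl_depth_first(
    seed: str,
    tree: Dict[str, List[str]],
    fetch: Callable[[str], Optional[dict]],
) -> Tuple[Dict[str, dict], List[str]]:
    """Visit regions depth-first, backing out of any branch whose fetch fails."""
    records: Dict[str, dict] = {}
    failed: List[str] = []
    visited = set()

    def visit(code: str) -> None:
        if code in visited:
            return
        visited.add(code)
        try:
            data = fetch(code)
        except Exception:
            # Treat any fetch error the same way as a null result.
            data = None
        if data is None:
            # Backtrack: record the failure, abandon this branch, and let
            # the caller continue with the next sibling.
            failed.append(code)
            return
        records[code] = data
        for child in tree.get(code, []):
            visit(child)

    visit(seed)
    return records, failed


def fake_fetch(code: str) -> Optional[dict]:
    """Stand-in for a real API call (assumption: returns a dict or None)."""
    if code == "city_b":
        raise ConnectionError("simulated network failure")
    if code == "south":
        return None  # simulated null response
    return {"region": code, "temp_c": 20}


records, failed = crawl_depth_first("root", REGION_TREE, fake_fetch)
```

In this sketch a failed region is only recorded, so a later pass could retry it; whether the original crawler retries, skips, or substitutes another source is not stated in the text above.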