Research On The Thunderstorm Data Clustering And Thunderstorm Prediction Model Based On The Hadoop Platform

Posted on: 2015-03-08
Degree: Master
Type: Thesis
Country: China
Candidate: F Ji
GTID: 2180330467483302
Subject: Meteorological information technology and security
Abstract/Summary:
The meteorological industry generates massive volumes of data, and predicting the weather or the climate requires a great deal of computation. With the rapid development of science and technology, we have entered the era of big data. Meteorological data are both huge and complex: as new types of radar and higher-resolution observation techniques come into use, the volume of meteorological data keeps growing and the number of data types keeps increasing. Traditional data mining techniques and methods can no longer meet the storage and processing needs of these data. The emergence of cloud computing provides a new way to process meteorological data, and integrating the two can make up for some of the deficiencies that exist in the meteorological industry.

Thunderstorms are a severe weather phenomenon that poses a serious threat to people's daily lives. Predicting thunderstorms is therefore of great significance, as is taking action to prevent thunderstorm disasters. Based on the NCEP reanalysis data set, lightning data for Jiangsu Province, and thunderstorm disaster news, this paper makes the following contributions:

(1) The Thundercloud platform is built on a Hadoop cluster. This paper proposes the MRKM (MapReduce K-means) algorithm to cluster the thunderstorm news. The algorithm is implemented with two map functions, one combiner function, and one reduce function. In the experiment, the thunderstorm news is grouped into four clusters; we analyze the keywords of each cluster to learn the distribution of thunderstorms and offer some suggestions for preventing thunderstorm disasters.

(2) This paper proposes a new algorithm called MRNB (MapReduce Naive Bayes), which is composed of three MapReduce jobs. To test the effectiveness of the new algorithm, we compare it with a traditional weather forecasting method, Fisher discriminant analysis.
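The map, combiner, and reduce structure described for MRKM in item (1) can be illustrated with a minimal sketch of one K-means iteration written in plain Python rather than on Hadoop. The abstract does not say how the work is divided between the two map functions, so this sketch uses a single assignment map; the function names (`assign_map`, `combine`, `reduce_centroids`, `kmeans_iteration`) and the use of Euclidean distance on numeric vectors are assumptions for illustration, not the thesis's implementation.

```python
from math import dist  # Euclidean distance, Python 3.8+


def assign_map(point, centroids):
    """Map step: emit (index of nearest centroid, (point, 1))."""
    nearest = min(range(len(centroids)), key=lambda i: dist(point, centroids[i]))
    return nearest, (point, 1)


def combine(pairs):
    """Combiner step: sum vectors and counts per centroid within one split."""
    partial = {}
    for key, (vec, count) in pairs:
        if key in partial:
            acc, n = partial[key]
            partial[key] = ([a + v for a, v in zip(acc, vec)], n + count)
        else:
            partial[key] = (list(vec), count)
    return list(partial.items())


def reduce_centroids(pairs, old_centroids):
    """Reduce step: merge partial sums and compute the updated centroids."""
    merged = dict(combine(pairs))
    updated = []
    for i, old in enumerate(old_centroids):
        if i in merged:
            total, n = merged[i]
            updated.append(tuple(x / n for x in total))
        else:
            updated.append(tuple(old))  # empty cluster: keep the old centroid
    return updated


def kmeans_iteration(splits, centroids):
    """Run one iteration; each split stands in for one mapper's input."""
    combined = []
    for split in splits:
        mapped = [assign_map(p, centroids) for p in split]
        combined.extend(combine(mapped))
    return reduce_centroids(combined, centroids)
```

In this sketch the combiner reduces the data passed to the reduce step to one partial sum per cluster per split, which is the usual reason for adding a combiner to a MapReduce K-means job. Clustering news text would additionally require converting each article to a numeric vector, a step this sketch leaves out.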
The experiments show that the MRNB algorithm achieves higher accuracy, a higher critical success index (CSI), and a lower false alarm ratio (FAR) than Fisher discriminant analysis.
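The abstract does not give formulas for the scores it reports. As a hedged illustration, the sketch below uses the standard forecast-verification definitions based on a 2x2 contingency table, and it assumes that FAR means the false alarm ratio. The function name `verification_scores` and the counts in the usage example are hypothetical and are not results from the thesis.

```python
def verification_scores(hits, misses, false_alarms, correct_negatives):
    """Compute accuracy, CSI, and FAR from 2x2 contingency-table counts.

    hits:              thunderstorm forecast and observed
    misses:            thunderstorm observed but not forecast
    false_alarms:      thunderstorm forecast but not observed
    correct_negatives: no thunderstorm forecast and none observed
    Assumes every denominator below is nonzero.
    """
    total = hits + misses + false_alarms + correct_negatives
    return {
        "accuracy": (hits + correct_negatives) / total,
        "csi": hits / (hits + misses + false_alarms),
        "far": false_alarms / (hits + false_alarms),
    }


# Hypothetical counts, chosen only to show the arithmetic.
scores = verification_scores(hits=30, misses=10, false_alarms=10, correct_negatives=50)
```

A higher CSI and a lower FAR both indicate a better forecast. Unlike accuracy, CSI ignores correct negatives, which matters for rare events such as thunderstorms.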
Keywords/Search Tags:Data Mining, Naive Bayes, K-means, Hadoop, MapReduce, Thunderstorm