
The Research Of Meteorological Data Mining Based On Hadoop

Posted on: 2022-09-04
Degree: Master
Type: Thesis
Country: China
Candidate: P Q Tai
Full Text: PDF
GTID: 2480306566974849
Subject: Master of Engineering
Abstract/Summary:
The ever-expanding body of meteorological data holds great scientific and economic value, and the level of informatization in meteorology keeps rising. Traditional data processing methods are increasingly inadequate for data at this scale, and the emergence of cloud computing offers a new approach to analyzing massive meteorological data. Apache Hadoop is derived from Google's cloud computing platform. Its distributed file system, HDFS, and its distributed computing model, MapReduce, are well suited to the storage and computing requirements of massive meteorological data. Finding an appropriate data mining algorithm, combining it with Hadoop, and applying it to meteorological data mining has therefore become a hot research topic. In this thesis, Hadoop and an improved Bayesian network are combined to design a weather prediction classification method that runs on Hadoop, and a Hadoop cluster is built on virtual machines. Experiments verify the feasibility of applying Hadoop to meteorological data mining and weather prediction, and the performance of the method is analyzed. The main contents are:
(1) The cluster architecture and working principles of the HDFS distributed file system and the MapReduce programming model are analyzed.
(2) Based on the correlation among attribute variables and the discreteness of the data in the meteorological data set, a Bayesian network classifier based on prediction ability is proposed. It addresses the loss of accuracy that the naive Bayes classifier suffers when attribute variables are highly correlated.
(3) Taking the characteristics of meteorological data mining into account, a concrete implementation of the prediction-ability-based Bayesian classifier on the Hadoop platform is given, including the MapReduce implementation of the pre-processing stage, the correlation analysis stage, the classifier construction stage, and the precision evaluation stage.
(4) On a Hadoop cluster built on virtual machines, the training set is used to train the prediction-ability-based Bayesian classifier. The directed acyclic graph obtained from training clearly and accurately expresses the dependencies among the attribute variables in the meteorological data set. The data in the test set are then classified with the trained model, and the results are compared with those of the naive Bayes classifier.
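
To make the naive Bayes baseline mentioned above more concrete, the following is a minimal sketch of how its counting step might be phrased in a map/reduce style. It is not the thesis's implementation and not the improved, prediction-ability-based classifier: it runs in a single Python process rather than on Hadoop, and the function names (`map_record`, `reduce_counts`, `train`, `predict`), the toy records, and the use of Laplace smoothing are all assumptions made for illustration.

```python
from collections import defaultdict
from math import log


def map_record(record):
    """Map step (hypothetical): emit count keys for one (attributes, label) record."""
    attributes, label = record
    yield ("class", label), 1
    for index, value in enumerate(attributes):
        yield ("attr", label, index, value), 1


def reduce_counts(pairs):
    """Reduce step (hypothetical): sum the values emitted for each key."""
    totals = defaultdict(int)
    for key, value in pairs:
        totals[key] += value
    return dict(totals)


def train(records):
    """Run the map step over every record, then reduce to a table of counts."""
    pairs = (pair for record in records for pair in map_record(record))
    return reduce_counts(pairs)


def predict(counts, attributes, domain_sizes):
    """Return the label with the highest Laplace-smoothed log posterior.

    domain_sizes[i] is the number of distinct values attribute i can take.
    """
    class_counts = {key[1]: n for key, n in counts.items() if key[0] == "class"}
    total = sum(class_counts.values())
    best_label, best_score = None, float("-inf")
    for label, n in class_counts.items():
        score = log(n / total)
        for index, value in enumerate(attributes):
            hits = counts.get(("attr", label, index, value), 0)
            score += log((hits + 1) / (n + domain_sizes[index]))
        if score > best_score:
            best_label, best_score = label, score
    return best_label


# Made-up discretized records: (humidity, pressure) -> weather label.
TOY_RECORDS = [
    (("high", "low"), "rain"),
    (("high", "low"), "rain"),
    (("high", "normal"), "rain"),
    (("low", "normal"), "dry"),
    (("low", "high"), "dry"),
    (("low", "normal"), "dry"),
]

if __name__ == "__main__":
    model = train(TOY_RECORDS)
    # Humidity has 2 possible values, pressure has 3.
    print(predict(model, ("high", "low"), domain_sizes=[2, 3]))
```

Because each record contributes independent counts, the map step can in principle be spread across many workers and the reduce step only has to add integers, which is why this kind of counting fits the MapReduce model. The sketch treats attributes as conditionally independent given the label, which is exactly the assumption the abstract says breaks down for highly correlated meteorological variables.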
Keywords/Search Tags:Meteorological data, Hadoop, Data Mining, Bayesian Network