
Analysis and Prediction of Imbalanced Weather Data Based on the Branch-and-Bound Algorithm

Posted on: 2017-05-24
Degree: Master
Type: Thesis
Country: China
Candidate: J H Wang
Full Text: PDF
GTID: 2180330485469648
Subject: Software engineering
Abstract/Summary:
Weather is one of the most important environmental conditions for human social activity. It directly affects production, construction and management across the sectors of the national economy, as well as people's daily lives, so meteorological analysis bears on both the national economy and people's livelihood. The arrival of the big-data era has driven rapid development in data mining technology. Within this field, classification based on association rules has become a research hotspot in intelligent decision making because of its strong interpretability and high classification accuracy. How to use meteorological data to uncover meteorological laws, and thereby better understand how weather forms and how to predict it, has long concerned researchers in China and abroad.

Weather data analysis is a two-class classification problem on imbalanced data sets, in which the prediction of rainy days is of greater interest. Traditional two-class data mining methods mostly assume that positive and negative samples are roughly balanced, and they often fail to achieve satisfactory results when applied directly to imbalanced data sets. In addition, traditional association rule mining works at the granularity of whole attributes, so the resulting rules are coarse. For large-scale meteorological data sets, how to handle data imbalance and build a finer-grained, more parallelizable classification and prediction model is an open scientific problem in meteorological analysis and mining.

Based on the characteristics of weather data, this thesis proposes a modified method based on cost-sensitive learning that uses rainfall per unit time as the learning cost, so that the data are divided reasonably and effectively into rain and no-rain classes. The attribute values are discretized and binary coded to reduce the dimensionality of the data. A logical association rule mining method based on branch-and-bound is then applied: following the OCAT (One Clause At a Time) principle, the encoded data set is trained iteratively to obtain an association-rule-based classifier. The thesis also analyzes ways to improve the performance of the algorithm, the possibility of parallel computation, and the key parameters involved.

Experimental results show that the logical classification model based on association rules yields intuitive, easy-to-understand results and performs well on weather data. The model offers high accuracy and stability and is easy to parallelize, so it can better meet current requirements for real-time meteorological computation and analysis. Because the classifier is built on mathematical logic, the model can be further optimized through additional logical operations as needed. It provides a method for meteorological data analysis that makes up for the shortcomings of traditional methods in handling imbalanced weather data.
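
The abstract gives no implementation details, so the following is only a minimal sketch of the two steps it names: discretization with binary coding, and clause-by-clause (OCAT-style) training. Everything here is an assumption made for illustration: the function names, the attribute names `humidity` and `pressure`, the cut points, and the toy records are hypothetical. The clause search is a simple greedy heuristic standing in for the branch-and-bound search used in the thesis, and the cost-sensitive labeling by rainfall per unit time is not shown.

```python
def encode(record, cut_points):
    """Discretize each attribute and binary code it.

    One bit per cut point: the bit is 1 when the value is >= that cut point.
    Attributes are visited in sorted-name order so bit positions are stable.
    """
    bits = []
    for name in sorted(cut_points):
        bits.extend(int(record[name] >= t) for t in cut_points[name])
    return tuple(bits)


def literal_true(literal, example):
    """A literal is (bit index, wanted value)."""
    index, wanted = literal
    return example[index] == wanted


def clause_accepts(clause, example):
    """A clause is a disjunction: it accepts if any literal holds."""
    return any(literal_true(lit, example) for lit in clause)


def learn_clause(positives, negatives):
    """Greedily build one clause that accepts every positive example
    while accepting as few of the given negatives as possible.
    (Greedy stand-in; the thesis uses branch-and-bound for this step.)
    """
    n_bits = len(positives[0])
    literals = [(i, v) for i in range(n_bits) for v in (0, 1)]
    uncovered = list(positives)   # positives not yet accepted
    rejected = list(negatives)    # negatives the clause still rejects
    clause = []
    while uncovered:
        def score(lit):
            gained = sum(literal_true(lit, p) for p in uncovered)
            lost = sum(literal_true(lit, n) for n in rejected)
            return (gained / (lost + 1), gained)
        best = max(literals, key=score)
        clause.append(best)
        uncovered = [p for p in uncovered if not literal_true(best, p)]
        rejected = [n for n in rejected if not literal_true(best, n)]
    return clause


def ocat(positives, negatives):
    """One Clause At a Time: add clauses until every negative example
    is rejected by at least one clause, or no further progress is made."""
    clauses = []
    remaining = list(negatives)
    while remaining:
        clause = learn_clause(positives, remaining)
        still_accepted = [n for n in remaining if clause_accepts(clause, n)]
        if len(still_accepted) == len(remaining):
            break  # the greedy search could not reject any more negatives
        clauses.append(clause)
        remaining = still_accepted
    return clauses


def predict(clauses, example):
    """The model is a conjunction of clauses (CNF): all must accept."""
    return all(clause_accepts(c, example) for c in clauses)


# Hypothetical cut points and toy records, for illustration only.
cut_points = {"humidity": [60, 80], "pressure": [1010]}
rainy = [encode(r, cut_points) for r in (
    {"humidity": 85, "pressure": 1002},
    {"humidity": 90, "pressure": 1005},
    {"humidity": 82, "pressure": 1008},
)]
dry = [encode(r, cut_points) for r in (
    {"humidity": 50, "pressure": 1015},
    {"humidity": 65, "pressure": 1012},
    {"humidity": 70, "pressure": 1020},
)]
model = ocat(rainy, dry)
```

Because each clause is learned against a fixed set of encoded examples and the final model is a plain conjunction of clauses, the scoring of candidate literals can be evaluated independently, which is one way to read the abstract's remark that the approach lends itself to parallel computation.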
Keywords/Search Tags:Weather, Imbalance, Cost-sensitive, Logic, Branch-and-Bound, Classification