
Research On Retrieval Algorithm Of Precipitation Phase State Based On Unbalanced Data

Posted on: 2022-06-06
Degree: Master
Type: Thesis
Country: China
Candidate: C C Jiang
Full Text: PDF
GTID: 2480306731953519
Subject: Software engineering
Abstract/Summary:
Precipitation phase is the form in which precipitation reaches the surface; it can be classified into types such as rain, snow, freezing rain, and sleet. Traditionally, the phase is observed manually at observation stations, so actual weather conditions are difficult to obtain in areas without ground-based observation. Yet the surface precipitation phase has a significant impact on human life and production: freezing rain in unpopulated areas, for example, causes power lines to ice over. Obtaining accurate precipitation phases where ground-based observations are absent is therefore one of the most important meteorological problems to be solved. The precipitation phase is influenced by upper-air temperature, humidity, wind speed, wind direction, and other fields, and its causes are complex. This thesis retrieves the precipitation phase from the fields of middle- and upper-air meteorological elements in numerical model forecasts, so that the current precipitation phase can be obtained for any region. It addresses the intrinsic correlation between precipitation phase retrieval and the meteorological elements, takes the special characteristics of precipitation phase data into account, mitigates the data imbalance, and designs a machine learning retrieval model suited to the problem, with the aim of improving retrieval accuracy for every precipitation phase type. The main research contents are as follows.

(1) Studies of precipitation phase retrieval are generally limited by the available data. This thesis constructs a precipitation phase data set with more than 350,000 individual samples, which eases that limitation compared with previous work. At this scale the imbalance within the data is more pronounced and the distribution of features between classes is more complex, so extreme data imbalance must be suppressed more effectively for both feature extraction and the retrieval results.

(2) For data pre-processing, a new hybrid sampling algorithm, ADASYN-Borderline Tomek Links (A-BTL), is proposed. It integrates the ADASYN oversampling algorithm with the Borderline Tomek Links undersampling algorithm to reduce the data imbalance.

(3) For classification, probability calibration is fused with ensemble learning, and a Probability Calibration Random Forest (PCCRF) is proposed. It corrects the bias in the predicted class probabilities with isotonic calibration, which raises the accuracy of the algorithm and improves its overall classification performance.

Experimental comparison shows improvement both in the size of the data set used and in F1, G-mean, AUC, and other metrics.
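As a rough illustration of three building blocks named in the abstract, the sketch below shows Tomek-link cleaning (removing majority-class samples that sit right against a sample of another class), isotonic calibration by the pool-adjacent-violators procedure, and the G-mean metric. It is not the thesis's A-BTL or PCCRF: the ADASYN step, the borderline selection, and the random forest are omitted. All function names are hypothetical and only the Python standard library is assumed.

```python
from math import dist, prod


def tomek_links(X, y):
    """Return index pairs (i, j), i < j, that form Tomek links.

    Two samples form a Tomek link when they carry different labels and
    each is the other's nearest neighbour (Euclidean distance).
    Brute-force O(n^2) search; needs at least two samples.
    """
    def nearest(i):
        return min((j for j in range(len(X)) if j != i),
                   key=lambda j: dist(X[i], X[j]))

    nn = [nearest(i) for i in range(len(X))]
    return [(i, j) for i, j in enumerate(nn)
            if i < j and nn[j] == i and y[i] != y[j]]


def drop_majority_in_links(X, y, majority_label):
    """Under-sample by deleting the majority-class member of each link."""
    doomed = {k for pair in tomek_links(X, y) for k in pair
              if y[k] == majority_label}
    keep = [k for k in range(len(X)) if k not in doomed]
    return [X[k] for k in keep], [y[k] for k in keep]


def isotonic_fit(scores, labels):
    """Pool-adjacent-violators fit of 0/1 labels against raw scores.

    Returns (sorted_scores, fitted_values); the fitted values are
    non-decreasing in the score and can be read as calibrated
    probabilities. Tied scores are not given special treatment here.
    """
    order = sorted(range(len(scores)), key=lambda k: scores[k])
    blocks = []  # each block is [sum_of_labels, count]
    for k in order:
        blocks.append([float(labels[k]), 1])
        # Merge backwards while the previous block's mean is larger.
        while len(blocks) > 1 and (blocks[-2][0] / blocks[-2][1]
                                   > blocks[-1][0] / blocks[-1][1]):
            s, n = blocks.pop()
            blocks[-1][0] += s
            blocks[-1][1] += n
    fitted = [s / n for s, n in blocks for _ in range(n)]
    return [scores[k] for k in order], fitted


def g_mean(y_true, y_pred):
    """Geometric mean of the per-class recalls."""
    recalls = []
    for c in sorted(set(y_true)):
        idx = [k for k, t in enumerate(y_true) if t == c]
        recalls.append(sum(y_pred[k] == c for k in idx) / len(idx))
    return prod(recalls) ** (1 / len(recalls))
```

In a hybrid scheme of the kind described, a cleaning step such as `drop_majority_in_links` would typically run after synthetic minority samples have been generated, and a fit such as `isotonic_fit` would be applied to held-out classifier scores rather than to the training scores. G-mean is useful on imbalanced data because it falls to zero whenever any single class is never recalled.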
Keywords/Search Tags:precipitation phase pattern, machine learning, imbalance, hybrid sampling method, random forest