GEV-based Algorithms For Meteorological Data Classification And Clustering

Posted on: 2021-04-07
Degree: Master
Type: Thesis
Country: China
Candidate: S D He
Full Text: PDF
GTID: 2370330626961114
Subject: Applied statistics
Abstract/Summary:
Extreme value theory has been widely used in machine learning, for example to predict rare events, describe asymmetric decision boundaries, and approximate the distribution of extreme distances. In this thesis, we develop two machine learning algorithms based on extreme value theory. For supervised learning, we propose an imbalanced binary classification algorithm that combines the generalized extreme value (GEV) distribution with boosting. An application to the estimation of short-term rainfall events in arid and semi-arid regions shows that this algorithm achieves better performance on a series of metrics and that the fitted model is highly interpretable. For unsupervised learning, we focus on data clustering and aim to improve the classical K-means algorithm by embedding a hypothesis testing procedure. This procedure helps identify outliers and assign samples that lie near the decision boundary. Simulation results show that our method can effectively correct points misassigned by K-means and thus achieve better clustering results. Moreover, the test-based K-means algorithm gives reasonable results on a meteorological station clustering problem, providing a useful reference for related meteorological studies.
Keywords/Search Tags: Binary classification, Clustering, Extreme value theory, Hypothesis test, Imbalanced data, Meteorological study
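
To illustrate why the GEV distribution is attractive for imbalanced binary classification, the Python sketch below uses the standard GEV distribution function as an asymmetric alternative to the logistic function. The abstract does not give the thesis's exact formulation, so the function name gev_cdf, the shape parameter xi, and the idea of applying it to a real-valued score (for example, one produced by a boosted model) are assumptions made here for illustration only.

```python
import numpy as np

def gev_cdf(z, xi):
    """Standard GEV distribution function (location 0, scale 1).

    Hypothetical helper for illustration: maps a real-valued score z
    to a probability in [0, 1]. xi is the shape parameter.
    """
    z = np.asarray(z, dtype=float)
    if abs(xi) < 1e-12:
        # Gumbel limit as xi -> 0
        return np.exp(-np.exp(-z))
    t = 1.0 + xi * z
    # Outside the support (t <= 0) the CDF is 0 when xi > 0 and 1 when xi < 0.
    safe_t = np.where(t > 0, t, 1.0)  # placeholder value avoids invalid powers
    inside = np.exp(-safe_t ** (-1.0 / xi))
    return np.where(t > 0, inside, 0.0 if xi > 0 else 1.0)

# Example: probabilities for three scores under an assumed shape xi = -0.25
scores = np.array([-2.0, 0.0, 2.0])
probs = gev_cdf(scores, xi=-0.25)
print(probs)
```

Unlike the logistic function, which maps a score of 0 to 0.5, this function maps a score of 0 to exp(-1) for every xi and approaches 0 and 1 at different rates. That asymmetry is what can help when one class is much rarer than the other.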