
The Application Of KNN Classification In Unbalanced Data

Posted on: 2018-01-21
Degree: Master
Type: Thesis
Country: China
Candidate: R T Mei
Full Text: PDF
GTID: 2347330536983953
Subject: Statistics, applied statistics
Abstract/Summary:
Classification has long been one of the key research topics in statistics, machine learning, and computer science. Traditional classification methods predict well on balanced data, but they cannot be applied directly to unbalanced data. Because unbalanced data are common in practice, many scholars have studied how to classify them. This research falls into two categories. The first works at the algorithm level: the algorithm is modified to offset the effects of the imbalance so that it handles unbalanced data better. The second works at the data level: sampling and related methods are used to reduce the imbalance in the data. KNN is a simple algorithm that is easy to understand and implement, and it achieves good results on balanced data sets. On unbalanced data sets, however, its defects are obvious: because of the sample distribution, minority-class samples tend to be assigned to the majority class. To address this problem, this paper proposes a class-weighted KNN method, which assigns larger weights to the minority-class samples among the selected K neighbors in order to improve the classification accuracy for the minority class. At the data level, the data are divided into m parts, each of which is combined with the minority-class samples to form a training subset; a KNN classifier is trained on each of the m subsets, and the resulting classifiers are combined into a final classifier by an ensemble method. Both methods proposed in this paper significantly improve the classification accuracy for the minority class on the unbalanced bank time deposit data.
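The two methods described in the abstract can be illustrated with a minimal Python sketch. It is not the thesis's implementation. The function names, the Euclidean distance, the fixed per-class weights, the choice to split only the majority class into m parts, the majority vote over the m classifiers, and the toy data are all assumptions made here for illustration.

```python
import math
import random
from collections import Counter, defaultdict


def knn_labels(train_X, train_y, x, k):
    """Return the labels of the k training points nearest to x (Euclidean)."""
    order = sorted(range(len(train_X)), key=lambda i: math.dist(train_X[i], x))
    return [train_y[i] for i in order[:k]]


def class_weighted_knn(train_X, train_y, x, k, class_weight):
    """Weighted vote: each neighbor contributes class_weight[its label].

    Giving the minority class a larger weight is the assumed form of the
    class-weighted KNN idea; the thesis may define the weights differently.
    """
    votes = defaultdict(float)
    for label in knn_labels(train_X, train_y, x, k):
        votes[label] += class_weight[label]
    return max(votes, key=votes.get)


def ensemble_knn(train_X, train_y, x, k, minority, m, seed=0):
    """Train-and-vote over m subsets (assumed construction).

    The majority-class samples are shuffled and split into m parts; each
    part is joined with all minority-class samples, a plain KNN prediction
    is made on each subset, and the m predictions are majority-voted.
    """
    rng = random.Random(seed)
    major_idx = [i for i, y in enumerate(train_y) if y != minority]
    minor_idx = [i for i, y in enumerate(train_y) if y == minority]
    rng.shuffle(major_idx)
    predictions = []
    for j in range(m):
        idx = major_idx[j::m] + minor_idx
        sub_X = [train_X[i] for i in idx]
        sub_y = [train_y[i] for i in idx]
        labels = knn_labels(sub_X, sub_y, x, k)
        predictions.append(Counter(labels).most_common(1)[0][0])
    return Counter(predictions).most_common(1)[0][0]


# Toy data (hypothetical): nine majority points (label 0) on a grid and
# two minority points (label 1) near the corner (3, 3).
X = [(a, b) for a in range(3) for b in range(3)] + [(3, 3), (3.2, 3)]
y = [0] * 9 + [1] * 2
query = (2.6, 2.6)

plain = class_weighted_knn(X, y, query, k=5, class_weight={0: 1, 1: 1})
weighted = class_weighted_knn(X, y, query, k=5, class_weight={0: 1, 1: 2})
ensemble = ensemble_knn(X, y, query, k=3, minority=1, m=3)
```

On this toy data the query point lies closer to the two minority points, yet with k = 5 three of its neighbors belong to the majority class, so the unweighted vote returns the majority label. Doubling the minority weight or voting over the three rebalanced subsets returns the minority label instead.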
Keywords/Search Tags: Minority-class weighted KNN, Unbalanced data, Ensemble learning