Kernel Method Under Non-equilibrium Data For Classification

Posted on: 2010-05-17
Degree: Master
Type: Thesis
Country: China
Candidate: S X Ma
Full Text: PDF
GTID: 2208360275491845
Subject: Computer application technology
Abstract/Summary:
Classification of imbalanced datasets is a common real-world problem, arising in medical diagnosis, radar image detection, fraud detection and similar applications. Such datasets are intrinsically uneven: the number of positive samples differs greatly from the number of negative samples, and this degrades the performance of traditional classification algorithms. How to classify imbalanced datasets effectively and accurately has therefore become an active research problem in machine learning and pattern recognition.

Building on traditional kernel methods, this thesis proposes a classification learning algorithm that integrates a new over-sampling method with a Support Vector Machine that uses different misclassification costs, with the aim of improving classification performance on imbalanced datasets. The main contributions are as follows:

(1) To address the class imbalance, this thesis proposes a data processing method that operates in the image space of the kernel method, named SMOIS (Synthetic Minority Over-sampling in Image Space). Unlike strategies that synthesize minority samples in the original data space, this method introduces non-repetitive synthetic minority samples in the image space, that is, after the kernel mapping, which reduces the classification algorithm's sensitivity to the minority samples. The experimental results show that this method achieves better classification performance as evaluated by the ROC curve and the G-mean.

(2) The Support Vector Machine (SVM) is an effective classification learning algorithm, but it usually performs unsatisfactorily on imbalanced datasets. This thesis therefore proposes a new SVM learning algorithm that integrates the SMOIS method with a revised SVM algorithm to improve classification performance.

The problems studied in this thesis are among the key current problems in the field. The work has theoretical significance as well as direct application value for real-world problems.
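The idea of over-sampling in the image space can be illustrated with a minimal sketch. This is not the thesis's SMOIS algorithm, whose details the abstract does not give. It assumes a SMOTE-style rule in which each synthetic sample is a convex combination z = (1 - d) * phi(x_i) + d * phi(x_j) of two mapped minority samples. Under that assumption the synthetic samples never need explicit coordinates, because their inner products follow from the original Gram matrix by bilinearity. The RBF kernel, the function names (`rbf_kernel`, `augment_gram`, `g_mean`) and the `gamma` value are all illustrative choices. The G-mean is computed with its usual definition, the square root of sensitivity times specificity.

```python
import math
import random


def rbf_kernel(x, y, gamma=0.5):
    """Gaussian (RBF) kernel between two equal-length vectors."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-gamma * sq_dist)


def gram_matrix(points, kernel):
    """Kernel (Gram) matrix of a list of points."""
    return [[kernel(p, q) for q in points] for p in points]


def augment_gram(K, minority_idx, n_synthetic, rng):
    """Extend K with synthetic minority samples defined in feature space.

    Assumption (not taken from the thesis): every synthetic sample is
    z = (1 - d) * phi(x_i) + d * phi(x_j) for two distinct minority
    samples i, j and a random d in [0, 1). Each sample is stored as a
    coefficient vector over the original points, so <z_a, z_b> = a^T K b.
    """
    n = len(K)
    # Original points are represented by unit coefficient vectors.
    coeffs = [[1.0 if r == c else 0.0 for c in range(n)] for r in range(n)]
    for _ in range(n_synthetic):
        i, j = rng.sample(minority_idx, 2)
        d = rng.random()
        w = [0.0] * n
        w[i] = 1.0 - d
        w[j] = d
        coeffs.append(w)

    def inner(a, b):
        return sum(a[p] * K[p][q] * b[q] for p in range(n) for q in range(n))

    return [[inner(a, b) for b in coeffs] for a in coeffs]


def g_mean(y_true, y_pred, positive=1):
    """Geometric mean of sensitivity and specificity."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == positive and p == positive)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == positive and p != positive)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t != positive and p != positive)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t != positive and p == positive)
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    return math.sqrt(sensitivity * specificity)


# Toy data: points 0 and 1 form the minority class, the rest the majority.
points = [(0.0, 0.0), (0.2, 0.1), (2.0, 2.0), (2.1, 1.9), (1.8, 2.2), (2.3, 2.0)]
K = gram_matrix(points, rbf_kernel)
K_aug = augment_gram(K, minority_idx=[0, 1], n_synthetic=3, rng=random.Random(0))
```

The augmented matrix `K_aug` could then be passed to any SVM solver that accepts a precomputed kernel and per-class penalties, which is one way to combine over-sampling with different misclassification costs. How the thesis actually couples the two steps is not stated in the abstract.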
Keywords/Search Tags: Imbalanced Data Classification, Support Vector Machine, Kernel Method