
Model Selection of Support Vector Machines on Unbalanced Data Sets

Posted on: 2008-10-09    Degree: Master    Type: Thesis
Country: China    Candidate: C K Yao    Full Text: PDF
GTID: 2120360215453850    Subject: Computational Mathematics
Abstract/Summary:
Support Vector Machines (SVM), proposed by Vapnik et al. in the 1990s, are an outstanding class of learning machines and an efficient machine-learning tool for small-sample problems. SVM has been widely applied in many areas, such as pattern recognition, signal processing, automation, and communication. On unbalanced data sets, the difference in sample quantities between classes degrades the performance of many classifiers, so learning from unbalanced sets remains a research hotspot in machine learning. Seeking the optimal hyperparameters on unbalanced sets, usually called model selection, is one of the most important branches of SVM research.

This thesis first reviews the basic theory of statistical learning and the state of research on unbalanced sets. Second, a large number of L1-SVM experiments are carried out following an existing and widely accepted rule that adjusts the ratio of the two penalty parameters according to the sample quantities of the two classes; the results show that this rule is not valid under some conditions. Third, the classical L2-SVM with a single penalty is generalized to two penalties and its dual formulation is derived, in which the two penalties appear on the diagonal of the kernel matrix; the two penalties and the kernel parameter can therefore all be absorbed into the kernel function when the optimization objective is established. Fourth, a new algorithm is proposed that employs the two-penalty L2-SVM, minimizes the VC dimension, and uses optimization methods to obtain the optimal hyperparameters on unbalanced sets. Simulation tests on two-class unbalanced sets constructed from multi-class benchmarks demonstrate the good performance and feasibility of the new strategy. Combined with the one-against-all method, the algorithm is also applied to multi-class sets, and the experimental results show that classification accuracy is clearly improved. Finally, the thesis is summarized and future research directions are pointed out.
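The penalty-ratio rule tested in the second step is commonly stated as C+/C- = n-/n+, i.e. errors on the minority class are penalized more heavily. The following is a minimal illustrative sketch of that heuristic using scikit-learn's SVC; the function name class_weighted_svm and the RBF parameters are assumptions for illustration, not the L1-SVM implementation used in the thesis.

from collections import Counter
from sklearn.svm import SVC

def class_weighted_svm(X, y, C=1.0, gamma=0.1):
    # Count samples per class; the minority class receives the larger penalty.
    counts = Counter(y)
    n_total = sum(counts.values())
    # Effective penalty for class k: C_k = C * n_total / (n_classes * n_k),
    # so for two classes C_+ / C_- = n_- / n_+.
    weights = {label: n_total / (len(counts) * n_k) for label, n_k in counts.items()}
    return SVC(C=C, gamma=gamma, kernel="rbf", class_weight=weights).fit(X, y)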
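The generalization in the third step follows the standard form of L2-SVM with class-specific penalties. A sketch of the resulting dual, assuming squared slack variables, a feature map phi, and kernel K, is given below; the thesis's own VC-dimension-based objective is not reproduced here.

% Primal (two-penalty L2-SVM):
%   min_{w,b,\xi}  \tfrac{1}{2}\|w\|^2 + \tfrac{C_+}{2}\sum_{i:y_i=+1}\xi_i^2 + \tfrac{C_-}{2}\sum_{i:y_i=-1}\xi_i^2
%   s.t.  y_i(\langle w,\phi(x_i)\rangle + b) \ge 1 - \xi_i .
% Dual (the penalties appear on the diagonal of the kernel matrix):
\[
  \max_{\alpha \ge 0,\ \sum_i \alpha_i y_i = 0}\;
  \sum_i \alpha_i
  - \frac{1}{2}\sum_{i,j} \alpha_i \alpha_j y_i y_j
  \left( K(x_i, x_j) + \frac{\delta_{ij}}{C_{y_i}} \right),
  \qquad
  C_{y_i} =
  \begin{cases}
    C_+ & \text{if } y_i = +1,\\
    C_- & \text{if } y_i = -1.
  \end{cases}
\]

In this form the effective kernel matrix already contains C_+, C_-, and the kernel parameter, which is what allows all hyperparameters to be treated together inside the kernel function when the optimization objective is built.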
Keywords/Search Tags: Statistical learning theory, VC dimension, Support vector machines, Model selection