Two Optimal Methods For Parameter Selection In Support Vector Machines For Unbalanced Data Sets

Posted on: 2008-05-27
Degree: Master
Type: Thesis
Country: China
Candidate: X D Mu
Full Text: PDF
GTID: 2120360218455230
Subject: Operational Research and Cybernetics
Abstract/Summary:
The support vector machine (SVM) is a new approach to data mining based on statistical learning theory and mathematical programming. Mathematical programming is an important branch of operational research. It has been applied extensively to machine learning, network problems, and mechanics. In particular, combining it with data mining makes it possible to solve large-scale, complicated problems, and it has been applied successfully to feature selection, clustering, and regression. The SVM is one of the important results of applying mathematical programming to data mining; it is a machine learning method proposed by V. Vapnik on the basis of statistical learning theory.

This paper mainly studies optimization methods for parameter selection in SVMs for unbalanced data sets. SVMs have been used in many fields with good results, and parameter selection is an important research direction because different parameters lead to different generalization performance. Parameter selection for unbalanced data sets, however, has received less study. For unbalanced data sets, this paper presents parameter selection models for SVMs together with algorithms, and reports numerical experiments.

This paper presents two optimization models for parameter selection in SVMs for unbalanced data sets. First, the traditional quality measure is abandoned in favor of the F-measure, which is suited to unbalanced data sets. A parameter selection model is built by optimizing the F-measure and is solved with SVMlight. Numerical experiments show that this model is effective for choosing the cost parameter. Second, the new model is formulated as a mathematical program with equilibrium constraints (MPEC): a nonlinear problem with a smoothed objective function and complementarity constraints. The numerical experiments for this model are solved with LINGO, and the results give a preliminary indication of its effectiveness.
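
To illustrate why the F-measure suits unbalanced data better than a traditional measure such as accuracy, the following is a minimal sketch in plain Python. It is not the thesis's model or code: the function names, the `beta` weighting, the label encoding (+1 for the minority class, -1 for the majority class), and the toy predictions are all assumptions made for this example.

```python
def f_measure(y_true, y_pred, positive=1, beta=1.0):
    """F-measure of the positive (minority) class.

    Combines precision and recall; beta=1.0 gives the usual F1 score.
    Returns 0.0 when no positive example is predicted correctly.
    """
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    b2 = beta * beta
    return (1 + b2) * precision * recall / (b2 * precision + recall)


def accuracy(y_true, y_pred):
    """Fraction of examples classified correctly."""
    return sum(1 for t, p in zip(y_true, y_pred) if t == p) / len(y_true)


# Toy unbalanced data set (hypothetical): 5 positives, 95 negatives.
y_true = [1] * 5 + [-1] * 95

# Classifier A ignores the minority class entirely.
pred_a = [-1] * 100

# Classifier B finds 4 of the 5 positives at the price of 6 false alarms.
pred_b = [1, 1, 1, 1, -1] + [1] * 6 + [-1] * 89

for name, pred in (("A", pred_a), ("B", pred_b)):
    print(name, "accuracy:", accuracy(y_true, pred),
          "F-measure:", round(f_measure(y_true, pred), 4))
```

Classifier A has the higher accuracy although it never detects a minority example, while the F-measure ranks classifier B higher. A parameter search could, under the same assumptions, score each candidate value of the cost parameter by the F-measure of the resulting classifier on held-out data and keep the best one.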
Keywords/Search Tags:Data Mining, Support Vector Machine, Unbalanced Data Sets, Quality Measure, Parameter Selection