
Dual Sparse Support Vector Machine Based On Lasso

Posted on: 2020-12-18
Degree: Master
Type: Thesis
Country: China
Candidate: J Q Zhao
Full Text: PDF
GTID: 2417330596482766
Subject: Applied statistics

Abstract/Summary:
The support vector machine (SVM), proposed by Vapnik and his colleagues, is a method based on statistical learning theory for learning from limited samples, and it is used extensively because of its good generalization ability. To reduce computational complexity, the SVM has been simplified into the least squares SVM (LSSVM). However, this simplification destroys the sparsity of the solution vector, meaning that nearly all samples are treated as support vectors in the computation. How to construct a sparse LSSVM model is therefore particularly important. Several sparse models with L1 or L2 regularization terms have been developed; the sparsity of the model and the selection of its parameters are our main concerns.

This thesis studies a dual sparse support vector machine model for classification problems and discusses an upper bound on the L1 sparsity penalty parameter in the model. In addition, a new method is proposed for reducing large-scale data sets, suited to the case where the line connecting the two class centers is nearly perpendicular to the separating hyperplane. Finally, numerical results are given for University of California, Irvine (UCI) data sets and a hyperbolic spiral sample, and applications of the model to blog classification are studied.

The research covers four aspects. First, by adding the L1 norm of part of the multipliers to the dual of the classification problem, the dual sparse support vector machine model is constructed; it can be reformulated as a Lasso problem. Second, an upper bound on the L1 penalty parameter in this model is derived from Lasso theory. Third, a sample screening method based on the location of the sample points is given for large-scale data sets; to achieve double sparsity, the screened samples are used as the training set when training the sparse model. Finally, the model is applied to classification of blog data sets, UCI data sets, and the hyperbolic spiral sample. Numerical experiments in Matlab and Python show that the proposed method achieves higher accuracy with fewer support vectors and less computing time.

The thesis is organized as follows. The first part reviews the development of text classification, SVM, LSSVM, and sparse optimization. The second part introduces the standard LSSVM model and three kinds of sparse least squares support vector machine models. The third part proposes the dual sparse support vector machine model based on Lasso and gives an upper bound on the L1 penalty parameter in the model together with a sample screening method for large-scale data sets. The fourth part presents numerical results for some data sets and for classification of blog text. The last part gives a summary and outlook.
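The abstract does not give the thesis's exact formulation, but the idea of recasting a sparse dual SVM as a Lasso problem can be illustrated with a minimal sketch. Assuming an LSSVM-style model in which the kernel matrix plays the role of the design matrix, an L1 penalty on the dual multipliers makes most of them exactly zero, and the Lasso theory gives a closed-form upper bound on the penalty parameter above which all multipliers vanish. All names and choices below (RBF kernel, scikit-learn's `Lasso` solver, the toy data) are illustrative assumptions, not the thesis's model.

```python
# Hedged sketch (not the thesis's exact model): a sparse LSSVM-style
# classifier obtained by treating the kernel matrix as the design matrix
# of a Lasso regression, so the L1 penalty zeroes out most multipliers.
import numpy as np
from sklearn.linear_model import Lasso

def rbf_kernel(A, B, gamma=1.0):
    # Gaussian (RBF) kernel matrix between the rows of A and B
    sq = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
    return np.exp(-gamma * sq)

def fit_sparse_dual(X, y, lam, gamma=1.0):
    # Lasso objective: (1/2n)||y - K a - b||^2 + lam ||a||_1
    K = rbf_kernel(X, X, gamma)
    model = Lasso(alpha=lam, fit_intercept=True, max_iter=10000)
    model.fit(K, y)
    return model.coef_, model.intercept_

def predict(X_train, alpha, b, X_new, gamma=1.0):
    return np.sign(rbf_kernel(X_new, X_train, gamma) @ alpha + b)

# Toy two-class problem: two well-separated Gaussian clusters
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(-1.0, 0.3, (20, 2)), rng.normal(1.0, 0.3, (20, 2))])
y = np.hstack([-np.ones(20), np.ones(20)])

alpha, b = fit_sparse_dual(X, y, lam=0.05)
n_sv = int((np.abs(alpha) > 1e-8).sum())  # only the nonzero multipliers act as support vectors

# Lasso theory: with this objective scaling, every multiplier is zero
# once lam reaches max_j |K_j^T (y - mean(y))| / n -- an upper bound of
# the useful range of the L1 penalty parameter.
K = rbf_kernel(X, X)
lam_max = np.abs(K.T @ (y - y.mean())).max() / len(y)
alpha_zero, _ = fit_sparse_dual(X, y, lam=1.1 * lam_max)  # all-zero solution
```

The upper bound `lam_max` is the standard Lasso result that the all-zero vector solves the problem once the penalty dominates the gradient of the loss at zero; any useful choice of the penalty parameter must lie below it.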
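The abstract describes the screening method only as selecting samples "according to the location of the sample points" when the line through the two class centers is nearly perpendicular to the separating hyperplane. One possible reading, sketched here purely as an assumption about the details, is to project each sample onto that center line and keep the points whose projections lie closest to the midpoint, i.e. the points most likely to sit near the decision boundary.

```python
# Hedged sketch of a center-line screening rule (an assumed reading of
# the method, not the thesis's exact procedure): keep the samples whose
# projection onto the line through the two class centers is closest to
# the midpoint between the centers.
import numpy as np

def screen_samples(X, y, keep_ratio=0.5):
    c_pos = X[y == 1].mean(axis=0)    # center of the positive class
    c_neg = X[y == -1].mean(axis=0)   # center of the negative class
    d = c_pos - c_neg
    d = d / np.linalg.norm(d)         # unit vector along the center line
    mid = 0.5 * (c_pos + c_neg)
    # |t| = distance of each projected sample from the midpoint
    t = np.abs((X - mid) @ d)
    k = max(1, int(keep_ratio * len(X)))
    return np.argsort(t)[:k]          # indices of the k most central samples

rng = np.random.default_rng(1)
X = np.vstack([rng.normal(-1.0, 0.4, (50, 2)), rng.normal(1.0, 0.4, (50, 2))])
y = np.hstack([-np.ones(50), np.ones(50)])
kept = screen_samples(X, y, keep_ratio=0.3)  # 30 of 100 samples survive
```

Training the sparse model only on `kept` would give the "double sparsity" the abstract mentions: fewer training samples from screening, and fewer support vectors from the L1 penalty.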
Keywords/Search Tags:Classification Problem, Least Squares Support Vector Machine, Lasso, Penalty Parameter