Research On The Advantages And Disadvantages Of Lasso And Its Improved Methods In Variable Selection

Posted on: 2019-08-16, Degree: Master, Type: Thesis
Country: China, Candidate: K Hao
GTID: 2370330566997119, Subject: Applied Statistics
Abstract/Summary:
When modeling multivariate data, omitting an important influencing factor can make the model's bias very large, so we want to include as many variables as possible. However, the added variables may increase collinearity in the design matrix and degrade the fit, and variables with no significant effect make the model harder to interpret. Lasso, SCAD, adaptive Lasso and elastic net can all be used to select variables and estimate parameters through shrinkage estimation, but their performance differs across settings. This dissertation studies the advantages and disadvantages of Lasso and its improved methods. The research is divided into four parts. First, we introduce the theory of Lasso, and give the method for estimating the Lasso parameters and an algorithm for computing the Lasso. Second, we introduce the theory of SCAD, adaptive Lasso and elastic net, and study their relationship to Lasso by comparing their penalty terms. Third, we carry out experiments in the setting of linear models and compare the variable-selection ability of these methods. Fourth, we carry out experiments in the setting of nonlinear models and compare the modeling performance of these methods, and then analyze a diabetes data set as a worked example. By studying the characteristics of these four methods in variable selection, we can identify which method performs better in which setting, which helps practitioners choose an appropriate regression method in applications.
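The comparison of penalty terms mentioned in the second part can be illustrated with a minimal sketch. It is not taken from the thesis: the function names, the elastic-net parametrization (a mixing weight `alpha` between the L1 and squared L2 terms), the adaptive-Lasso weight `1 / |beta_init|**gamma`, and the SCAD default `a = 3.7` are assumptions based on the standard textbook forms of these penalties, evaluated here for a single coefficient.

```python
def lasso_penalty(beta, lam):
    """L1 penalty: lam * |beta|."""
    return lam * abs(beta)


def elastic_net_penalty(beta, lam, alpha=0.5):
    """Mix of L1 and squared L2 penalties (assumed parametrization)."""
    return lam * (alpha * abs(beta) + 0.5 * (1.0 - alpha) * beta ** 2)


def adaptive_lasso_penalty(beta, lam, beta_init, gamma=1.0):
    """Weighted L1 penalty; the weight comes from an initial estimate.

    beta_init is a hypothetical pilot estimate (for example from least
    squares) and must be nonzero.
    """
    weight = 1.0 / abs(beta_init) ** gamma
    return lam * weight * abs(beta)


def scad_penalty(beta, lam, a=3.7):
    """SCAD penalty: L1 near zero, quadratic transition, then constant."""
    b = abs(beta)
    if b <= lam:
        return lam * b
    if b <= a * lam:
        return (2.0 * a * lam * b - b * b - lam * lam) / (2.0 * (a - 1.0))
    return lam * lam * (a + 1.0) / 2.0


if __name__ == "__main__":
    lam = 1.0
    for beta in (0.5, 2.0, 10.0):
        print(
            beta,
            lasso_penalty(beta, lam),
            elastic_net_penalty(beta, lam),
            adaptive_lasso_penalty(beta, lam, beta_init=beta),
            scad_penalty(beta, lam),
        )
```

Under these assumptions, all four penalties behave like the Lasso for small coefficients. For large coefficients the Lasso penalty keeps growing linearly, while SCAD levels off and adaptive Lasso shrinks its weight, which is the usual argument for why they bias large coefficients less.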
Keywords/Search Tags: variable selection, Lasso, SCAD, adaptive-Lasso, elastic net