
Research On Improved Linear And Nonlinear Variable Selection Methods

Posted on: 2021-04-06
Degree: Master
Type: Thesis
Country: China
Candidate: Y S Zhao
GTID: 2427330626955579
Subject: Statistics
Abstract/Summary:
Variable selection is widely used in linear regression and has been studied by many scholars. In practice, however, real-world models are often complicated and most variables are related nonlinearly, so simple linear statistical models are not flexible enough and can lead to large deviations. With the continuing development of data-recording technology, data are becoming easier to accumulate, and high-dimensional data are generated as a result. How to quickly find valuable variables in massive data is a question worth studying, and it has attracted the attention of many scholars. Complex data often contain intertwined, correlated relationships, and the redundancy between variables is pronounced; such data are difficult to handle and consume a great deal of computing time. To address these problems, this thesis introduces the split-and-conquer method for high-dimensional data. The method divides the data into blocks and selects variables on each block, which reduces redundancy in the data and effectively shortens computing time. To better select important variables whose relationships are not only linear in massive high-dimensional data, a nonparametric additive model is introduced for variable selection. The unbiasedness and effectiveness of this method, and the superiority of the nonparametric additive model in handling nonlinear data, have been verified in theory and in practice. Combining it with the split-and-conquer method therefore addresses the long computing time on the one hand and guarantees effective variable selection in nonlinear models on the other.

With the advent of the big-data era, the demands on methods for processing high-dimensional and massive data are growing. A method is required to suit not only linear models but also nonlinear models and, more importantly, to be time-efficient, which has become a hot topic of current research. In this regard, this thesis does the following work. First, the split-and-conquer method is introduced into variable selection. Most classical methods do not take excessive time consumption into account, and their iterative calculations consume a lot of time. The improved method is verified on application examples and performs very well in terms of running time. Second, for nonlinear models, the nonparametric additive model is introduced directly into the high-dimensional setting. Numerical simulation and verification on examples show that the method is effective for high-dimensional nonlinear data. Finally, the thesis is summarized, and further improvements and directions for future research are pointed out.
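The block-wise idea described in the abstract can be illustrated with a minimal sketch. The abstract does not say which selector is run on each block or how the per-block results are combined, so everything specific here is an assumption made for illustration: a lasso fitted by coordinate descent on each block, a majority vote across blocks, and the hypothetical names `lasso_cd`, `split_and_conquer_select`, `lam`, and `vote_frac`. It is not the thesis's implementation.

```python
import numpy as np


def soft_threshold(z, t):
    """Scalar soft-thresholding operator used by the lasso update."""
    return np.sign(z) * max(abs(z) - t, 0.0)


def lasso_cd(X, y, lam, n_iter=100):
    """Lasso by cyclic coordinate descent (columns assumed centered).

    Minimizes (1 / 2n) * ||y - X b||^2 + lam * ||b||_1.
    """
    n, p = X.shape
    beta = np.zeros(p)
    col_sq = (X ** 2).sum(axis=0) / n
    resid = np.array(y, dtype=float)  # equals y - X @ beta while beta is zero
    for _ in range(n_iter):
        for j in range(p):
            if col_sq[j] == 0.0:
                continue
            # Correlation of column j with the residual that excludes column j.
            rho = X[:, j] @ resid / n + col_sq[j] * beta[j]
            new_bj = soft_threshold(rho, lam) / col_sq[j]
            resid += X[:, j] * (beta[j] - new_bj)
            beta[j] = new_bj
    return beta


def split_and_conquer_select(X, y, n_blocks=3, lam=0.3, vote_frac=0.5, seed=0):
    """Select variables on each row block, then combine by majority vote.

    The vote rule is an assumption: a variable is kept when it is selected
    in more than `vote_frac` of the blocks.
    """
    rng = np.random.default_rng(seed)
    blocks = np.array_split(rng.permutation(X.shape[0]), n_blocks)
    votes = np.zeros(X.shape[1], dtype=int)
    for rows in blocks:
        beta = lasso_cd(X[rows], y[rows], lam)
        votes += (beta != 0.0).astype(int)
    return np.flatnonzero(votes > vote_frac * n_blocks).tolist()


# Synthetic data: only the first three of 30 variables affect the response.
rng = np.random.default_rng(0)
X = rng.standard_normal((900, 30))
true_beta = np.zeros(30)
true_beta[:3] = [3.0, -2.0, 1.5]
y = X @ true_beta + rng.standard_normal(900)

selected = split_and_conquer_select(X, y)
print(selected)
```

Each block is fitted independently, so the per-block fits could run in parallel, which is where the time saving mentioned in the abstract would come from. The penalty `lam` and the vote threshold are tuning choices made for this sketch only.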
Keywords/Search Tags: Split-and-conquer, Variable selection, Nonparametric additive model, Nonlinear model, Linear model