
Research And Application Of Variable Selection Method In Complex Group Data

Posted on: 2020-03-09
Degree: Master
Type: Thesis
Country: China
Candidate: D D Li
Full Text: PDF
GTID: 2417330578959811
Subject: Statistics
Abstract/Summary:
With the rapid development of Internet technology, the dimension of the data people collect keeps growing, data structures are becoming more complex, and variables mostly appear in groups. How to effectively reduce the dimensionality of ultra-high-dimensional complex group data needs further study. The ADS (Adaptive Dantzig Selector) method is an important method for processing ultra-high-dimensional data. This thesis mainly studies the asymptotic normality of the ADS method in the partial linear model, together with the theoretical properties and practical application of the GADS (Group Adaptive Dantzig Selector) method for complex group-variable data. The specific research contents and results are as follows:

(1) The ADS method is applied to the partial linear model, and it is proved theoretically that the resulting estimator is asymptotically normal. Simulation results show that the ADS estimates are more accurate, and data examples show that the ADS method performs better than the Lasso method on ultra-high-dimensional sparse data.

(2) Combining the ADS method with the structural characteristics of complex group-variable data, a GADS method for handling group variables is proposed. Its main principle is to impose different penalties on the coefficients of different groups, building on the GDS penalty, so that group variables are selected more accurately. The asymptotic normality of the GADS method is proved theoretically, and an example verifies that the GADS estimates are better than those of the Group Lasso method on complex group data.

(3) An S-GADS (Screen Group Adaptive Dantzig Selector) method based on the linear model is proposed for ultra-high-dimensional group data. The method first screens the ultra-high-dimensional group data to reduce its dimension to an operable range, and then applies the new GADS method to select variables and obtain the estimates. Finally, numerical simulation verifies that the S-GADS method has good accuracy and model interpretability on ultra-high-dimensional group data.
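The screen-then-select idea in item (3) can be illustrated with a small sketch of the screening stage alone. The abstract does not state which screening rule S-GADS uses, so the rule below (rank each group by the mean squared marginal correlation of its predictors with the response, then keep the top few groups) is an assumption chosen for illustration. The function names `pearson` and `screen_groups`, the `keep` parameter, and the toy data are all hypothetical and not taken from the thesis.

```python
import math
import random


def pearson(x, y):
    """Sample Pearson correlation between two equal-length sequences."""
    n = len(x)
    mx = sum(x) / n
    my = sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0.0 or syy == 0.0:
        return 0.0  # a constant column carries no marginal signal
    return sxy / math.sqrt(sxx * syy)


def screen_groups(columns, groups, y, keep):
    """Keep the `keep` groups with the largest mean squared marginal correlation.

    columns: list of predictor columns, each a list of n observations
    groups:  dict mapping a group label to the column indices in that group
    NOTE: this ranking rule is an assumed stand-in for the thesis's screening step.
    """
    scores = {
        g: sum(pearson(columns[j], y) ** 2 for j in idx) / len(idx)
        for g, idx in groups.items()
    }
    ranked = sorted(scores, key=scores.get, reverse=True)
    return ranked[:keep], scores


# Toy data (hypothetical): 6 groups of 3 predictors; only g0 and g3 affect y.
rng = random.Random(0)
n = 200
groups = {f"g{k}": [3 * k, 3 * k + 1, 3 * k + 2] for k in range(6)}
columns = [[rng.gauss(0, 1) for _ in range(n)] for _ in range(18)]
y = [
    2.0 * columns[0][i] - 1.5 * columns[1][i] + 2.0 * columns[9][i] + rng.gauss(0, 0.5)
    for i in range(n)
]

selected, scores = screen_groups(columns, groups, y, keep=2)
print(selected)
```

In a two-stage procedure of this kind, only the columns belonging to the retained groups would be passed on to the group-level selector (GADS in the thesis), which keeps the second-stage optimization at a manageable size.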
Keywords/Search Tags: Partial linear model, Complex group variables, Asymptotic normality, Dimension reduction screening