
Feature Selection of Ultra-high Dimensional Sparse Data Based on Composite Quantile Regression

Posted on: 2019-06-30
Degree: Master
Type: Thesis
Country: China
Candidate: X Li
GTID: 2350330548457617
Subject: Probability theory and mathematical statistics
Abstract/Summary:
With the development of computer technology and artificial intelligence, the data explosion has become one of the defining issues of the contemporary era. In ultra-high dimensional data, the size of the data has increased significantly, yet only a small number of covariates are associated with the response variable. The model is therefore sparse, and its parameters are hard to interpret. Statisticians face the task of identifying the most important features and constructing an optimal, interpretable model that links these features to the response variable. Extracting useful features is the basis for modeling ultra-high dimensional data. Because the model is sparse, it is important to remove the most obviously irrelevant features before attempting any accurate analysis. Owing to the high dimensionality, many traditional modeling methods and high dimensional variable selection methods are unsuitable for ultra-high dimensional data analysis.

In recent years, mathematicians have developed several algorithms for this purpose. The more feasible strategy is a two-stage feature selection process. In the first stage, a fast and efficient variable screening procedure reduces the feature dimension to an appropriate size below the sample size. In the second stage, an effective variable selection method picks out the important variables from the reduced data.

This thesis proposes a feature selection method for ultra-high dimensional data. Based on the composite quantile regression model, a sparsity-restricted composite quantile estimation model is proposed to carry out the first stage of feature selection and reduce the feature dimension to an appropriate size below the sample size. The MM algorithm and the iterative hard thresholding (IHT) algorithm are introduced to solve the sparsity-restricted composite quantile estimation model. In the second stage, the LASSO and SCAD penalized likelihood methods are used to select the important features from the dimension-reduced data. The screening method naturally accounts for the joint effects of features during screening, which makes it inherently superior to existing methods. The method is further supported by simulation studies under a variety of model settings.
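To make the first-stage screening idea concrete, the following is a minimal sketch and not the thesis's algorithm. It minimizes the composite quantile loss subject to a limit of at most `s` nonzero coefficients, alternating a subgradient step with hard thresholding. The plain subgradient step stands in for the MM surrogate named in the abstract, and the function names, quantile levels, step size, and simulated data are all assumptions made for illustration.

```python
import numpy as np

def check_loss(r, tau):
    """Quantile check loss: rho_tau(r) = r * (tau - 1{r < 0})."""
    return r * (tau - (r < 0))

def cqr_objective(X, y, beta, b, taus):
    """Composite quantile loss: mean over samples, summed over quantile levels."""
    r = y - X @ beta
    return sum(check_loss(r - bk, tk).mean() for bk, tk in zip(b, taus))

def hard_threshold(v, s):
    """Keep the s largest-magnitude entries of v and set the rest to zero."""
    out = np.zeros_like(v)
    if s > 0:
        keep = np.argpartition(np.abs(v), -s)[-s:]
        out[keep] = v[keep]
    return out

def cqr_iht_screen(X, y, s, taus=(0.25, 0.5, 0.75), step=0.5, n_iter=300):
    """Illustrative sparsity-restricted CQR screening (assumes s <= X.shape[1]).

    Assumption: a subgradient step replaces the MM surrogate step.
    Returns the indices of the retained features and the best iterate found.
    """
    n, p = X.shape
    taus = np.asarray(taus, dtype=float)
    beta = np.zeros(p)
    b = np.quantile(y, taus)  # one intercept per quantile level
    best_beta, best_obj = beta.copy(), cqr_objective(X, y, beta, b, taus)
    for t in range(n_iter):
        r = y - X @ beta
        # psi[i, k] = tau_k - 1{r_i - b_k < 0}
        psi = taus[None, :] - (r[:, None] - b[None, :] < 0)
        grad = -(X.T @ psi.sum(axis=1)) / n       # subgradient w.r.t. beta
        eta = step / np.sqrt(t + 1)               # diminishing step size
        beta = hard_threshold(beta - eta * grad, s)
        b = np.quantile(y - X @ beta, taus)       # refit intercepts as residual quantiles
        obj = cqr_objective(X, y, beta, b, taus)
        if obj < best_obj:                        # keep the best iterate seen so far
            best_beta, best_obj = beta.copy(), obj
    return np.flatnonzero(best_beta), best_beta

# Hypothetical simulated data: 100 samples, 300 features, 3 active features.
rng = np.random.default_rng(0)
X = rng.standard_normal((100, 300))
y = 3.0 * X[:, 0] + 3.0 * X[:, 1] - 3.0 * X[:, 2] + rng.standard_normal(100)
support, beta_hat = cqr_iht_screen(X, y, s=20)
print(len(support), "features retained out of", X.shape[1])
```

Because thresholding acts on the full coefficient vector after a joint update, each feature is ranked in the presence of the others rather than marginally. The retained columns `X[:, support]` could then be passed to a LASSO or SCAD fit for the second stage.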
Keywords/Search Tags:Ultra-high dimensional Sparse data, Composite Quantile Regression, IHT algorithm, MM algorithm