Feature Screening For High Dimensional Multicollinear Data

Posted on: 2021-05-25
Degree: Master
Type: Thesis
Country: China
Candidate: J H Huang
Full Text: PDF
GTID: 2370330620468673
Subject: Mathematics
Abstract/Summary:
With the continuing development of information technology, statisticians can now collect high-dimensional data in fields such as finance, weather forecasting, and genetic research. Because of the high dimensionality, however, traditional statistical analysis and variable selection methods lose robustness and become difficult to apply. Other problems remain as well. When the error distribution is heavy-tailed, these methods are generally inefficient or even unsuitable, and when there is serious multicollinearity among the independent variables, the screening performance of variable selection methods is also severely degraded. To overcome multicollinearity, this thesis proposes a robust feature screening method for high-dimensional linear data in which multicollinearity is present.

The main work of this thesis is as follows. Chapter 1 describes the history and current state of research on variable screening for high-dimensional data, reviews some common feature screening methods, and outlines the organization and contributions of the thesis. Chapter 2 proposes a feature screening method that can handle high-dimensional data with multicollinearity. Much current research on high-dimensional linear models is based on the marginal effect of a single variable, so variable selection relies on independence among the variables and can become unstable when the variables are multicollinear. This thesis introduces the concept of net effect, replaces the marginal effect of each independent variable with its net effect, and proposes a feature selection method based on the global effect, which makes the screening method more widely applicable; the sure screening property of the method is also proved. Chapter 3 compares the method with other screening methods in numerical simulations and a case study, and the results show that the generalized screening method is more stable. Chapter 4 summarizes the feature selection method proposed in this thesis and discusses directions for further study.
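The contrast between marginal and net effects can be illustrated with a small population-level sketch. It is not the thesis's method. The covariance structure, the coefficient vector `beta`, and the use of the joint regression coefficient as a stand-in for a "net effect" are all assumptions made here for illustration, and the thesis's own definition may differ. The design is a well-known textbook-style construction in which one predictor has a large coefficient yet zero marginal covariance with the response.

```python
import numpy as np

# Hypothetical setup (not from the thesis): 6 predictors with unit variance.
# Every pair has correlation rho, except that predictor index 3 has
# correlation sqrt(rho) with each of the others.
p, rho = 6, 0.5
sigma = (1 - rho) * np.eye(p) + rho * np.ones((p, p))
sigma[3, :] = np.sqrt(rho)
sigma[:, 3] = np.sqrt(rho)
sigma[3, 3] = 1.0

# Assumed true coefficients: predictors 0-3 are active, 4-5 are noise.
beta = np.array([5.0, 5.0, 5.0, -15.0 * np.sqrt(rho), 0.0, 0.0])

# Marginal effect: population covariance of each predictor with
# y = X @ beta + noise, which equals sigma @ beta.
marginal = sigma @ beta

# Stand-in for a net effect (an assumption): the joint regression
# coefficient, which adjusts each predictor for all the others.
joint = np.linalg.solve(sigma, marginal)

print("marginal:", np.round(marginal, 3))
print("joint   :", np.round(joint, 3))
```

In this construction the marginal covariance of predictor 3 is zero, so a ranking by marginal effect cannot separate it from the noise predictors, while the adjusted coefficient recovers its nonzero value. In a genuinely high-dimensional setting with more predictors than observations, the joint solve used here is not available, which is why dedicated screening methods are needed.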
Keywords/Search Tags: High-dimensional linear data, Multicollinearity, Feature screening, Sure screening property