Variable Selection Method Based On High-Dimensional Multiple Correlation Coefficients

Posted on: 2022-03-25  Degree: Master  Type: Thesis
Country: China  Candidate: X S Wang  Full Text: PDF
GTID: 2517306491460274  Subject: Statistics
Abstract/Summary:
Regression analysis is one of the most widely used methods of multivariate statistical analysis. Its purpose is to model the relationships among multiple variables. It is well known that variable selection in multivariate linear regression is essential for interpretation, subsequent statistical inference, and prediction. Classical variable selection methods fall mainly into two categories: the first consists of selection criteria built on optimization ideas, such as the information criteria AIC, BIC, and GIC and the C_p criterion; the second screens variables by testing the regression coefficients. With the rapid development of the Internet, the dimensionality of data sets has grown larger and larger, so these classical statistical methods are no longer applicable. This thesis therefore considers the variable selection problem when the number of response variables, the number of explanatory variables, and the sample size in the multiple linear regression model tend to infinity proportionally. It applies large-dimensional random matrix theory to the multiple correlation coefficients between large-dimensional response variables and explanatory variables, and proposes two variable selection methods accordingly: one is a variable selection method based on consistency; the other is a multiple testing method based on controlling the false discovery rate (FDR). We give the corresponding algorithms and verify the superiority of the proposed methods through simulation studies and real air quality data sets.
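As background for the FDR-based approach, the following is a minimal sketch of the standard Benjamini-Hochberg step-up procedure, the textbook way to control the false discovery rate across many tests. It is not the thesis's algorithm: the abstract does not say how its test statistics or p-values are built from the multiple correlation coefficients, so the function name `benjamini_hochberg`, the level `alpha`, and the p-values below are all hypothetical and chosen only for illustration.

```python
def benjamini_hochberg(p_values, alpha=0.05):
    """Return sorted indices of hypotheses rejected by the BH step-up rule.

    Assumption: each p-value corresponds to one candidate explanatory
    variable; how such p-values are obtained is outside this sketch.
    """
    m = len(p_values)
    if m == 0:
        return []
    # Order hypothesis indices by ascending p-value.
    order = sorted(range(m), key=lambda i: p_values[i])
    # Find the largest 1-based rank k with p_(k) <= k * alpha / m.
    k_max = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= rank * alpha / m:
            k_max = rank
    # Reject the k_max hypotheses with the smallest p-values.
    return sorted(order[:k_max])


# Hypothetical p-values, one per candidate variable (illustration only).
p_vals = [0.001, 0.20, 0.012, 0.74, 0.030, 0.004]
selected = benjamini_hochberg(p_vals, alpha=0.05)
print(selected)
```

The step-up rule rejects every hypothesis ranked at or below the largest rank that passes its threshold, even if some smaller ranks failed their own thresholds; this is what distinguishes it from testing each p-value against a fixed cutoff.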
Keywords/Search Tags: Large-dimensional multiple linear regression, Variable selection, Consistency, False discovery rate control