Ultra-high Dimensional Longitudinal Quantile Feature Screening Based On Modified Cholesky Decomposition

Posted on: 2024-05-18
Degree: Master
Type: Thesis
Country: China
Candidate: X Y Chen
GTID: 2530307106499194
Subject: Statistics
Abstract/Summary:
With the rapid development of information technology and the increasing complexity of research problems, ultra-high dimensional longitudinal data have become increasingly common. Although such data contain more information, the sparsity of ultra-high dimensional data and the within-subject correlation of longitudinal data make conventional statistical analysis more difficult. Research on feature screening for ultra-high dimensional longitudinal data is therefore of great significance. Compared with mean regression, quantile regression is less sensitive to outliers and yields more robust results. This thesis studies feature screening for ultra-high dimensional longitudinal data in quantile regression models. The specific content is as follows.

Firstly, this thesis proposes an optimal quantile estimating equation based on a quantile regression model for ultra-high dimensional longitudinal data. For the unknown covariance matrices in that estimating equation, it proposes a novel estimation idea for the ultra-high dimensional setting and establishes an iterative feature screening algorithm. In the first step, an independent quantile feature screening method is used to quickly reduce the ultra-high dimensional longitudinal quantile regression model to a low-dimensional sparse linear quantile regression model; the covariance matrices corresponding to this sparse model are taken as the working covariance matrices, and the modified Cholesky decomposition is introduced to model them dynamically. In the second step, the estimates of the working covariance matrices are substituted into the optimal quantile estimating equation, and the vector of regression coefficients is set to the zero vector to establish a screening index for feature screening. In the third step, the covariance matrices of the sparse linear quantile regression model obtained in the second step are used as new working covariance matrices and are again modeled dynamically by the modified Cholesky decomposition, after which the feature screening of the second step is repeated. The second and third steps are repeated until the estimates of the working covariance matrices converge. The converged covariance matrix estimates are then substituted into the optimal quantile estimating equation to establish a new screening index, which yields a quantile feature screening method for ultra-high dimensional longitudinal data based on the modified Cholesky decomposition.

Secondly, under some regularity conditions, the consistency of the parameter estimates and of the working covariance matrix estimates under the modified Cholesky decomposition is proved. The large-sample asymptotic properties of the proposed iterative feature screening method are then discussed. It is shown that the screening index separates important from unimportant covariates with probability 1, and that the screening procedure selects the set of important covariates with probability 1 while reducing the ultra-high dimensional model to a sub-model of polynomial order size.

Thirdly, simulation studies verify that, in finite samples, the proposed method screens more accurately and more robustly than independent quantile feature screening methods. Using the yeast cell cycle gene expression dataset, the practicality of the proposed method is illustrated by identifying yeast cell cycle transcription factors.
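The modified Cholesky decomposition that the abstract relies on can be illustrated with a minimal sketch. It assumes the standard form of the decomposition, in which a positive definite covariance matrix Sigma satisfies T Sigma T' = D, with T unit lower triangular and D diagonal. The function name `modified_cholesky` and the AR(1) example are illustrative assumptions, not taken from the thesis, and the sketch covers only the decomposition itself, not the thesis's dynamic modeling or screening index.

```python
import numpy as np

def modified_cholesky(sigma):
    """Return (t, d) with t @ sigma @ t.T == diag(d).

    t is unit lower triangular; its below-diagonal entries are the negatives
    of the coefficients from regressing each measurement on its predecessors.
    d holds the corresponding innovation (prediction error) variances.
    Hypothetical helper for illustration only.
    """
    sigma = np.asarray(sigma, dtype=float)
    c = np.linalg.cholesky(sigma)          # sigma = c @ c.T, c lower triangular
    scale = np.diag(c)                     # positive diagonal of c
    l_unit = c / scale                     # scale column j by 1/scale[j]: unit diagonal
    t = np.linalg.solve(l_unit, np.eye(sigma.shape[0]))  # t = inverse of l_unit
    d = scale ** 2                         # innovation variances
    return t, d

# Assumed example: AR(1) within-subject covariance with correlation rho.
rho = 0.5
idx = np.arange(4)
sigma = rho ** np.abs(idx[:, None] - idx[None, :])
t, d = modified_cholesky(sigma)
print(np.round(t, 3))   # first subdiagonal is -rho, other off-diagonal entries are 0
print(np.round(d, 3))   # innovation variances: 1, then 1 - rho**2
```

One appeal of this parameterization is that the entries of T and the logarithms of D are unconstrained, so any values for them map back to a positive definite covariance matrix, which makes them convenient targets for regression-type modeling.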
Keywords/Search Tags: Longitudinal data, Ultra-high dimensional feature screening, Modified Cholesky decomposition, Quantile regression