
Feature Screening for Ultrahigh Dimensional Longitudinal Data

Posted on: 2019-04-08  Degree: Master  Type: Thesis
Country: China  Candidate: F J Wang  Full Text: PDF
GTID: 2370330545970156  Subject: Mathematics
Abstract/Summary:
Complex data are often encountered in practical problems. Among them, ultrahigh dimensional data and longitudinal data are widely used in big-data fields such as medicine, economics and meteorology. Ultrahigh dimensional data are characterized by a dimension p far greater than the sample size n, which greatly increases the computational cost and greatly reduces statistical accuracy and the stability of model algorithms. As a result, traditional dimension reduction methods, such as principal component analysis, best subset selection and variable selection, cannot solve ultrahigh dimensional problems effectively. Ultrahigh dimensional problems are generally sparse, that is, only a few covariates are correlated with the response variable, which makes fast dimensionality reduction possible. Longitudinal data are characterized by independence between individuals and correlation within individuals. When longitudinal data are combined with ultrahigh dimensionality, researchers face new challenges. Based on the structural features of ultrahigh dimensional longitudinal data, this thesis studies feature screening for such data in the linear model and the additive model under the sparsity assumption. In the ultrahigh dimensional linear model, we generalize the Sure Independence Screening (SIS) method. Using the within-subject correlation structure of longitudinal data, we construct the MSIS method with a working correlation matrix. We also establish the sure screening property for the proposed procedure, which ensures that the important predictors are selected with probability tending to 1. In the ultrahigh dimensional additive model, we generalize the NIS method, introduce the working correlation matrix, and use the quadratic inference function (QIF) to avoid direct estimation of the unknown working correlation matrix. We construct a nonparametric marginal correlation measure for the important variables and establish the QIF-NIS screening procedure, and we prove that it has the sure screening property. In summary, based on the within-subject correlation structure of longitudinal data, this thesis constructs new marginal feature screening methods for ultrahigh dimensional problems. We prove that the proposed screening procedures satisfy the sure screening property and study their finite-sample properties through numerical simulation. The results show that the proposed methods perform well both in theory and in numerical simulation.
Keywords/Search Tags:ultrahigh dimensional longitudinal data, feature screening, quadratic inference function, independence screening property, nonparametric regression
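
As a rough illustration of the marginal screening idea that the abstract builds on, the following sketch implements plain SIS for a cross-sectional linear model: rank predictors by absolute marginal correlation with the response and keep the top d. It is not the thesis's MSIS or QIF-NIS procedure, since it ignores within-subject correlation and uses no working correlation matrix. The function name `sis_screen`, the simulated data, and the cutoff d = floor(n / log n) are assumptions chosen for illustration.

```python
import numpy as np

def sis_screen(X, y, d):
    """Return indices of the d predictors with the largest absolute
    marginal (Pearson) correlation with y. Hypothetical helper."""
    Xc = X - X.mean(axis=0)            # center each predictor
    yc = y - y.mean()                  # center the response
    x_norm = np.linalg.norm(Xc, axis=0)
    x_norm[x_norm == 0] = 1.0          # constant columns get correlation 0
    omega = np.abs(Xc.T @ yc) / (x_norm * np.linalg.norm(yc))
    return np.argsort(omega)[::-1][:d]  # indices sorted by decreasing correlation

# Simulated sparse example (assumed setup): p >> n, three active predictors.
rng = np.random.default_rng(0)
n, p = 100, 2000
X = rng.standard_normal((n, p))
beta = np.zeros(p)
beta[:3] = [3.0, -3.0, 3.0]
y = X @ beta + 0.5 * rng.standard_normal(n)

d = int(n / np.log(n))                 # a commonly used screening size
selected = sis_screen(X, y, d)
```

After screening, the retained d predictors (with d smaller than n) can be passed to a conventional variable selection method. A longitudinal version would need to account for the repeated measurements within each subject, which this sketch does not attempt.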