Variable Selection Based On ISIS And Its Application For Semi-parametric Additive Hazard Model With High-dimensional Data

Posted on: 2018-02-21   Degree: Master   Type: Thesis
Country: China   Candidate: Z J Liu
GTID: 2310330536472252   Subject: Epidemiology and Health Statistics
Abstract/Summary:
Objective: To introduce ISIS-based variable selection and its application to the semiparametric additive hazard model with high-dimensional data, and to explore the advantages and disadvantages of the AHAZLASSOISIS, AHAZENISIS, AHAZISIS, and AHAZSSCADISIS models by analyzing high-dimensional survival data, in order to reveal the relationship between gene expression and the time to death or other survival endpoints. The aim is to provide a basis, at the genetic level, for diagnosing and treating disease and for improving treatment plans.

Methods: The basic principles and methods of the AHAZLASSOISIS, AHAZENISIS, AHAZISIS, and AHAZSSCADISIS models are introduced. Bioinformatic data that are high-dimensional, strongly correlated, and small in sample size are simulated, and the performance of the four models is compared across the different simulated data sets. Finally, prostate cancer research data from TCGA are used as an empirical study.

Results:
(1) Simulation studies show that the choice of initial penalty function makes little difference to consistency and accuracy.
(2) In terms of consistency, OSSCAD performs best among the four re-recruitment functions, followed by SSCAD, then Lasso, with EN the worst; in terms of accuracy, OSSCAD and SSCAD perform best, followed by Lasso, with EN last.
(3) Across the various data settings, for the re-penalty function SSCAD run with different numbers of steps, steps=1 gives the best consistency, while steps=2, 3, 4, 5 perform similarly to one another; in terms of accuracy, steps=1 is the worst, and steps=2, 3, 4, 5 are again similar.
(4) The accuracy of the three initial penalty functions, the four re-recruitment functions, and the different step settings of the re-penalty function SSCAD is higher when the covariate correlation coefficient is small and lower when it is large.
(5) Independent screening for the semiparametric additive hazards model was used to analyze prostate cancer gene expression data. The AHAZISIS and AHAZSSCADISIS models had better model interpretability in the empirical study and, according to the p-values of the log-rank test, better predictive performance as well.

Conclusion: The AHAZISIS and AHAZSSCADISIS models had high estimation accuracy and better model interpretability than the other models. The AHAZLASSOISIS and AHAZENISIS models performed poorly on high-dimensional, strongly correlated, small-sample survival data. Therefore, the AHAZISIS and AHAZSSCADISIS models are the ideal models for analyzing high-dimensional, strongly correlated, small-sample survival data.
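
The simulation setting described in the Methods (high-dimensional, strongly correlated covariates with a small sample) can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration and is not taken from the thesis: the AR(1) correlation structure, the constant baseline hazard, the exponential censoring, the parameter values, and the function names `simulate_additive_hazard`, `marginal_scores`, and `screen_top`. The ranking statistic is a simple marginal score of the Lin-Ying type, and the cutoff n/log(n) is the conventional sure independence screening choice. Only the initial screening step is shown; the iterative recruitment and re-penalty steps of ISIS are not implemented here.

```python
import numpy as np


def simulate_additive_hazard(n=100, p=1000, rho=0.5, n_active=3, beta=0.5,
                             baseline=3.5, censor_rate=1.0, seed=0):
    """Simulate survival data with hazard = baseline + z @ coef (assumed setup)."""
    if baseline - n_active * abs(beta) * 2.0 <= 0:
        raise ValueError("baseline too small: hazard could become non-positive")
    rng = np.random.default_rng(seed)
    # AR(1) Gaussian covariates: corr(Z_j, Z_k) = rho**|j-k| before clipping.
    z = np.empty((n, p))
    z[:, 0] = rng.standard_normal(n)
    for j in range(1, p):
        noise = rng.standard_normal(n)
        z[:, j] = rho * z[:, j - 1] + np.sqrt(1.0 - rho ** 2) * noise
    # Clip to [-2, 2] so the additive hazard stays strictly positive.
    z = np.clip(z, -2.0, 2.0)
    coef = np.zeros(p)
    coef[:n_active] = beta
    rate = baseline + z @ coef
    # Constant hazard per subject, so event times are exponential.
    event_time = rng.exponential(1.0 / rate)
    censor_time = rng.exponential(1.0 / censor_rate, size=n)
    time = np.minimum(event_time, censor_time)
    status = (event_time <= censor_time).astype(int)
    return z, time, status, coef


def marginal_scores(z, time, status):
    """|sum over events of (Z_ij - mean of Z_j over the risk set)| per covariate."""
    order = np.argsort(-time)          # sort subjects by descending time
    z_sorted = z[order]
    d_sorted = status[order]
    # With descending times, the risk set of row i is rows 0..i.
    risk_sum = np.cumsum(z_sorted, axis=0)
    risk_size = np.arange(1, len(time) + 1)[:, None]
    risk_mean = risk_sum / risk_size
    return np.abs(((z_sorted - risk_mean) * d_sorted[:, None]).sum(axis=0))


def screen_top(scores, n):
    """Keep the indices of the d = floor(n / log n) largest scores."""
    d = int(n / np.log(n))
    return np.argsort(scores)[::-1][:d]


if __name__ == "__main__":
    z, time, status, coef = simulate_additive_hazard(n=100, p=1000, rho=0.8)
    keep = screen_top(marginal_scores(z, time, status), n=len(time))
    print("covariates retained:", len(keep))
    print("true covariates among them:", np.intersect1d(keep, np.flatnonzero(coef)))
```

Raising `rho` toward 1 makes neighbouring covariates harder to tell apart by marginal scores alone, which is the kind of setting in which iterative screening followed by a penalized fit is usually motivated.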
Keywords/Search Tags:High-Dimensional Data, AHAZISIS model, AHAZSSCADISIS model, AHAZLASSOISIS model, AHAZENISIS model