Feature Screening For Ultrahigh-Dimensional Survival Data And Outlier Detection

Posted on: 2019-06-18  Degree: Doctor  Type: Dissertation
Country: China  Candidate: J Zhang  Full Text: PDF
GTID: 1317330548950133  Subject: Statistics
Abstract/Summary:
This dissertation addresses two issues: feature screening for ultrahigh-dimensional survival data, and outlier detection for survival data in exponential regression. For ultrahigh-dimensional data, sure independence screening methods can effectively reduce the dimensionality while ensuring that all the active variables are retained with high probability. However, most existing screening procedures were developed for completely observed ultrahigh-dimensional data and are not applicable to censored survival data. Three novel model-free screening methods are proposed that are specially tailored to ultrahigh-dimensional survival data, based respectively on the censored cumulative residual, the correlation rank, and the Kolmogorov-Smirnov test statistic. The sure screening property and ranking consistency are established under some mild regularity conditions, and the superior performance of the proposed methods over existing screening methods is demonstrated by extensive simulation studies. As an illustration, the proposed methods are applied to the mantle cell lymphoma study.

Most real-world data sets contain outliers that may seriously affect estimation, inference and model selection. Although methods for outlier detection are relatively well developed for completely observed data, the detection of outliers in censored survival data has received little attention. Here, we propose a penalized likelihood method to detect possible outliers in the exponential regression model when it is used to fit censored survival data. We recast the outlier detection problem as parameter estimation in a high-dimensional regularized regression and employ the coordinate descent algorithm to facilitate the computation. A key feature of the proposed method is that it handles outlier detection and estimation of the regression coefficients simultaneously. Both extensive simulation studies and an illustrative real example show that the proposed method works well for outlier detection as well as for parameter estimation in the exponential regression model.
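The abstract does not state the exact penalty or parametrization, so the following is only a minimal sketch of how a coordinate-wise update for outlier detection might look under stated assumptions. It assumes a mean-shift formulation in which observation i has hazard rate exp(eta_i + gamma_i), where eta_i is the linear predictor from the regression coefficients and gamma_i is a hypothetical per-observation shift, with an L1 penalty lam * |gamma_i|. The names `gamma_update` and `flag_outliers` are illustrative and are not taken from the dissertation. Under these assumptions, the one-dimensional problem in gamma_i has a closed-form minimizer, and a nonzero gamma_i marks a candidate outlier.

```python
import math


def gamma_update(t, delta, eta, lam):
    """Exact minimizer over gamma of the assumed penalized objective
        -delta * (eta + gamma) + t * exp(eta + gamma) + lam * |gamma|
    for one observation with time t > 0, event indicator delta (1 = event,
    0 = censored), linear predictor eta, and penalty level lam > 0.
    """
    m = t * math.exp(eta)  # cumulative hazard at t when gamma = 0
    if delta - lam > m:
        # Stationary point on the positive side: t * exp(eta + gamma) = delta - lam
        return math.log((delta - lam) / m)
    if delta + lam < m:
        # Stationary point on the negative side: t * exp(eta + gamma) = delta + lam
        return math.log((delta + lam) / m)
    # Zero lies in the subdifferential at gamma = 0, so the shift is thresholded out
    return 0.0


def flag_outliers(times, events, etas, lam):
    """Indices whose shift parameter is nonzero for fixed linear predictors."""
    return [
        i
        for i, (t, d, e) in enumerate(zip(times, events, etas))
        if gamma_update(t, d, e, lam) != 0.0
    ]
```

In a full coordinate descent scheme, this shift update would alternate with updates of the regression coefficients until convergence; that part is omitted here because the abstract gives no details on it. A larger `lam` flags fewer observations.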
Keywords/Search Tags:Censored data, High-dimensional data, Kaplan-Meier estimator, Outlier detection, Survival analysis, Variable screening