A Study On Effects Of Censoring Proportions To Cox Regression Model In Survival Analysis

Posted on: 2010-01-01 | Degree: Doctor | Type: Dissertation
Country: China | Candidate: J Qian | Full Text: PDF
GTID: 1100360275497328 | Subject: Epidemiology and Health Statistics
Abstract/Summary:
Objective: Cox regression is one of the most common methods in survival studies. In practice, the reliability and accuracy of Cox regression for survival data with a large censoring proportion are a widespread concern, yet these issues have not been studied systematically. The aim of this study is to explore the effects of the censoring proportion on the Cox regression model and to determine the limits of the censoring proportion when using the Cox model. Resolving these issues not only bears on the study of censored data but also provides a reference standard for applications of survival analysis, enhancing the reliability of risk factor analysis and the quality of scientific research findings.
Methods: From the algorithm for the Cox partial likelihood estimate, we know that the regression coefficients are determined by the order in which events and censoring occur rather than by the specific values of the survival times, and that censored observations contribute information only to the risk sets of the Cox partial likelihood function. The Cox regression estimate would therefore be biased when the censoring proportion is large. In this study, the Monte Carlo method was used to assess the bias, accuracy, and reliability of the Cox model under different censoring proportions.
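The dependence on order rather than on the survival times themselves can be checked numerically. The following is a minimal sketch, not the dissertation's code: it assumes a single covariate and no tied event times, and the function name `cox_partial_loglik` and the toy data are illustrative only.

```python
import math

def cox_partial_loglik(beta, times, events, x):
    """Cox log partial likelihood for one covariate (assumes no tied event times)."""
    loglik = 0.0
    for i, (t_i, d_i) in enumerate(zip(times, events)):
        if not d_i:
            continue  # censored subjects enter only through the risk sets below
        # Risk set: everyone still under observation at time t_i.
        risk_sum = sum(math.exp(beta * x[j]) for j, t_j in enumerate(times) if t_j >= t_i)
        loglik += beta * x[i] - math.log(risk_sum)
    return loglik

# Toy data (hypothetical): 1 = event observed, 0 = censored.
times = [2.0, 3.5, 1.0, 6.0, 4.2]
events = [1, 0, 1, 1, 0]
x = [0.0, 1.0, 1.0, 0.0, 1.0]

# Any strictly increasing transformation of time keeps the ordering,
# so the log partial likelihood is unchanged.
cubed = [t ** 3 for t in times]
same = cox_partial_loglik(0.7, times, events, x) == cox_partial_loglik(0.7, cubed, events, x)
```

Because every risk set is defined by comparisons between times, rescaling the time axis leaves the estimate untouched, while turning an event into a censored observation removes a whole term from the sum.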
Parameter settings.
1. Covariates. Single-factor and multi-factor models (two, four, and eight covariates) were considered. Irrelevant factors were included in the multi-factor analyses to evaluate the Cox model's ability to filter out such factors.
2. Survival distribution. Of the known survival distributions, only three satisfy the Cox proportional hazards assumption. Survival times were simulated from these three: the exponential, Weibull, and Gompertz distributions.
3. Censoring distribution. Type I censoring was set to a truncation distribution; type III censoring (random censoring) was set to exponential and uniform distributions.
4. Types of covariates. Both discrete and continuous random variables were used. Common distributions were of interest, such as the two-point, uniform, normal, and Gamma distributions.
5. Sample size. Sample sizes were set as multiples of the number of covariates: 20, 40, 80, … 200 times in the single-factor analysis, plus 10 and 500 times in the multi-factor analysis. Sample size was divided into three levels by its ratio to the number of covariates: small if less than 20 times the number of covariates, moderate if between 20 and 100 times, and large if more than 100 times.
6. Simulation repetitions. Each simulation was run for 500 replications.
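The settings above can be illustrated with a short sketch of how proportional-hazards survival times are commonly drawn by inverting the cumulative baseline hazard. The parameterizations (`lam`, `nu`, `alpha`), the function names, and the single two-point covariate are assumptions made for illustration; the abstract does not state the parameter values actually used.

```python
import math
import random

def exponential_time(u, lam, lp):
    # Baseline hazard h0(t) = lam; lp is the linear predictor beta * x.
    return -math.log(u) / (lam * math.exp(lp))

def weibull_time(u, lam, nu, lp):
    # Baseline hazard h0(t) = lam * nu * t**(nu - 1).
    return (-math.log(u) / (lam * math.exp(lp))) ** (1.0 / nu)

def gompertz_time(u, lam, alpha, lp):
    # Baseline hazard h0(t) = lam * exp(alpha * t), assuming alpha > 0.
    return math.log(1.0 - alpha * math.log(u) / (lam * math.exp(lp))) / alpha

def simulate_censored(n, beta, lam, censor_rate, seed=0):
    """Exponential survival times with exponential random censoring.

    Returns a list of (observed_time, event_indicator, covariate) rows.
    """
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        x = rng.choice([0, 1])          # two-point covariate
        u = 1.0 - rng.random()          # uniform on (0, 1], avoids log(0)
        t = exponential_time(u, lam, beta * x)
        c = rng.expovariate(censor_rate)  # random censoring time
        rows.append((min(t, c), int(t <= c), x))
    return rows

rows = simulate_censored(n=2000, beta=0.0, lam=1.0, censor_rate=1.0)
censoring_proportion = 1.0 - sum(event for _, event, _ in rows) / len(rows)
```

In this sketch, raising `censor_rate` relative to the event rate increases the censoring proportion; with equal rates and `beta = 0`, about half of the observations are expected to be censored.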
Evaluation criteria.
1. Bias. The relative mean absolute deviation of the regression coefficient (MAD) and the relative signed error of the regression coefficient (BIAS) were used to assess bias. MAD is the relative absolute deviation of the regression coefficient under censored data from that under complete data; BIAS is the corresponding relative signed error. The smaller the BIAS and MAD, the less the bias.
2. Accuracy. The ratio of the standard deviation under censored data to that under complete data (Stdratio) was used to measure accuracy. The closer Stdratio is to 1, the more accurate the estimate.
3. Validity. The ratio of significance under censored data to that under complete data (Propower) was used to evaluate validity. The larger the Propower, the more valid the model.
Procedure of the simulation study.
1. Complete survival data sets were simulated from the three survival distributions mentioned above with different parameters, after the inverse of the cumulative baseline hazard function had been calculated.
2. Using iterative calculations, censoring-time data sets were generated from each simulated complete data set by random sampling under various censoring conditions. Combining censoring times with survival times produced censored survival data sets with different censoring proportions.
3. The gold standard was defined as the Cox model estimates under complete data. The models of interest, at different censoring proportions, were evaluated with respect to parameter estimation, significance testing, and so forth. The evaluation criteria calculated from the censored-data models were compared with those of the gold-standard model.
4. The simulation results were analyzed in terms of the evaluation criteria under the various censoring conditions. The evaluation criteria were monotone functions of the censoring proportion. To study this monotonicity, the concept of differences was introduced. The sign of the first-order difference indicates whether the function is increasing or decreasing, whereas the second-order difference represents its acceleration. The function is approximately linear when the second-order difference is close to zero, and it increases (or decreases) at an accelerating rate when the second-order difference deviates from zero.
Results:
Bias. The bias of the Cox regression model was described mainly by MAD and BIAS.
1. The results for MAD and BIAS were similar across different types of distributions and covariates.
2. Less bias occurred under type I censoring. Under type III censoring, the results were similar across the various distributions.
3. MAD was influenced by the magnitude of the regression coefficient: the smaller the coefficient, the larger the MAD.
4. The bias increased gradually as the censoring proportion increased, and it accelerated at large censoring proportions. The point at which the bias began to accelerate depended on the sample size, expressed as a multiple of the number of covariates. For small samples (below 20 times), accelerated bias occurred at 70% censoring; for moderate samples (20 to 100 times), at 80% censoring; for large samples (above 100 times), at 90% censoring.
Accuracy. The variability of the regression coefficients was described by Stdratio. The value of Stdratio is determined mainly by the censoring proportion. It is a monotonically increasing function of the censoring proportion, and this upward trend accelerates at 70% censoring. As the graphics show, the increasing and accelerating trend is not affected by the sample size. The values are also approximately the same under the various parameter conditions.
Validity. The validity of the Cox regression model was described by Propower. The value of Propower was influenced by covariate variation, sample size, and other study factors. As a rule, it decreases gradually as the censoring proportion increases.
Extreme values. Extreme values occur frequently with small samples and heavy censoring. When Stdratio is greater than 100, the minimum MAD is 4.5 and the maximum is more than 1000, so the estimates produced by Cox regression are meaningless. Fewer extreme values were detected under type I censoring than under random censoring. More attention should be paid to extreme values when the sample size is small. With fewer than 10 events, the probability of an extreme value was 5%, rising to 20% with fewer than 6 events.
Conclusion: An increasing censoring proportion increases bias and decreases accuracy and validity. Accuracy drops sharply, with Stdratio accelerating, once the censoring proportion reaches 70% or more. With a small sample size, the accelerating bias and the incidence of extreme values should be noted when censoring exceeds 70%. With a moderate sample size, the bias accelerates when the censoring proportion is 80% or more, and with a large sample size, when it is 90% or more. When conducting survival analysis with the Cox regression model, it is suggested that the censoring proportion be less than 70% if the sample size is within 20 times the number of covariates, less than 80% if it is between 20 and 100 times, and less than 90% if it is over 100 times. In conclusion, the censoring proportion should be limited to a reasonable level when the Cox regression model is used for survival analysis in practice.
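As a rough illustration of the evaluation criteria, the difference-based check for acceleration, and the suggested censoring limits, the sketch below shows one way they might be computed. The abstract gives no exact formulas, so the definitions of MAD, BIAS, and Stdratio here are assumed readings of the verbal descriptions, and the handling of sample sizes at exactly 20 or 100 times the number of covariates is likewise an assumption. Propower is omitted, and all function names are hypothetical.

```python
import statistics

def mad_and_bias(beta_censored, beta_complete):
    """Assumed definitions: errors relative to the complete-data estimate,
    averaged over paired simulation replications."""
    signed = [(bc - bf) / bf for bc, bf in zip(beta_censored, beta_complete)]
    mad = sum(abs(e) for e in signed) / len(signed)
    bias = sum(signed) / len(signed)
    return mad, bias

def std_ratio(beta_censored, beta_complete):
    # Spread of censored-data estimates relative to complete-data estimates.
    return statistics.stdev(beta_censored) / statistics.stdev(beta_complete)

def differences(values):
    """First- and second-order differences of a criterion evaluated on an
    evenly spaced grid of censoring proportions."""
    first = [b - a for a, b in zip(values, values[1:])]
    second = [b - a for a, b in zip(first, first[1:])]
    return first, second

def max_censoring_proportion(n, n_covariates):
    """Suggested upper limit on censoring from the conclusion above."""
    ratio = n / n_covariates
    if ratio < 20:
        return 0.70   # small sample
    if ratio <= 100:
        return 0.80   # moderate sample (boundary handling is an assumption)
    return 0.90       # large sample

mad, bias = mad_and_bias([1.5, 1.0, 4.0], [1.0, 2.0, 4.0])
first, second = differences([1, 2, 4, 8])
```

A second-order difference that stays near zero suggests a roughly linear trend in the criterion, while values moving away from zero mark the censoring proportion at which bias or Stdratio begins to accelerate.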
Keywords/Search Tags:Cox regression, survival distribution, censoring proportion, covariate