Modeling And Application Of Nomogram Based On Bayesian Model Averaging

Posted on: 2017-10-22
Degree: Master
Type: Thesis
Country: China
Candidate: J Feng
Full Text: PDF
GTID: 2334330485981374
Subject: Epidemiology and Health Statistics
Abstract/Summary:
Background: With the development of biostatistics and the rapidly growing demand for statistical methods in clinical studies, statistical models are now widely applied across clinical research, for example in exploring disease risk factors, evaluating diagnostic tests, predicting prognosis, and personalized medicine. Medical nomograms use biological and clinical prognostic information to build a statistical prognostic model that generates the probability of a clinical event. They have been widely adopted because they give a user-friendly representation of a complex mathematical formula. The most widely used method for building a nomogram is a stepwise procedure. However, this procedure is considered to have unavoidable flaws: it yields estimates that are biased upward and P values that are biased small, it ignores model uncertainty, and it cannot prevent over-fitting. In addition, such a procedure is mainly concerned with imprecision in the strength of the associations between variables and events, and it gives little or no attention to the imprecision arising from the variable selection procedure itself, because it ignores model uncertainty. Statisticians have paid increasing attention to Bayesian model averaging (BMA) as an improvement of the Bayesian method. BMA accounts for model uncertainty and infers the effect of a variable on the occurrence of an event of interest from the mean of the posterior distributions over models and variables, weighted by their posterior probabilities, in order to construct an accurate and appropriate model. However, the application of BMA to prognostic models is not widespread and needs more investigation, especially its advantages and shortcomings compared with the penalized method lasso and with stepwise selection in different survival data situations.

Aim: This study aimed to explore the properties of BMA and the data situations for which it is appropriate. We also evaluated the stability and accuracy of models based on BMA compared with stepwise and lasso on simulated data with varying sample size, residual variance and model complexity. Finally, we applied these methods to real data to validate our conclusions and improve the stability of the nomogram.

Methods: We simulated different survival data situations by varying sample size, residual variance and model complexity; the survival data were simulated following Bender's method. For BMA, we used the unit information prior to evaluate the posterior probabilities of variables. Threshold values of 50% and 95% for the variables' posterior probabilities were used to construct separate models. For lasso, we chose the shrinkage parameter lambda by cross-validation and then established the models. The inclusion and exclusion P values for stepwise were set to 0.15 and 0.05. We also applied these methods to real data from a randomized clinical trial of advanced hepatocellular carcinoma, to identify a more stable and accurate method for constructing prognostic models.

Results: When the sample size was small, there was little difference between stepwise and BMA with a 50% posterior probability threshold, either in the probability of selecting true variables or in the probability of not selecting a redundant variable. Because of its stricter variable selection standard, BMA with a 95% threshold performed slightly worse than stepwise and lasso. However, this small difference should be interpreted with caution, because the models constructed by all of these methods were of little practical applicability in this situation due to severe over-fitting. When the sample size became larger, BMA performed much better than stepwise and lasso in all aspects of model construction. First, the probability of not selecting a redundant variable was nearly 100% with the 95% threshold. Even when the threshold was reduced to 50%, this probability fell only to 70%, which was still no worse than stepwise (60%) and lasso (70%). Second, the three methods performed similarly in the probability of selecting true variables; however, the probability of selecting the true model was much higher for BMA because of its strength in excluding redundant variables. BMA also gave more precise parameter estimates, with smaller bias and higher coverage of the true values. Finally, the models constructed by BMA were not over-fitted, whereas the models constructed by stepwise and lasso significantly over-fitted the data. However, BMA also had limitations. In our simulation study, we observed an obvious decrease in the performance of models constructed by BMA when the sample size became smaller or the residual variance became larger. In addition, BMA did not perform well when the true variables were correlated. By contrast, lasso seemed to have more advantages in handling data with correlated true variables. On the real data, the model constructed by BMA appeared more stable and accurate, and the variables selected by BMA appeared to carry more prognostic information and clinical significance. Compared with BMA, stepwise and lasso selected some variables without clinical significance and failed to prevent over-fitting.

Conclusions: BMA has comparatively high practical value in prognostic model construction and personalized medicine when the sample size is appropriate, especially in the preliminary exploration of risk factors. It gives more accurate parameter estimates and has a stronger ability to identify redundant variables, leading to a more stable prognostic model containing valuable clinical prognostic information. In medical prognostic modeling, the stepwise method may be outperformed by BMA. The advantages of lasso may be more apparent when the number of candidate variables increases and multicollinearity comes into play.
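
The Methods state that survival data were simulated following Bender's method. As a minimal sketch of that idea, the code below draws event times for a Cox model with an exponential baseline hazard by inverse transform sampling, T = -log(U) / (rate * exp(x'beta)). The function name, the standard normal covariates, the baseline rate and the independent exponential censoring are all illustrative assumptions; the abstract does not report the actual simulation settings.

```python
import numpy as np

def simulate_cox_exponential(n, beta, baseline_rate=0.1, censor_rate=0.05, seed=0):
    """Simulate right-censored survival data from a Cox model.

    Hypothetical helper: exponential baseline hazard, standard normal
    covariates and exponential censoring are assumptions for illustration.
    """
    rng = np.random.default_rng(seed)
    beta = np.asarray(beta, dtype=float)

    # Covariate matrix: one row per subject, one column per coefficient.
    x = rng.standard_normal((n, beta.size))

    # Inverse transform: with U ~ Uniform(0, 1), the event time is
    # T = -log(U) / (baseline_rate * exp(x @ beta)).
    # 1 - u is used so the logarithm never sees exactly zero.
    u = rng.uniform(size=n)
    event_time = -np.log1p(-u) / (baseline_rate * np.exp(x @ beta))

    # Independent censoring times (assumed mechanism).
    censor_time = rng.exponential(scale=1.0 / censor_rate, size=n)

    time = np.minimum(event_time, censor_time)
    event = (event_time <= censor_time).astype(int)  # 1 = event observed
    return x, time, event

# Two true predictors and two redundant ones (coefficient zero).
x, time, event = simulate_cox_exponential(n=200, beta=[0.8, -0.5, 0.0, 0.0])
```

Varying `n`, the number of zero coefficients, or the noise in the covariates would give the kind of sample size and model complexity scenarios the abstract describes, although the specific grid used in the thesis is not given here.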
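
The 50% and 95% thresholds in the Methods apply to each variable's posterior probability of being in the model. A rough sketch of how such probabilities can be obtained is shown below, assuming the common BIC approximation (which corresponds to a unit information prior) and equal prior probability for every candidate model. The abstract does not say how the posterior probabilities were actually computed, and the BIC values in the example are made up purely for illustration.

```python
import numpy as np

def bma_from_bic(models, bic, n_vars):
    """Posterior model and variable inclusion probabilities from BIC values.

    Hypothetical helper. `models` is a list of tuples of included variable
    indices; `bic` holds the BIC of each fitted model. Equal prior model
    probabilities are assumed.
    """
    bic = np.asarray(bic, dtype=float)

    # P(M_k | data) is approximately proportional to exp(-BIC_k / 2).
    # Subtracting the minimum keeps the exponentials numerically stable.
    weights = np.exp(-0.5 * (bic - bic.min()))
    post_model = weights / weights.sum()

    # Inclusion probability of a variable: sum of the posterior
    # probabilities of all models that contain it.
    incl = np.zeros(n_vars)
    for included, p in zip(models, post_model):
        for j in included:
            incl[j] += p
    return post_model, incl

# Made-up BIC values for four candidate models over three variables.
models = [(), (0,), (0, 1), (0, 1, 2)]
bic = [120.0, 100.0, 101.0, 106.0]
post_model, incl = bma_from_bic(models, bic, n_vars=3)

# Variables kept at each threshold used in the abstract.
selected_50 = [j for j in range(3) if incl[j] >= 0.50]
selected_95 = [j for j in range(3) if incl[j] >= 0.95]
```

A stricter threshold can only shrink the selected set, which is consistent with the abstract's observation that the 95% rule excludes redundant variables more reliably.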
Keywords/Search Tags: Bayesian model averaging, nomogram, survival data, overfit, simulation study