Establishment And Visualization Of A Noninvasive Prediction Model For Hepatic Fibrosis Based On Biochemical Information Of Hepatitis B Patients

Posted on: 2021-01-15
Degree: Master
Type: Thesis
Country: China
Candidate: W S Wang
GTID: 2404330611491625
Subject: Public health
Abstract/Summary:
Objective: This study aims to identify suitable methods of dimension reduction and variable screening for selecting predictive factors, and then to build a noninvasive predictive model of liver fibrosis from specified serum biochemical indexes of hepatitis B patients who had undergone liver biopsy. The model is then visualized so that it can help identify which patients need a biopsy. Patients can also use it to assess their own condition. The intended benefits are lower cost, greater economic efficiency, and better support for clinical decision-making, diagnosis and treatment.

Methods: A total of 1224 outpatients with liver disease were collected from Shengjing Hospital affiliated to China Medical University between 2009 and 2014, and 867 patients were included after applying the inclusion and exclusion criteria. Categorical variables were described by frequency and composition ratio, and continuous variables by the median and the upper and lower quartiles; univariate analysis was then performed. The data were divided into two groups at a ratio of 7:3, for model training and validation respectively. In the training group, candidate variables were determined by combining two screening methods, LASSO and random forest. The selected variables were discretized by decision tree binning and entered into a logistic regression model to obtain the coefficient of each predictor. The predictive ability of the model was evaluated with the calibration curve, the receiver operating characteristic (ROC) curve and the area under the curve (AUC), and the overall benefit of the model was evaluated with the decision curve. Finally, the model was visualized as a nomogram.

Results: After the descriptive and univariate analysis, 21 predictors were retained. The 867 patients were divided into a training set and a validation set at a 7:3 ratio, with no statistically significant difference between the two groups. In the training data, the LASSO analysis ruled out six variables and the random forest analysis excluded seven. The combined selection ruled out 10 predictors and kept 11 variables: TT, APOB, DD, CHOL, AST, ALPK, APTT, TBA, GGT, PLT and AFP, which were carried forward to model building. Single-variable decision tree binning discretized these 11 variables: 4 into two categories, 5 into three categories, and 2 into four categories. A binary logistic regression model was fitted with the binned variables as explanatory variables and the liver biopsy staging result as the dependent variable. Seven variables remained in the final model: APOB and PLT acted as protective factors for liver fibrosis, and the other five (GGT, TBA, AFP, APTT and TT) as risk factors. Tests of the univariate fit showed that the overall fit of the model was good and that no variable was over-fitted. The calibration curves for the training set and the validation set both agreed closely with the observed outcomes. The area under the ROC curve was 0.834 (0.802-0.862) in the training set and 0.818 (0.765-0.863) in the validation set. Both values exceed 0.8, indicating good discrimination.

Conclusion: In this study, joint screening with LASSO and random forest yielded 11 candidate variables, which were then discretized by decision tree binning. The final model kept APOB and PLT as protective factors and GGT, TBA, AFP, APTT and TT as risk factors for predicting fibrosis. The model performed well: the area under the ROC curve was 0.834 in the training set and 0.818 in the validation set, and its accuracy and overall benefit were higher than those of the general model. The resulting nomogram is more intuitive and easier to use, is worthy of wide promotion, and can provide a basis for other studies on fibrosis prediction.
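
The two evaluation steps named in the abstract, the 7:3 split and the area under the ROC curve, can be illustrated with a minimal sketch. This is not the thesis's code: the function names `split_train_validation` and `roc_auc`, the random seed, the rounding rule for the split, and the toy labels and scores are all assumptions made for illustration. The AUC here is computed in its rank form, as the fraction of (positive, negative) pairs in which the positive case receives the higher predicted score.

```python
import random


def split_train_validation(records, train_fraction=0.7, seed=0):
    """Shuffle records and split them into training and validation lists.

    The seed and the rounding rule are illustrative assumptions; the
    abstract only states that a 7:3 ratio was used.
    """
    shuffled = list(records)
    random.Random(seed).shuffle(shuffled)
    cut = round(len(shuffled) * train_fraction)
    return shuffled[:cut], shuffled[cut:]


def roc_auc(labels, scores):
    """Area under the ROC curve via pairwise comparison.

    Counts the share of (positive, negative) pairs where the positive
    case has the higher score; tied scores count as half a win.
    """
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("both classes must be present")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


# Hypothetical toy data: 1 = fibrosis on biopsy, 0 = no fibrosis.
toy_labels = [0, 0, 1, 1]
toy_scores = [0.1, 0.4, 0.35, 0.8]
print(roc_auc(toy_labels, toy_scores))  # 3 of 4 pairs ranked correctly

# Splitting 867 placeholder patient indices at 7:3 under this rounding rule.
train, validation = split_train_validation(range(867))
print(len(train), len(validation))
```

Under this rounding rule, 867 records split into 607 and 260; the abstract does not report the actual group sizes, so these counts should not be read as the thesis's own.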
Keywords/Search Tags: Liver fibrosis, LASSO, Random forest, Decision tree, Nomogram