
Research On Liver Cancer Prediction Model Based On Interpretable Machine Learning

Posted on: 2023-01-29  Degree: Master  Type: Thesis
Country: China  Candidate: K Yuan
GTID: 2544306806969719  Subject: Applied Statistics
Abstract/Summary:
Liver cancer in its early to middle stages is difficult to diagnose with traditional imaging methods, and patients may be hard to identify through conventional diagnosis and treatment procedures. How, then, can liver cancer be diagnosed at these stages? Machine learning models can mine the early examination information of cases to build a prediction model that identifies as many as possible of the cases that will eventually develop cancer, minimizes missed diagnoses, and provides the cancer probability of a sample before further examination (such as CT or histopathological biopsy), helping cancer patients secure the golden window for treatment.

Machine learning methods often yield models with good predictive performance, but when solving practical problems, especially rigorous ones such as clinical diagnosis, it is very important to establish the reliability of the model. Generally, machine learning models only improve prediction accuracy and do not explain why the model achieves such good results. In specific fields such as clinical research, we cannot simply train a prediction model on the data as in conventional machine learning, pursuing only predictive performance while ignoring interpretability; the model must also be interpretable, which is very important for the development of the research. With the development of interpretable machine learning research, many methods are now available to explain the role of models and features and to make the "black box" of machine learning transparent. Only by reasonably explaining the model can we better apply machine learning to the clinical field, let the machine learn from the experience of experts, and even alleviate problems such as the long training cycle of experts, so as to realize artificial intelligence clinical diagnosis and truly combine data and machines to solve practical clinical problems.

Taking as an example the cohort data of hepatitis patients provided by an “AAA” hospital in Nanchang, Jiangxi Province, this thesis obtains the patients' basic information, blood examination results, tumor markers and other features, carries out data cleaning, handles redundant features and missing data, and uses feature engineering to select the feature combination with strong predictive power. To deal with class imbalance, random oversampling, random undersampling and SMOTE are used; the appropriate sampling method and ratio are then selected to build a logistic regression model, machine learning models, a neural network model and ensemble learning models. With area under the curve (AUC), recall, accuracy and F1-score as the evaluation metrics, XGBoost is finally obtained as the optimal model, with an AUC of 0.84. Compared with the logistic regression model widely used in clinical practice, the XGBoost model improves out-of-sample prediction accuracy.

Interpretability analysis is then carried out on the machine learning model: the way the features work in the model is explained, and it is concluded that the direction of each feature's effect is basically consistent with its clinical significance. The analysis mainly uses feature importance, partial dependence plots (PDP) and Shapley values, which make the internal workings of the complex model more transparent and increase its reliability. The model gives a good explanation for the diagnosis of whether a case has cancer, which helps doctors understand each case in depth, reduces doctors’ workload and gives patients a better chance of being diagnosed at the optimal time.
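The abstract names random oversampling as one remedy for class imbalance and AUC, recall, accuracy and F1-score as the evaluation metrics. The following is a minimal sketch of those ideas in plain Python, not the thesis's actual pipeline: the function names, the `ratio` parameter, the 0.5 decision threshold and the toy labels and scores are all assumptions made for illustration.

```python
import random


def random_oversample(X, y, ratio=1.0, seed=0):
    """Duplicate randomly chosen minority rows until the minority class
    has at least ratio * (majority size) rows. Assumes 0/1 labels and
    that both classes are present. `ratio` is a hypothetical parameter."""
    rng = random.Random(seed)
    pos = [i for i, label in enumerate(y) if label == 1]
    neg = [i for i, label in enumerate(y) if label == 0]
    minority, majority = (pos, neg) if len(pos) < len(neg) else (neg, pos)
    target = int(ratio * len(majority))
    n_extra = max(0, target - len(minority))
    extra = [rng.choice(minority) for _ in range(n_extra)]
    idx = list(range(len(y))) + extra
    return [X[i] for i in idx], [y[i] for i in idx]


def recall_accuracy_f1(y_true, y_pred):
    """Recall, accuracy and F1-score for binary 0/1 labels."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    recall = tp / (tp + fn) if tp + fn else 0.0
    precision = tp / (tp + fp) if tp + fp else 0.0
    accuracy = (tp + tn) / len(pairs)
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return recall, accuracy, f1


def auc(y_true, scores):
    """AUC as the probability that a random positive case receives a
    higher score than a random negative case (ties count one half)."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Toy data (invented for illustration): 6 non-cancer cases, 2 cancer cases.
y_true = [0, 0, 0, 0, 0, 0, 1, 1]
scores = [0.1, 0.2, 0.3, 0.4, 0.6, 0.35, 0.5, 0.9]
y_pred = [1 if s >= 0.5 else 0 for s in scores]  # assumed 0.5 threshold

X = [[s] for s in scores]
X_bal, y_bal = random_oversample(X, y_true, ratio=1.0, seed=0)

if __name__ == "__main__":
    print("balanced class counts:", y_bal.count(0), y_bal.count(1))
    print("recall, accuracy, F1:", recall_accuracy_f1(y_true, y_pred))
    print("AUC:", auc(y_true, scores))
```

In a setting like the one described, oversampling would normally be applied to the training split only, so that duplicated minority cases do not leak into the data used to compute out-of-sample metrics.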
Keywords/Search Tags:Disease diagnosis, Machine learning, Model interpretability, XGBoost