
Construction Of A Risk Prediction Model Of Breast Cancer-related Lymphedema Based On Machine Learning

Posted on: 2024-02-18 | Degree: Master | Type: Thesis
Country: China | Candidate: J L Du | Full Text: PDF
GTID: 2544307079477174 | Subject: Care
Abstract/Summary:
Objective: Machine learning models that predict the occurrence of cancer symptoms can support clinical decision-making and improve the quality of cancer care. This study aims to develop and validate a series of machine learning risk prediction models for breast cancer-related lymphedema (BCRL), in order to identify high-risk patients early and reduce the incidence of lymphedema after breast cancer surgery.
Methods: After a thorough review of the risk factors for BCRL, a questionnaire on lymphedema risk factors after breast cancer surgery was developed, and clinical data were collected from patients who underwent breast cancer surgery at a grade-III hospital in Sichuan Province between January 2012 and July 2022. First, the Mann-Whitney U test was used to screen for features with statistically significant differences (P < 0.05), and the features that passed this test were standardized with the Z-score method. A least absolute shrinkage and selection operator (LASSO) regression model was then fitted to select the most valuable features for predicting BCRL. The whole data set was randomly divided into a training set and a test set at a ratio of 7:3; the training set was used for model construction and the test set for selecting the optimal model. Nine classification models were developed: Random Forest (RF), Decision Tree (DT), Support Vector Machine (SVM), K-Nearest Neighbor (KNN), Extra Trees (ET), Stochastic Gradient Descent (SGD), Extreme Gradient Boosting (XGBoost), Light Gradient Boosting Machine (LightGBM) and Logistic Regression (LR). Finally, model performance was evaluated by accuracy, sensitivity, specificity, recall, precision, F-score and area under the curve (AUC), and the optimal model was selected according to its AUC value.
Results: A total of 670 patients were investigated, 469 in the modeling group and 201 in the validation group. A total of 175 patients (26.1%) had BCRL. The LASSO regression model selected the 14 most valuable features for predicting BCRL: body mass index (BMI), hypertension, axillary lymph node staging, type of operation, type of lymph node operation, operation on the dominant side, preoperative axillary lymph node puncture, number of positive lymph nodes, postoperative radiotherapy, endocrine therapy, and items from the Lymphedema Risk-Reduction Behavior Checklist (LRRB) such as "not ignoring upper limb edema", "avoiding vigorous exercise", "avoiding lifting heavy objects" and "avoiding fatigue of the affected limb". Across the nine models, the training-set metrics ranged as follows: accuracy (0.68-1), sensitivity (0.99-1), specificity (0.85-1), recall (0.99-1), precision (0.98-1) and F-score (0.46-1). The test-set ranges were: accuracy (0.88-0.76), sensitivity (0.77-0.52), specificity (0.76-0.91), recall (0.77-0.52), precision (0.2-0.5), F-score (0.28-0.51) and AUC (0.72-0.84). Overall, ET performed best in predicting BCRL, with accuracy 0.76, precision 0.5, sensitivity 0.52, specificity 0.91, recall 0.52, F-score 0.51 and AUC 0.84.
Conclusion: This study found that the machine learning-based ET model has favorable overall performance in terms of precision, accuracy, specificity and AUC value. It can accurately identify patients at high risk of BCRL and help nurses take targeted nursing measures early to prevent the occurrence of BCRL.
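The evaluation metrics named in the abstract can be illustrated with a minimal Python sketch. It assumes the standard binary-classification definitions (sensitivity and recall are the same quantity, and the F-score is taken to be F1); the abstract does not state which definitions the thesis used. The names `ConfusionCounts`, `classification_metrics` and `f_score` are hypothetical, and the confusion-matrix counts are invented for illustration rather than taken from the study. AUC is left out because it needs per-patient predicted scores, which the abstract does not provide.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ConfusionCounts:
    """Binary confusion-matrix counts (positive class = BCRL)."""
    tp: int  # true positives
    fp: int  # false positives
    tn: int  # true negatives
    fn: int  # false negatives


def _ratio(numerator: float, denominator: float) -> float:
    """Return numerator / denominator, or 0.0 when the denominator is zero."""
    return numerator / denominator if denominator else 0.0


def f_score(precision: float, recall: float) -> float:
    """F1 score: the harmonic mean of precision and recall."""
    return _ratio(2 * precision * recall, precision + recall)


def classification_metrics(c: ConfusionCounts) -> dict[str, float]:
    """Compute the threshold-based metrics listed in the abstract."""
    recall = _ratio(c.tp, c.tp + c.fn)
    precision = _ratio(c.tp, c.tp + c.fp)
    return {
        "accuracy": _ratio(c.tp + c.tn, c.tp + c.fp + c.tn + c.fn),
        "sensitivity": recall,  # identical to recall by definition
        "recall": recall,
        "specificity": _ratio(c.tn, c.tn + c.fp),
        "precision": precision,
        "f_score": f_score(precision, recall),
    }


# Hypothetical counts for illustration only; NOT data from the study.
example = ConfusionCounts(tp=40, fp=10, tn=140, fn=10)
metrics = classification_metrics(example)

# Consistency check on the ET figures, reading 0.50 as precision and
# 0.52 as recall (an interpretation of the reported values).
et_f_score = f_score(precision=0.50, recall=0.52)
```

Under that reading, `et_f_score` rounds to the F-score of 0.51 reported for the ET model, which suggests the figure of 0.5 is a precision value rather than a second accuracy value.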
Keywords/Search Tags: Breast Cancer, Lymphedema, Machine Learning, Prediction Model