| Objective: To investigate the feasibility and accuracy of predicting pathological complete remission (pCR) with a classification model based on changes in ultrasound-measured tumor size after neoadjuvant chemotherapy, and to analyze the factors influencing pCR. Methods: Clinicopathological data were collected for breast cancer patients who underwent neoadjuvant chemotherapy at the First Hospital of Jilin University from January 2011 to September 2016. A total of 351 patients were included. After each cycle of chemotherapy and before the next, every patient underwent a breast ultrasound examination, and the tumor size was recorded. The tumor area, calculated as the product of two diameters (summed over all lesions when there were multiple lesions), served as the feature, and pCR served as the class label. Six classification algorithms commonly used in machine learning (SVM, KNN, NBayes, DTree, RF, XGBoost) were used to model and predict pCR, all with default parameters. For every molecular subtype, the measured tumor areas after the 6 cycles of chemotherapy were used as the feature inputs of the 6 classifiers, with pCR as the class label for training; 5-fold cross-validation was performed, and a confusion matrix with four indicators (Sn, Sp, Acc, Avc) was used to evaluate model performance. An exhaustive search then identified the best feature combination for each classifier within each molecular subtype, and the results were ranked to obtain the best classification accuracy and feature combination for each subtype. The modeling was then repeated with 3 measured tumor areas before chemotherapy as the features and pCR as the class label, again using 5-fold cross-validation and a confusion matrix to evaluate model performance. SPSS (version 21.0) was used for univariate analysis of pCR with the chi-square test, or Fisher's exact test when necessary, and binary logistic regression was used for multivariate
analysis. Results: The overall pCR rate in this study was 22.5% (79/351). The pCR rate was 22.97% (17/74) for the Luminal B (Her-2 positive) type, 9.74% (15/154) for the Luminal B (Her-2 negative) type, 36.51% (23/63) for the Her-2 overexpression type, and 40% (24/60) for the triple-negative type. Univariate analysis showed that T stage, molecular subtype, ER, PR, Her-2, and Ki-67 were significantly associated with pCR (p<0.05). Multivariate analysis showed that PR and Ki-67 were independent influencing factors of pCR: PR-negative patients were more likely to achieve pCR (OR=2.795, 95% CI: 1.227-6.363, p=0.014), and when Ki-67 exceeded 40%, higher Ki-67 was associated with a greater likelihood of pCR (OR=3.561, 95% CI: 1.496-8.480, p=0.004). T4 tumors were less likely to achieve pCR than T1 tumors (OR=0.093, 95% CI: 0.011-0.806, p=0.031). When pCR was modeled with the six post-chemotherapy tumor areas as features, the RF, KNN, XGBoost, and XGBoost/RF classifiers predicted pCR well for the Luminal B (Her-2 positive), Luminal B (Her-2 negative), Her-2 overexpression, and triple-negative types, respectively, with accuracies (Acc) of 81.07%, 89.63%, 66.79%, and 82.98%. After screening by exhaustive search, the best classification algorithms for predicting pCR in the four molecular subtypes were SVM, RF, DTree, and RF, respectively, with Acc of 83.75%, 90.92%, 74.87%, and 84.8%. When pCR was modeled with the three tumor areas before chemotherapy as features, the XGBoost, KNN, KNN, and DTree/XGBoost classifiers performed best for the Luminal B (Her-2 positive), Luminal B (Her-2 negative), Her-2 overexpression, and triple-negative types, respectively, with Acc of 74.46%, 90.28%, 74.62%, and 76.85%. Conclusions: 1. With tumor area after NAC as the feature and pCR as the class label, using classification algorithms to build a model that predicts pCR is not only feasible but also highly accurate, and the best classification algorithm for each molecular subtype is
different. 2. Breast ultrasound has important clinical value in predicting pCR. 3. The pCR rate differed significantly among molecular subtypes and was higher in triple-negative and Her-2 overexpressing breast cancers. 4. PR and Ki-67 were independent influencing factors of pCR, and the pCR rate was higher in PR(-) breast cancers and in those with high Ki-67 expression. |
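
The evaluation steps named in the Methods can be illustrated with a minimal sketch. It assumes the usual definitions of sensitivity (Sn), specificity (Sp), and accuracy (Acc) from a binary confusion matrix; Avc is left out because the abstract does not define it. The function names, the feature names, and the example confusion-matrix counts are hypothetical and are not taken from the study; only the pCR counts per subtype come from the Results above. The sketch also enumerates the non-empty feature subsets that an exhaustive search over six per-cycle tumor areas would have to evaluate.

```python
from itertools import combinations

# pCR counts per molecular subtype as reported in the Results: (pCR, total).
PCR_COUNTS = {
    "Luminal B (Her-2 positive)": (17, 74),
    "Luminal B (Her-2 negative)": (15, 154),
    "Her-2 overexpression": (23, 63),
    "Triple negative": (24, 60),
}

# Hypothetical feature names: one tumor area per chemotherapy cycle.
CYCLE_AREAS = [f"area_cycle_{i}" for i in range(1, 7)]


def pcr_rate(pcr, total):
    """Return the pCR rate as a percentage."""
    return 100.0 * pcr / total


def confusion_metrics(tp, fn, tn, fp):
    """Return (Sn, Sp, Acc) from binary confusion-matrix counts.

    Assumes the standard definitions:
    Sn = TP / (TP + FN), Sp = TN / (TN + FP), Acc = (TP + TN) / N.
    """
    sn = tp / (tp + fn)
    sp = tn / (tn + fp)
    acc = (tp + tn) / (tp + fn + tn + fp)
    return sn, sp, acc


def feature_subsets(features):
    """Enumerate every non-empty subset of the features (exhaustive search space)."""
    return [
        subset
        for size in range(1, len(features) + 1)
        for subset in combinations(features, size)
    ]


if __name__ == "__main__":
    for subtype, (pcr, total) in PCR_COUNTS.items():
        print(f"{subtype}: {pcr_rate(pcr, total):.2f}%")

    # Hypothetical confusion-matrix counts, for illustration only.
    sn, sp, acc = confusion_metrics(tp=8, fn=2, tn=35, fp=5)
    print(f"Sn={sn:.3f} Sp={sp:.3f} Acc={acc:.3f}")

    print(len(feature_subsets(CYCLE_AREAS)), "candidate feature combinations")
```

In an exhaustive search of this kind, each subset would be scored by cross-validation for each classifier and subtype, and the best-scoring combination kept; the number of subsets doubles with every added feature, which stays manageable only because there are six.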