| Objective: To explore the predictive value of preoperative ultrasound radiomics features for the molecular classification of breast cancer using several machine learning algorithms. Methods: The clinical and ultrasound data of 259 breast cancer patients who first presented to our hospital between October 2020 and September 2022 were analyzed. Taking postoperative pathological and immunohistochemical results as the gold standard, breast cancers were divided into four molecular subtypes: Luminal A, Luminal B, human epidermal growth factor receptor-2 (HER-2) overexpression, and triple-negative (TN). Differences in clinical and pathological characteristics among the molecular subtypes were analyzed by one-way ANOVA, the chi-square test, or Fisher’s exact test. The four-class problem was converted into four binary classifications to predict the molecular subtype: Luminal A vs. non-Luminal A, Luminal B vs. non-Luminal B, HER-2 overexpression vs. non-HER-2 overexpression, and TN vs. non-TN. For each binary classification, patients were randomly divided into a training group (n=181) and a validation group (n=78) by stratified sampling at a 7:3 ratio. The two-dimensional ultrasound image showing the largest diameter of the breast mass was imported into 3D Slicer to delineate the region of interest (ROI), and the "Radiomics" extension package was used to extract radiomics features from the ROI. The extracted ultrasound radiomics features were first standardized, and then variance filtering, correlation analysis, and LASSO regression were applied for feature selection and dimensionality reduction. Finally, three machine learning algorithms, naive Bayes (NB), logistic regression (LR), and support vector machine (SVM), were used to build radiomics models for predicting the molecular subtype. The predictive performance of each model was evaluated with the receiver operating characteristic (ROC) curve, and the areas under the ROC curve (AUC) of the three classifiers were compared with the DeLong test. Results: 1. Molecular typing of 259
breast cancer patients: 54 cases of Luminal A, 116 cases of Luminal B, 43 cases of HER-2 overexpression, and 46 cases of triple-negative. There were no significant differences in age, pathological type, or axillary lymph node metastasis status among the four molecular subtypes (P > 0.05), but there was a significant difference in histological grade (P < 0.05). 2. In the training group, 6, 6, 8, and 7 optimal radiomics features were selected for the Luminal A vs. non-Luminal A, Luminal B vs. non-Luminal B, HER-2 overexpression vs. non-HER-2 overexpression, and TN vs. non-TN models, respectively. 3. In the validation group, the AUC values of the LR, NB, and SVM classifiers were 0.763, 0.685, and 0.670 for the Luminal A prediction model; 0.664, 0.704, and 0.702 for the Luminal B prediction model; 0.833, 0.740, and 0.806 for the HER-2 overexpression prediction model; and 0.832, 0.650, and 0.868 for the TN prediction model, respectively. 4. The DeLong test showed that, for triple-negative breast cancer, the differences in AUC between LR and NB and between SVM and NB were statistically significant (P < 0.05), whereas the differences in AUC among the classifiers for the remaining molecular subtypes were not statistically significant (P > 0.05). Conclusion: Preoperative ultrasound radiomics features have some predictive value for the molecular subtype of breast cancer, which may help clinicians identify the molecular subtype non-invasively before surgery and plan treatment accordingly. Apart from triple-negative breast cancer, the three classifiers (LR, NB, and SVM) did not differ significantly in predictive performance across molecular subtypes, and the LR model performed better overall. |
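
The feature-selection and modeling steps described in the Methods can be illustrated with a minimal sketch. It assumes scikit-learn as the toolkit (the abstract does not say which software was used for modeling) and runs on synthetic data, not the study's radiomics features. The correlation cutoff of 0.9, the 5-fold cross-validation for LASSO, and the default classifier hyperparameters are all illustrative assumptions. The DeLong comparison is left out because scikit-learn does not provide it.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.feature_selection import VarianceThreshold
from sklearn.linear_model import LassoCV, LogisticRegression
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import GaussianNB
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Synthetic stand-in for a radiomics feature table (ASSUMPTION: not study data).
# 259 "patients", 100 "features", roughly 20% positives for one binary task.
X, y = make_classification(n_samples=259, n_features=100, n_informative=8,
                           n_redundant=20, n_clusters_per_class=1,
                           weights=[0.8, 0.2], random_state=0)

# 7:3 stratified split; with 259 samples this gives 181 / 78.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, stratify=y, random_state=0)

# 1. Standardize using training-set statistics only.
scaler = StandardScaler().fit(X_train)
Xtr, Xte = scaler.transform(X_train), scaler.transform(X_test)

# 2. Variance filter: after z-scoring, only constant columns have zero variance.
vt = VarianceThreshold(threshold=0.0).fit(Xtr)
Xtr, Xte = vt.transform(Xtr), vt.transform(Xte)

# 3. Correlation filter (ASSUMED cutoff |r| > 0.9): greedily keep a column only
#    if it is not highly correlated with any column already kept.
corr = np.abs(np.corrcoef(Xtr, rowvar=False))
keep = []
for j in range(Xtr.shape[1]):
    if all(corr[j, k] <= 0.9 for k in keep):
        keep.append(j)
Xtr, Xte = Xtr[:, keep], Xte[:, keep]

# 4. LASSO with cross-validated penalty; keep features with nonzero coefficients.
lasso = LassoCV(cv=5, max_iter=10000, random_state=0).fit(Xtr, y_train)
selected = np.flatnonzero(lasso.coef_)
if selected.size == 0:  # guard for the degenerate case where LASSO drops everything
    selected = np.arange(Xtr.shape[1])
Xtr, Xte = Xtr[:, selected], Xte[:, selected]

# 5. Fit the three classifiers and score them on the held-out split.
models = {"LR": LogisticRegression(max_iter=1000), "NB": GaussianNB(), "SVM": SVC()}
aucs = {}
for name, model in models.items():
    model.fit(Xtr, y_train)
    if name == "SVM":
        scores = model.decision_function(Xte)  # signed margin is a valid ranking score
    else:
        scores = model.predict_proba(Xte)[:, 1]
    aucs[name] = roc_auc_score(y_test, scores)

for name, auc in aucs.items():
    print(f"{name}: AUC = {auc:.3f}")
```

In this sketch every preprocessing step is fitted on the training split only and then applied to the validation split, so the reported AUC is not inflated by information leaking from the held-out cases. The same procedure would be repeated once per binary task (Luminal A, Luminal B, HER-2 overexpression, TN).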