| Breast cancer (BC) is one of the most common malignant tumors and a serious threat to the physical and mental health of women. As the incidence of BC rises, accurate prognosis prediction for BC patients has become a central concern of cancer research. The vast majority of BC-related deaths are caused by distant metastasis of cancer cells, so predicting distant metastasis is an important part of BC prognosis prediction. Although significant progress has been made in BC research, it remains difficult to predict the risk of distant metastasis and recurrence in BC patients. New biomarkers beyond the traditional prognostic factors are therefore urgently needed to predict distant metastasis, which would help clinicians advance precision therapy. Body composition, including muscle and adipose tissue, is increasingly recognized as a key predictor of long-term prognosis in BC, and advances in computed tomography (CT) imaging have made it possible to assess body composition. With the rapid development of medical technology, the volume and variety of medical data continue to grow. Introducing data from multiple modalities into BC distant metastasis prediction can greatly improve prediction performance, so how to integrate multimodal data effectively is a pressing problem in the current study of BC distant metastasis. To address these problems, this paper makes the following contributions. First, for the missing values in the BC clinical dataset, this paper proposes a class-centered multiple mixed imputation method, which first determines an imputation threshold for each class by calculating Euclidean distances based on the class centers, then imputes the missing values multiple times using various imputation models, and finally selects the optimal imputation for each sample with missing values by using the imputation threshold. This method takes into account the 
uncertainty in both the data and the choice of model, and compensates for the shortcomings of single imputation and multiple imputation to produce better imputation results. The experimental results show that the proposed imputation method outperforms single-model imputation at every missing rate. Second, for CT image data, this paper proposes a module based on a multi-scale hybrid attention mechanism and introduces it into the bottleneck residual block of the deep Residual Network (ResNet), yielding a new and efficient network model. The experiments compare the model with three original ResNet models and two ResNet models with different attention mechanisms. The proposed attention module significantly improves the prediction performance of the model, which indicates that introducing the attention module strengthens the model's feature representation ability. Furthermore, the proposed model is visually interpreted using gradient-weighted class activation mapping (Grad-CAM), revealing that the erector spinae muscle region is the region of interest during network learning. Finally, for the fusion of three data modalities, namely clinical data, CT image data, and body composition radiomics data, this paper adopts a multimodal fusion method that combines feature-level and decision-level fusion to predict BC distant metastasis: it first employs discriminant correlation analysis for feature-level fusion, and then proposes an improved Stacking model to achieve decision-level fusion. Compared with feature-level fusion alone and decision-level fusion alone, the proposed fusion method achieved the best result on all four evaluation metrics, and its effectiveness was further verified by comparison experiments with six baseline models. In addition, comparing the prediction results of unimodal and multimodal data shows that multimodal data can 
significantly improve the prediction performance of the model, indicating that fusing multimodal data provides the prediction model with more useful information. This study highlights the value of multimodal data fusion models in predicting BC distant metastasis and provides an applicable method for subsequent research in this area. |
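
The class-centered imputation steps summarized in the abstract can be illustrated with a minimal sketch. It is not the paper's implementation: the abstract does not give the exact threshold definition, the imputation models, or the selection rule, so everything below is an assumption chosen for illustration. Here the threshold for a class is taken to be the mean Euclidean distance of its complete samples to the class center, the candidate imputers are the class median, the nearest complete neighbor in the same class, and the global mean, and the chosen candidate is the one closest to the class center, with a fallback to the class center when no candidate falls within the threshold. The function name `class_centered_impute` and the toy data are hypothetical.

```python
import math
from statistics import mean, median


def dist(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def class_centered_impute(rows, labels):
    """Fill None entries in rows using several candidate imputers per sample.

    Assumes every class has at least one complete sample (illustrative only).
    """
    n_feat = len(rows[0])
    complete = [(r, y) for r, y in zip(rows, labels) if None not in r]
    # Assumed candidate 3: mean of each feature over all complete samples.
    global_mean = [mean(r[j] for r, _ in complete) for j in range(n_feat)]

    result = []
    for row, y in zip(rows, labels):
        if None not in row:
            result.append(list(row))
            continue
        members = [r for r, c in complete if c == y]
        center = [mean(r[j] for r in members) for j in range(n_feat)]
        # Assumed threshold: mean distance of complete class members to the center.
        threshold = mean(dist(r, center) for r in members)

        # Assumed candidate 1: per-feature class median.
        class_median = [median(r[j] for r in members) for j in range(n_feat)]
        # Assumed candidate 2: nearest complete class member on observed features.
        observed = [j for j, v in enumerate(row) if v is not None]
        nearest = min(
            members,
            key=lambda r: dist([r[j] for j in observed], [row[j] for j in observed]),
        )

        # Impute the sample once per candidate model.
        candidates = [
            [src[j] if v is None else v for j, v in enumerate(row)]
            for src in (class_median, nearest, global_mean)
        ]
        # Assumed selection rule: keep the candidate closest to the class center,
        # and fall back to the center itself if it lies outside the threshold.
        best = min(candidates, key=lambda f: dist(f, center))
        if dist(best, center) > threshold:
            best = [center[j] if v is None else v for j, v in enumerate(row)]
        result.append(best)
    return result


# Toy data: two well-separated classes, the last two samples have a missing value.
rows = [
    [1.0, 2.0], [1.2, 2.2], [0.8, 1.8],
    [5.0, 6.0], [5.2, 6.2], [4.8, 5.8],
    [1.1, None], [None, 6.1],
]
labels = [0, 0, 0, 1, 1, 1, 0, 1]
result = class_centered_impute(rows, labels)
```

Observed values are never overwritten; only the `None` entries are filled. A real study would replace the three toy imputers with the actual imputation models and tune the threshold definition to the method described in the full paper.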