| Cancer is a heterogeneous disease with diverse pathogenesis and clinical features. Based on differences in patient prognosis and treatment, a single cancer can be subdivided into multiple subtypes. Cancer subtype prediction is therefore important for accurate diagnosis and personalized treatment. In recent years, the rapid development of high-throughput sequencing technology has produced a large amount of multi-omics data, making it possible to determine a patient's subtype from multi-omics data and then formulate a personalized treatment plan. Although researchers have developed several models for cancer subtype prediction based on multi-omics data, prediction accuracy still has considerable room for improvement because of the high dimensionality and diversity of such data. In addition, many existing models suffer from weak generalization ability and poor interpretability. There is therefore an urgent need for a model that can effectively fuse multi-omics data for cancer subtype prediction. In this thesis, we propose a Multi-Omics integration model based on Evidential Deep Learning (MOEDL) to predict cancer subtypes. First, the four omics data types (gene expression, miRNA expression, DNA methylation, and protein expression) are preprocessed by K-nearest-neighbor imputation of missing values and data standardization, followed by feature selection based on mutual information. Then, a deep neural network is constructed on the processed data to obtain initial classification results and uncertainty estimates for the samples. Finally, Dempster-Shafer evidence theory is used to fuse the initial classification results obtained from the different omics data into the final classification result and an overall uncertainty. We also introduce weighted random sampling and Dropout to prevent overfitting. To evaluate the predictive performance of 
MOEDL, we performed 10-fold cross-validation on three cancer datasets and compared it with the block PLSDA, block sPLSDA, MOGONET, and MoGCN models. The comparison shows that MOEDL achieves the best predictive performance on all three cancer datasets. We also analyze the performance of MOEDL when using one, two, three, and all four omics data types, and find that the model performs better when based on multi-omics data. We further verify the effectiveness of the MOEDL components through ablation experiments. In addition, the robustness and rationality of the MOEDL model are verified by adding perturbations to the omics data. Finally, we identify cancer biomarkers using connection weight sensitivity analysis and explore the prognostic value of these biomarkers. |
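
The fusion step described in the abstract can be illustrated with a minimal sketch. The abstract does not give the exact combination formula, so the code below assumes the reduced Dempster-Shafer rule commonly used in evidential multi-view classification: each omics network outputs non-negative evidence per class, the evidence is converted into belief masses plus an uncertainty mass, and two such opinions are combined after discounting their conflict. The function names and the example evidence values are hypothetical and chosen only for illustration.

```python
import numpy as np


def evidence_to_opinion(evidence):
    """Convert non-negative class evidence into (belief, uncertainty).

    Assumes the usual evidential deep learning parameterization:
    Dirichlet parameters alpha = evidence + 1, strength S = sum(alpha),
    belief_k = evidence_k / S, and uncertainty = K / S.
    """
    evidence = np.asarray(evidence, dtype=float)
    num_classes = evidence.size
    strength = (evidence + 1.0).sum()
    return evidence / strength, num_classes / strength


def combine_opinions(b1, u1, b2, u2):
    """Fuse two opinions with the reduced Dempster-Shafer rule (assumed form)."""
    # Conflict: belief mass the two sources assign to different classes.
    conflict = np.outer(b1, b2).sum() - np.dot(b1, b2)
    scale = 1.0 / (1.0 - conflict)
    belief = scale * (b1 * b2 + b1 * u2 + b2 * u1)
    uncertainty = scale * u1 * u2
    return belief, uncertainty


def fuse_all(opinions):
    """Fold a list of (belief, uncertainty) pairs into a single opinion."""
    belief, uncertainty = opinions[0]
    for next_belief, next_uncertainty in opinions[1:]:
        belief, uncertainty = combine_opinions(
            belief, uncertainty, next_belief, next_uncertainty
        )
    return belief, uncertainty


if __name__ == "__main__":
    # Hypothetical evidence for three subtypes from two omics data types.
    opinions = [evidence_to_opinion(e) for e in ([8, 1, 0], [5, 0, 1])]
    fused_belief, fused_uncertainty = fuse_all(opinions)
    print("fused belief:", fused_belief)
    print("fused uncertainty:", fused_uncertainty)
```

Under this rule the fused belief masses and the fused uncertainty still sum to one, and when the sources agree the overall uncertainty falls below that of either source, which matches the role of the "overall uncertainty" mentioned in the abstract.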