
Research On Cancer Classification Based On Deep Fusion Of Multi-omics Data

Posted on: 2024-09-22
Degree: Master
Type: Thesis
Country: China
Candidate: Y T Zhong
Full Text: PDF
GTID: 2544306938959139
Subject: Computer application technology
Abstract/Summary:
Cancer, also known as malignant tumor, is a complex disease that poses a serious threat to human health and life. Cancer is extremely heterogeneous, and different types of cancer have different clinical outcomes. To help doctors diagnose cancer and reduce its impact on human health, it is of great research value to make full use of human omics data to classify cancer. With the continuous emergence and adoption of new omics sequencing technologies, a large amount of omics data has become available as a data source for cancer classification. In the past, researchers mostly used a single type of omics data for cancer classification. However, single-omics data has limitations for early screening and diagnosis: it reflects changes in only one aspect of cancer samples and may lose information, so it can fail to capture representative information and can bias the understanding of cancer progression. By contrast, integrated analysis of multi-omics data can reveal knowledge and patterns about cancer from multiple perspectives and levels, which may make clinical diagnosis more accurate. Integrating multiple omics data for cancer classification has therefore become an important research direction.

At present, effectively integrating multiple omics data for cancer classification still faces major challenges: the high dimensionality of omics data and the scale differences between omics types; heavy noise in omics data; neglect of the importance of feature differences between samples and of the unique features that different omics types present in high-level feature space; and the difficulty of capturing the correlation and complementarity between different types of omics data. To address these problems, this thesis proposes two deep fusion models of multi-omics data for cancer classification. The main research contents are as follows:

(1) For cancer classification, MODILM (Multi-Omics Data Integration Learning Model) is developed to acquire more significant complementary information from multi-omics data. Specifically, MODILM first uses cosine similarity to construct a similarity network for each type of omics data. It then uses a graph attention network to learn, from each similarity network, the sample-specific features and the internal correlation features of that omics type, unifies them into a new feature space, and uses a multilayer perceptron network to further strengthen and extract high-level omics-specific features. Finally, MODILM uses a view correlation discovery network to fuse the high-level specific features extracted from each omics type, learning cross-omics features in label space and providing unique class-level features for cancer classification. MODILM differs from existing models in two respects: it accounts for the importance of feature differences between samples through the graph attention network, and it better captures the correlation and complementarity between different omics types through the view correlation discovery network. Comprehensive experiments on four benchmark datasets, LGG-2, LGG-4, SKCM, and LUSC, show that MODILM outperforms the baseline models and improves cancer classification performance.

(2) To integrate multiple omics data for cancer classification, MOGSAM (Multi-Omics data integration model based on Graph Sample and Aggregate and Multi-attention) is proposed. Specifically, MOGSAM first uses cosine similarity to construct a similarity network for each type of omics data. It then uses a graph sample and aggregate network for feature extraction, learning from each similarity network the sample-specific features and the internal correlation features of that omics type. Next, a new multi-attention feature fusion method is designed to fuse the multi-omics features. Finally, the fused features are input into a multilayer perceptron network for label prediction. Unlike other models, MOGSAM learns aggregators during feature extraction rather than a separate feature representation for each node, which can improve the model's generalization ability and flexibility. Moreover, the multi-attention feature fusion method can weigh the importance of different omics features and fuse multiple omics data effectively. Comprehensive experiments on four benchmark datasets, LGG-2, LGG-4, SKCM, and LUSC, show that MOGSAM outperforms the baseline models and effectively improves cancer classification performance.
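
Both models start by building a cosine-similarity network for each omics type. The abstract does not say how edges are selected, so the following is only a minimal sketch under stated assumptions: the function names, the fixed `threshold` cutoff, and the self-loops are illustrative choices, not the thesis's actual procedure.

```python
import math


def cosine_similarity(u, v):
    """Cosine similarity of two equal-length feature vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0  # treat an all-zero profile as dissimilar to everything
    return dot / (norm_u * norm_v)


def similarity_network(samples, threshold=0.8):
    """Build a symmetric 0/1 adjacency matrix over samples of ONE omics type.

    Assumption: two samples are connected when their cosine similarity is
    at least `threshold`; every sample also gets a self-loop.
    """
    n = len(samples)
    adjacency = [[0] * n for _ in range(n)]
    for i in range(n):
        adjacency[i][i] = 1  # self-loop (assumed, common in graph networks)
        for j in range(i + 1, n):
            if cosine_similarity(samples[i], samples[j]) >= threshold:
                adjacency[i][j] = 1
                adjacency[j][i] = 1
    return adjacency


# Hypothetical toy data: three samples with three features each.
toy_samples = [
    [1.0, 0.0, 1.0],
    [2.0, 0.0, 2.0],  # same direction as the first sample
    [0.0, 1.0, 0.0],  # orthogonal to the first two
]
toy_network = similarity_network(toy_samples, threshold=0.8)
```

In this toy case the first two samples are linked and the third stays isolated apart from its self-loop. One such network would be built per omics type and then passed to the graph-based feature extractor.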
Keywords/Search Tags:Cancer classification, Multi-omics data integration, Deep learning, Graph neural network