
The Research Of Cancer Prognosis Survival Analysis Based On Multi-omics Data Integration

Posted on: 2024-07-16  Degree: Master  Type: Thesis
Country: China  Candidate: R P Wu  Full Text: PDF
GTID: 2544307148992759  Subject: Mathematics
Abstract/Summary:
Multi-omics data integration links molecular features across different biological levels and is key to improving cancer diagnosis, treatment, and the identification of cancer biomarkers. Advances in deep learning provide powerful tools for such integration. This thesis proposes survival analysis models based on deep learning and multi-omics data integration, using mRNA expression data, miRNA expression data, Copy Number Variation (CNV) data, and clinical data, and achieves more accurate cancer survival prediction. The main research work is as follows:

(1) A multi-omics early integration model based on an autoencoder and XGBoost-mRMR feature selection is proposed to alleviate the feature redundancy and "super matrix" issues in multi-omics data integration. Based on the early integration strategy, three survival analysis models are proposed: AE-mRMR-RSF, AE-mRMR-GBM, and AE-mRMR-Cox-EN, which combine the integrated features with Random Survival Forest (RSF), Gradient Boosting Machine (GBM), and Cox Elastic Net (Cox-EN), respectively. Experimental results show that, on datasets for 8 cancer types, the proposed models outperform models trained on single-omics data as well as baseline survival analysis models. Biological function analysis shows that the genes MPPED2 and PSMB9, which receive larger weights, distinguish high-risk from low-risk patients and are effective prognostic biomarkers for grade III Low-Grade Glioma (LGG). These results indicate that the proposed models not only extract gene features with an important impact on survival probability, but also significantly improve survival analysis performance through multi-omics data integration.

(2) A survival analysis model, VAESCox, based on multi-omics data and a sparse variational autoencoder is proposed. It combines the variational autoencoder with sparse coding and the Cox survival analysis model to mitigate the "curse of dimensionality" and the overfitting problem in multi-omics data integration. Experiments are designed to verify the performance of the model. The results show that the model achieves better performance when integrating mRNA and miRNA data, and that its performance advantage remains significant when CNV data are further added to the integration. This indicates that the model is well suited to integrating different types of omics data.

(3) A multi-omics integration model, GCN-VCDN-Cox, is proposed for survival analysis of cancer prognosis. It uses a Graph Convolutional Network (GCN) and a View Correlation Discovery Network (VCDN) to explore the sample correlations described by similarity networks and the cross-omics feature correlations in the hidden-layer feature space. Comparative and ablation experiments are designed, and good results are achieved. GO pathway enrichment analysis, protein interaction network analysis, and other biological analyses show that 7 significantly highly expressed genes, COL1A1, COL6A1, GREM1, IRX1, LAMA3, POSTN, and SULF1, are specific to the pathogenesis of lung adenocarcinoma (LUAD). The results show that the model not only improves survival analysis performance through multi-omics data integration, but also identifies important biomarkers related to cancer prognosis.
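
The abstract does not give the loss function or the evaluation metric of the Cox-based models. As a rough illustration only, the sketch below shows the standard negative Cox partial log-likelihood (the usual training objective when a network outputs a risk score) and Harrell's concordance index (a common measure of survival prediction performance). The function names, the NumPy implementation, and the toy data are assumptions made for this example; they are not taken from the thesis.

```python
import numpy as np


def cox_neg_partial_log_likelihood(risk, time, event):
    """Negative Cox partial log-likelihood, averaged over observed events.

    risk  : predicted log-risk score per sample (higher = worse prognosis)
    time  : observed survival or censoring time per sample
    event : 1 if the event (death) was observed, 0 if censored
    Ties are handled Breslow-style: the risk set of sample i contains
    every sample j with time[j] >= time[i].
    """
    risk = np.asarray(risk, dtype=float)
    time = np.asarray(time, dtype=float)
    event = np.asarray(event, dtype=bool)
    if not event.any():
        return 0.0  # no observed events, so nothing contributes to the likelihood
    # at_risk[i, j] is True when sample j is still at risk at time[i]
    at_risk = time[None, :] >= time[:, None]
    shift = risk.max()  # subtract the maximum for numerical stability
    log_denom = shift + np.log((at_risk * np.exp(risk - shift)).sum(axis=1))
    return float(-(risk[event] - log_denom[event]).mean())


def concordance_index(risk, time, event):
    """Harrell's C-index: fraction of comparable pairs ranked correctly."""
    risk = np.asarray(risk, dtype=float)
    time = np.asarray(time, dtype=float)
    event = np.asarray(event, dtype=bool)
    concordant, comparable = 0.0, 0
    for i in range(len(time)):
        if not event[i]:
            continue  # a pair is comparable only if the earlier time is an event
        for j in range(len(time)):
            if time[j] > time[i]:
                comparable += 1
                if risk[i] > risk[j]:
                    concordant += 1.0
                elif risk[i] == risk[j]:
                    concordant += 0.5
    return concordant / comparable if comparable else float("nan")


# Toy data (hypothetical): four patients, one censored.
toy_time = [5.0, 3.0, 8.0, 1.0]
toy_event = [1, 1, 0, 1]
toy_risk = [0.2, 1.0, -0.5, 2.0]  # higher score for shorter survival

toy_loss = cox_neg_partial_log_likelihood(toy_risk, toy_time, toy_event)
toy_cindex = concordance_index(toy_risk, toy_time, toy_event)
```

In a setup of this kind, the risk scores would come from the final layer of the integration model, and the loss would be minimized jointly with any reconstruction or sparsity terms; how the thesis weights such terms is not stated in the abstract.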
Keywords/Search Tags:multi-omics data integration, cancer prognosis, survival analysis, autoencoder, graph convolutional network