
Study On Molecular Subtyping Of Gastric Cancer Based On Multi-omics Data

Posted on: 2024-03-09
Degree: Master
Type: Thesis
Country: China
Candidate: G Yao
Full Text: PDF
GTID: 2544307079491534
Subject: Applied statistics
Abstract/Summary:
This thesis studies molecular subtyping and survival analysis methods based on multi-omics data from gastric cancer patients. By mining the distinct molecular patterns in these data, patients are stratified into meaningful groups. In addition, features that are highly correlated with prognosis are identified and used to predict patient survival time, providing a reference for clinical treatment.

Data obtained from The Cancer Genome Atlas (TCGA) and Gene Expression Omnibus (GEO) databases are preprocessed: missing values are imputed with the optimal ordinal K-neighbors method, and all features are standardized to reduce scale differences across omics types. An autoencoder, optimized with the Adam algorithm, reduces the dimensionality of the high-dimensional data and yields a low-dimensional representation of the original features. These low-dimensional features are integrated by Similarity Network Fusion (SNF) into a patient-patient similarity network. Spectral clustering, an unsupervised learning method, is applied to this network and yields two gastric cancer subtypes. The log-rank test shows a significant difference in survival between the two subtypes, and Kaplan-Meier survival curves are plotted for the patients. Compared with subtyping after traditional Principal Component Analysis (PCA) dimensionality reduction, the autoencoder reveals the potential nonlinear subspace of the features, which helps improve the subsequent clustering. A subtype prediction model is trained with a Support Vector Machine (SVM), and the external GEO dataset is used for independent validation. The predicted subtypes still show significant survival differences, which supports the effectiveness of the molecular subtyping algorithm proposed in this thesis.

To support clinical treatment based on the prognostic risk of patients, a downstream analysis of the gastric cancer subtypes is carried out. Because multi-omics data are high-dimensional with small sample sizes, Lasso Cox regression is used to predict survival time within each subtype, and prediction with and without molecular subtyping is compared: the models for subtype I and subtype II fit better than the model for the whole sample. Weighted Correlation Network Analysis (WGCNA) is then used to analyze the relationship between gene sets and the cancer subtypes, to draw the regulatory network among the genes in each gene set, and to identify the key regulatory genes. Finally, pathway enrichment analysis is performed on the key genes to explore the functions and pathways associated with the genes that differ between subtypes.
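The log-rank comparison of two subtypes mentioned in the abstract can be illustrated with a minimal sketch. This is not the thesis's implementation: the function names `kaplan_meier` and `logrank_test` and the toy follow-up data are assumptions made for illustration, and only the standard two-group log-rank statistic (chi-square with one degree of freedom) is shown.

```python
import math


def kaplan_meier(times, events):
    """Kaplan-Meier estimate: list of (event time, survival probability)."""
    surv = 1.0
    curve = []
    for t in sorted({u for u, e in zip(times, events) if e}):
        at_risk = sum(1 for u in times if u >= t)
        deaths = sum(1 for u, e in zip(times, events) if u == t and e)
        surv *= 1.0 - deaths / at_risk
        curve.append((t, surv))
    return curve


def logrank_test(times_a, events_a, times_b, events_b):
    """Two-group log-rank test; returns (chi-square statistic, p-value)."""
    times_a, times_b = list(times_a), list(times_b)
    events_a, events_b = list(events_a), list(events_b)
    pooled = zip(times_a + times_b, events_a + events_b)
    event_times = sorted({u for u, e in pooled if e})

    obs_minus_exp = 0.0  # observed minus expected deaths in group A
    variance = 0.0       # hypergeometric variance summed over event times
    for t in event_times:
        n_a = sum(1 for u in times_a if u >= t)
        n_b = sum(1 for u in times_b if u >= t)
        d_a = sum(1 for u, e in zip(times_a, events_a) if u == t and e)
        d_b = sum(1 for u, e in zip(times_b, events_b) if u == t and e)
        n, d = n_a + n_b, d_a + d_b
        obs_minus_exp += d_a - d * n_a / n
        if n > 1:
            variance += d * (n_a / n) * (n_b / n) * (n - d) / (n - 1)

    if variance == 0.0:
        return 0.0, 1.0
    chi2 = obs_minus_exp ** 2 / variance
    # Survival function of a chi-square variable with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p_value


if __name__ == "__main__":
    # Hypothetical follow-up times (months) and event flags (1 = death,
    # 0 = censored) for two subtypes; not data from the thesis.
    subtype1_t, subtype1_e = [5, 8, 12, 14, 20], [1, 1, 1, 0, 1]
    subtype2_t, subtype2_e = [18, 25, 30, 34, 40], [1, 0, 1, 1, 0]
    print(kaplan_meier(subtype1_t, subtype1_e))
    print(logrank_test(subtype1_t, subtype1_e, subtype2_t, subtype2_e))
```

In practice the two groups would be the cluster labels produced by spectral clustering on the fused similarity network, and a small p-value would indicate that the subtypes differ in survival.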
Keywords/Search Tags: Autoencoder, SNF, Survival prediction, WGCNA, Log-rank test, Subtype