
Research On Tumor Stratification Methods Based On Multi-omics Data

Posted on: 2024-06-20
Degree: Master
Type: Thesis
Country: China
Candidate: Z S Sun
GTID: 2554306923488794
Subject: Electronic information
Abstract/Summary:
In recent years, high-throughput sequencing technology has generated large amounts of omics data, bringing new opportunities for studying the pathogenesis, diagnosis, and treatment of cancer. One of the major problems in cancer research is tumor stratification, which is crucial to cancer diagnosis and to the development of individualized treatments. Tumor stratification uses multiple types of cancer omics data to group patients into cancer types that share the same biological characteristics, providing a valuable basis for understanding the underlying mechanisms and treatment of cancer. However, the imperfection and high cost of sequencing technology leave omics data high-dimensional, small in sample size, and noisy. The main problem addressed in this thesis is therefore how to use multi-omics data effectively to identify cancer types with the same biological characteristics more accurately. The research work of this thesis is as follows:

(1) To address the high dimensionality, small sample size, and high noise of tumor data, a network-embedded tumor stratification method based on multi-omics data is proposed. In this method, samples are pre-grouped to reduce noise interference, and a network embedding algorithm integrates DNA methylation data, mRNA expression data, and the PPI (protein-protein interaction) network to obtain network topology information. Finally, the method is applied to feature extraction and patient prediction on multiple cancer data sets, providing further support for classifying patients with common characteristics into the same cancer type.

(2) A random forest-based network-embedded tumor stratification method is proposed for cancer patients whose disease shows highly heterogeneous pathogenesis and other characteristics. The model constructs gene networks by integrating omics data from heterogeneous cancer samples. struc2vec is used to capture gene pairs that are far apart in the network but have similar structures, in order to learn the network topology characteristics of genes. The interference of specific genes is then reduced by clustering. Finally, a machine learning algorithm is used to predict cancer patients and classify patients with common characteristics into the same cancer type. Experimental results on a variety of cancer data sets show that the method identifies cancer types well.

(3) A multi-affinity network integrated tumor stratification method based on multi-omics data is proposed to address two problems: the small number of tumor samples, and the requirement of most methods that every omics data type cover the same samples when multi-omics data are fused. First, the similarity between genes is calculated to construct the network, and KNN (K-Nearest Neighbor) is used to reduce the network complexity. Then, omics data that cover only a partial subset of samples are integrated by a biased random walk. A weight optimization scheme is used to balance the differences caused by the different mutation frequencies of genes. Finally, machine learning algorithms are used to predict cancer patients, and patients with common characteristics are classified into the same cancer type. Experimental results on 12 cancer data sets show that this method can effectively learn the underlying information.

The methods proposed in this thesis are all used for cancer type prediction and subtype identification. The experimental findings demonstrate that these methods are robust and can efficiently extract latent information from multi-omics data. By constructing reasonable gene networks and fully integrating multi-omics data, they outperform existing similar methods.
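The network construction step of method (3), computing gene-gene similarity and then using KNN to thin the network, can be illustrated with a minimal sketch. The abstract does not say which similarity measure or neighbor count the thesis uses, so the absolute Pearson correlation, the function names `pearson` and `knn_gene_network`, and the toy expression profiles below are all assumptions made for illustration only, not the author's implementation.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length value lists (0.0 if either is constant)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / sqrt(var_x * var_y)


def knn_gene_network(profiles, k):
    """Build an undirected gene network that keeps, for each gene, only the
    edges to its k most similar genes.

    profiles: dict mapping gene name -> list of values across samples.
    Returns a dict mapping frozenset({gene_a, gene_b}) -> similarity weight.
    Similarity here is |Pearson correlation| (an assumption, see lead-in).
    """
    genes = list(profiles)
    edges = {}
    for g in genes:
        sims = [(abs(pearson(profiles[g], profiles[h])), h) for h in genes if h != g]
        # Highest similarity first; ties broken by gene name for determinism.
        sims.sort(key=lambda t: (-t[0], t[1]))
        for weight, h in sims[:k]:
            # An edge is kept if either endpoint lists the other among its top k.
            edges[frozenset((g, h))] = weight
    return edges


# Hypothetical toy data: 4 genes measured in 4 samples.
profiles = {
    "G1": [1.0, 2.0, 3.0, 4.0],
    "G2": [2.0, 4.0, 6.0, 8.1],
    "G3": [4.0, 1.0, 3.0, 2.0],
    "G4": [4.1, 1.0, 3.2, 2.0],
}
edges = knn_gene_network(profiles, k=1)
for pair in sorted(tuple(sorted(p)) for p in edges):
    print(pair)
```

With k = 1 on these toy profiles, only the G1-G2 and G3-G4 edges survive out of the six possible gene pairs. This is the sense in which a KNN filter reduces network complexity: the edge count grows roughly with k times the number of genes rather than with the square of the number of genes.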
Keywords/Search Tags: Tumor stratification, Network embedding, Multi-omics data, Network construction