A Method Of Network Integration For Cancer Subtyping

Posted on:2022-01-02

Degree:Master

Type:Thesis

Country:China

Candidate:Z Q Zhu

GTID:2504306605466974

Subject:Master of Engineering

Abstract/Summary:

With the development of high-throughput methods and falling costs, large volumes of multi-omics data have been measured. For example, The Cancer Genome Atlas (TCGA) has collected genome, epigenome, transcriptome, and proteome information for more than 30 cancers from tens of thousands of patients, and the different omics provide complementary and unique characteristics of cancer samples. Compared with single-omic analysis, multi-omics integration has significant advantages: it can provide a more comprehensive view of biological processes, reveal the causes and functional mechanisms of complex cancers, and promote new discoveries in precision medicine. There is therefore a need for methods that can analyze multi-omics data comprehensively and reliably integrate information from different sources to achieve cancer subtyping.

In recent years, many methods for integrating multi-omics data have been proposed, but each has limitations. Some are flawed in how they integrate the data. For example, LRAcluster is based on a probabilistic model with low-rank approximation and can quickly find the low-dimensional subspace shared by multiple data types. However, the algorithm directly concatenates the omics matrices, so an omic matrix with more features has a greater impact on the result, which may be inappropriate. Some methods do not consider the heterogeneity of omics data. For example, SNF uses a Gaussian kernel function with fixed parameters to build a sample similarity network for each omic and does not account for the possibly different distributions of the omics. In addition, the KNN step that SNF uses when constructing similarity networks tends to introduce noisy edges. Some methods, such as iClusterBayes, require many parameters to be tuned, which takes more time. Some methods may lose omics information. For example, PINSPlus obtains clustering results for each omic separately and then integrates them, which may lose signals that are weak in each individual omic.

This thesis therefore proposes Network Integration based on Multi-Kernel (NI-MK) for cancer subtyping. The method takes the heterogeneity of multi-omics data into account, and the kernel weight coefficients are learned adaptively from the omic data without manual setting. Moreover, the consistent KNN algorithm used in this method exploits the consistent information of global nodes to make the similarity or dissimilarity between sample pairs more accurate. The method consists of three steps: (1) using a multi-kernel model to construct a similarity matrix for each omic; (2) using the consistent KNN algorithm to construct a local similarity matrix; (3) using a network fusion algorithm to integrate the similarity matrices obtained in the previous steps.

To verify the effectiveness of NI-MK, this thesis first compares it with SNF, PINSPlus, LRAcluster, CIMLR, and iClusterBayes on the multi-omics data of seven cancers. The experiments show that NI-MK identifies cancer subtypes with large survival differences on all seven cancers. On average, its subtyping results show the most significant differences in patient survival, 53.9% higher than those of the second-best method, CIMLR. Its silhouette coefficient is second only to that of CIMLR. This indicates that NI-MK identifies the most clinically meaningful cancer subtypes while also clustering well.

Next, NI-MK is applied to different combinations of omics data types for the seven cancers. The results show that the subtypes obtained from multi-omics data differ more in survival than those obtained from individual data types. Moreover, the clinical significance of the subtypes identified from multi-omics data is 120.8% higher than that obtained from DNA methylation, the best-performing single omic. That is, NI-MK can effectively integrate multi-omics data to obtain more clinically significant cancer subtypes, and in most cases, the more omics data types are integrated, the better the result.

Finally, each method is used to cluster a pan-cancer multi-omics data set. NI-MK achieves the highest normalized mutual information (NMI), 10.4% higher than that of the second-best method, LRAcluster. Its adjusted Rand index (ARI) is also the highest, 15.7% higher than that of the second-best method, SNF. This shows that NI-MK is highly accurate on data sets with gold-standard labels.
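The three-step structure described in the abstract can be illustrated with a minimal Python sketch. It is not the thesis's implementation, and every function name in it is hypothetical. The abstract does not specify how the kernel weights are learned, how the consistent KNN algorithm works, or which fusion update is used, so the sketch substitutes uniform kernel weights, plain KNN sparsification, and an SNF-style cross-diffusion update as stand-ins.

```python
import numpy as np

def gaussian_kernel(X, sigma):
    # Pairwise squared Euclidean distances between samples (rows of X).
    sq = np.sum(X ** 2, axis=1)
    d2 = np.maximum(sq[:, None] + sq[None, :] - 2.0 * X @ X.T, 0.0)
    return np.exp(-d2 / (2.0 * sigma ** 2))

def multi_kernel_similarity(X, sigmas=(0.5, 1.0, 2.0), weights=None):
    # Step 1 (stand-in): weighted sum of Gaussian kernels with different
    # bandwidths. ASSUMPTION: uniform weights replace the adaptively
    # learned weights mentioned in the abstract.
    kernels = [gaussian_kernel(X, s) for s in sigmas]
    if weights is None:
        weights = np.full(len(kernels), 1.0 / len(kernels))
    return sum(w * K for w, K in zip(weights, kernels))

def row_normalize(W):
    # Scale each row so that it sums to 1.
    return W / W.sum(axis=1, keepdims=True)

def knn_local(W, k):
    # Step 2 (stand-in): keep the k largest similarities in each row.
    # ASSUMPTION: ordinary KNN, not the thesis's consistent KNN.
    n = W.shape[0]
    idx = np.argsort(-W, axis=1)[:, :k]
    rows = np.arange(n)[:, None]
    S = np.zeros_like(W)
    S[rows, idx] = W[rows, idx]
    return row_normalize(S)

def fuse_networks(similarities, k=3, iters=10):
    # Step 3 (stand-in): SNF-style cross-diffusion. Each omic's global
    # matrix is updated from its own local matrix and the average of the
    # other omics' global matrices. Requires at least two omics.
    P = [row_normalize(W) for W in similarities]
    S = [knn_local(W, k) for W in similarities]
    for _ in range(iters):
        updated = []
        for v in range(len(P)):
            others = [P[u] for u in range(len(P)) if u != v]
            avg = sum(others) / len(others)
            updated.append(row_normalize(S[v] @ avg @ S[v].T))
        P = updated
    fused = sum(P) / len(P)
    return (fused + fused.T) / 2.0  # symmetrize the fused network
```

Given one samples-by-features matrix per omic (same samples, same row order), calling `fuse_networks([multi_kernel_similarity(X) for X in omics])` returns a single samples-by-samples similarity matrix, which could then be passed to a clustering step such as spectral clustering to obtain candidate subtypes.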

Keywords/Search Tags:

multi-omics, cancer subtyping, network integration, multi-kernel

Related items

1	A Method Of Network Integration For Cancer Subtyping
2	Cancer Subtyping Based On Multi-omics Data Integration
3	Deep Integrated Subtyping Method For Cancer Multi-omics Data
4	Statistical Simulation Comparison Of Multi-omics Integrative Clustering Methods And Application In COVID-19 Subtyping
5	Application Of The Differential Network Method Integrating Multi-omics Data In Breast Cancer Prognosis
6	Research On Analysis Of Cancer Subtypes Based On Multi-omics Data
7	Establishment Of Multi-omics Data Integration Analysis System And Its Application In The Multi-omics Data Integration Analysis Study Of Hepatocellular Carcinoma Cell Lines With Different Metastatic Potential
8	The Research Of Cancer Prognosis Survival Analysis Based On Multi-omics Data Integration
9	Research On Cancer Classification Based On Deep Fusion Of Multi-omics Data
10	Construction And Application Of Bayesian Structural Equation Prognostic Model Based On Multi-omics Integration