Research On Cancer Driver Pattern Mining Algorithm Based On Multi-omics Feature

Posted on:2024-06-07

Degree:Master

Type:Thesis

Country:China

Candidate:X Chu

GTID:2554306923488944

Subject:Electronic information

Abstract/Summary:

With the completion of the genome project and the rapid development of high-throughput sequencing technology, a large amount of genomic data has been generated. Mining genes associated with cancer development from this wealth of cancer multi-omics data has become a hot research topic. Most existing methods identify cancer driver genes from single-omics data, and there is still room to improve how effective information from multi-omics data is used to identify cancer driver genes or gene modules. This thesis integrates cancer multi-omics data and makes full use of the omics feature information, structural information and functional information among genes to improve the identification of driver genes and driver modules, with the aim of better performance in cancer prediction and feature extraction. The study consists of three main parts:

(1) To address the problem that multi-omics features are underutilized in cancer driver gene identification, this thesis proposes a machine learning model to analyze the impact of multi-omics features on identifying driver genes. The method uses the Kullback-Leibler divergence to calculate feature importance from CGC (Cancer Gene Census) genes and non-CGC genes in four types of omics data, and then uses a machine learning algorithm to detect cancer driver genes in pan-cancer data. The prediction results on the pan-cancer dataset validate the effectiveness of the method. In addition, the method can find certain causative genes associated with cancer.

(2) To address the issue that functional and structural information among genes may affect the identification of driver genes, this thesis proposes a network embedding framework for identifying driver genes based on functional and structural information. The method uses a network propagation algorithm to obtain gene function information and construct a mutation integration network that associates genes with weak node information. Structural features of genes are then extracted from the constructed mutation integration network using the struc2vec model, and genes that are structurally similar but far apart in the network are found via structural similarity. The biological network constructed by this method contains functional correlations between genes and also reflects structural correlations between them, so more comprehensive information can be obtained. Experiments on a variety of cancer datasets show that the method can effectively identify genes closely related to cancer.

(3) To address the problem that gene mutations may affect neighboring genes, this thesis proposes a method to identify cancer driver modules based on network function and topological information. The method uses a mutation impact function to score the interaction between two mutated genes, which yields a similarity between genes. An adaptive diffusion strength index quantifies the degree of influence of a mutated gene on its neighbor genes to obtain the optimal characteristics between genes. The method places driver genes with the same or similar biological functions into the same module, which more accurately reflects the functional correlation between genes. The experimental results show that the method has an advantage over similar methods in robustness analysis and enrichment analysis, and can more effectively predict cancer-related driver modules.
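As an illustration of the feature-importance step in part (1), the following minimal sketch scores one feature by the Kullback-Leibler divergence between its distribution over known driver genes and over the remaining genes. The abstract does not say how the distributions are estimated, so the shared histogram bins, the add-one smoothing, the function names and the toy values are all assumptions made for this example, not the thesis implementation.

```python
import math


def histogram(values, edges):
    """Smoothed probability histogram of `values` over shared bin edges."""
    counts = [0] * (len(edges) - 1)
    for v in values:
        for i in range(len(counts)):
            is_last = i == len(counts) - 1
            # Bins are half-open, except the last one, which includes the top edge.
            if edges[i] <= v < edges[i + 1] or (is_last and v == edges[-1]):
                counts[i] += 1
                break
    # Add-one smoothing (an assumption) keeps every bin non-zero so the log is defined.
    total = sum(counts) + len(counts)
    return [(c + 1) / total for c in counts]


def kl_divergence(p, q):
    """KL(p || q) for two discrete distributions over the same bins."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q))


def feature_importance(driver_values, other_values, n_bins=5):
    """Score one feature by how differently it is distributed in the two gene sets."""
    lo = min(min(driver_values), min(other_values))
    hi = max(max(driver_values), max(other_values))
    if hi == lo:
        return 0.0  # a constant feature cannot separate the two sets
    width = (hi - lo) / n_bins
    edges = [lo + i * width for i in range(n_bins)] + [hi]
    return kl_divergence(histogram(driver_values, edges),
                         histogram(other_values, edges))


# Toy values, invented for illustration only (not data from the thesis).
separating = feature_importance([0.90, 0.80, 0.95, 0.85], [0.10, 0.20, 0.15, 0.05])
uninformative = feature_importance([0.2, 0.5, 0.8], [0.2, 0.5, 0.8])
print(separating, uninformative)
```

A feature whose values differ sharply between the two gene sets receives a larger score than one distributed identically in both, which is the property a feature ranking needs.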
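The network propagation step in part (2) can be illustrated with a random walk with restart, a common propagation scheme. The abstract does not name the specific algorithm used, so this is a sketch under that assumption; the gene names, the restart probability and the four-node path network are hypothetical.

```python
def propagate(adjacency, initial_scores, restart=0.5, tol=1e-9, max_iter=1000):
    """Random walk with restart on an undirected network.

    adjacency: dict mapping each gene to the list of its neighbours
               (assumed symmetric, with no isolated genes).
    initial_scores: dict mapping each gene to its starting score,
               e.g. 1.0 for a mutated gene and 0.0 otherwise.
    """
    nodes = list(adjacency)
    degree = {n: len(adjacency[n]) for n in nodes}
    scores = dict(initial_scores)
    for _ in range(max_iter):
        updated = {}
        for n in nodes:
            # Each neighbour m passes an equal share of its score to n.
            spread = sum(scores[m] / degree[m] for m in adjacency[n])
            updated[n] = (1 - restart) * spread + restart * initial_scores[n]
        delta = max(abs(updated[n] - scores[n]) for n in nodes)
        scores = updated
        if delta < tol:
            break
    return scores


# Hypothetical path network G1 - G2 - G3 - G4 with a single mutated gene, G1.
network = {
    "G1": ["G2"],
    "G2": ["G1", "G3"],
    "G3": ["G2", "G4"],
    "G4": ["G3"],
}
mutations = {"G1": 1.0, "G2": 0.0, "G3": 0.0, "G4": 0.0}
scores = propagate(network, mutations)
print(scores)
```

In this toy network the propagated score decays with distance from the mutated gene, so unmutated genes near a mutation still receive a non-zero signal. That is the sense in which propagation can associate genes that carry weak node information with the rest of the network.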

Keywords/Search Tags:

Network propagation algorithm, Network embedding, Adaptive diffusion intensity, Protein interaction network, Multi-omics data

Related items

1	Studies On Identification Of Potential Therapeutic Targets For Complex Diseases And MiRNA Pharmacogenomics Based On Multi-omics Data And Network Models
2	Research Of Prognostic Carcinoma Molecular Subtypes Based On Omics Data
3	Research On Molecular Interaction Network Mining Method For Alzheimer’s Disease
4	Cancer Driver Gene Identification Algorithm Based On Integrated Analysis Of Multi-omics Data And Network Models
5	A Study On Prediction Methods Of Drug Properties Based On Network Diffusion Algorithm
6	Research On Tumor Stratification Methods Based On Multi-omics Data
7	Research And Implementation On Carcinogenic Gene Module Identification Method Of Glioblastoma Based On Integrating Multi-Omics Data
8	Research On Pattern Mining Method Of Complex Disease Network Based On Multi-omics Data
9	Application Of The Differential Network Method Integrating Multi-omics Data In Breast Cancer Prognosis
10	Study On Pathogenic Gene Detection Based On Complex Network