
A Method For Identifying Pan-cancer Driver Genes Based On Network Model

Posted on: 2019-07-29; Degree: Master; Type: Thesis
Country: China; Candidate: H F Xu
GTID: 2370330572955942; Subject: Engineering
Abstract/Summary:
With advances in biotechnology and related research, our understanding of gene function has reached a new level, and the concept of cancer treatment is undergoing a fundamental change: from empirical science to evidence-based medicine, and from a cell-attacking mode to a targeted treatment mode. Many studies have shown that only a handful of key genes, called "driver genes", play a decisive role in the occurrence and development of cancer, while many other cancer-associated genes are merely "passenger genes" whose alterations do not cause cancer. Because the genomes of cancer cells can now be sequenced, how to effectively recognize cancer driver genes among hundreds or even thousands of altered genes has become a problem to be solved, and one of great significance for the effective treatment of cancer.

Current methods for studying cancer driver genes fall roughly into two categories. The first is based on statistics. These methods integrate cancer-related data from authoritative databases and analyze them using statistical regularities or matrix transformations; genes that differ significantly on a specific research indicator are then reported as cancer driver genes. Such methods focus only on the mathematical operations and largely ignore the biological significance of the data. The second category is based on network analysis. These methods map cancer sample data onto a biological network and apply complex network theory to assess the importance of each node in the network structure; the most important genes are identified as cancer driver genes. The performance of such methods is largely limited by the accuracy and completeness of the biological network information.

Previous results show that the accuracy of driver gene identification depends too heavily on the accuracy and completeness of the network information. We therefore introduce known cancer driver genes as prior knowledge to correct the identification results, and propose a new algorithm for detecting cancer driver genes. The experimental data were taken from the TCGA database and include somatic mutation samples from many kinds of cancer. After quality control and preprocessing, we mapped the data onto the human gene network HumanNet and reconstructed the network by resampling in order to extract a pan-cancer gene network. Using complex network theory, we then evaluated the importance of each mutated gene in the network. To reduce the reliance on network structure, the algorithm uses the known cancer driver genes as prior knowledge to correct the score of each gene, and the top-scoring mutated genes are selected as candidate cancer driver genes. In total, 20 candidate cancer driver genes were identified in this study, eight of which have been documented as driver genes for one or more cancers. We then analyzed the candidates that have not been verified, together with their neighbors in the network, and found that most of them are closely linked to many known cancer driver genes. This indirectly suggests that these candidate driver genes may have biological significance.
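The scoring idea summarized above (network-based importance corrected by known driver genes, then ranking) can be illustrated with a minimal sketch. The abstract does not give the thesis's formulas, so everything here is an assumption made for illustration: the function names, the one-step neighbor-frequency score, the multiplicative prior correction, and the weight `alpha` are all hypothetical, the gene labels `G1` to `G5` are placeholders, and the resampling step is omitted.

```python
from collections import defaultdict


def build_adjacency(edges):
    """Build an undirected weighted adjacency map from (gene_a, gene_b, weight) triples."""
    adj = defaultdict(dict)
    for a, b, w in edges:
        adj[a][b] = w
        adj[b][a] = w
    return adj


def mutation_frequency(samples):
    """Return the fraction of samples in which each gene is mutated."""
    counts = defaultdict(int)
    for mutated in samples:
        for gene in set(mutated):
            counts[gene] += 1
    n = len(samples)
    return {gene: c / n for gene, c in counts.items()}


def network_score(adj, freq):
    """Hypothetical importance: a gene's own mutation frequency plus the
    edge-weighted mutation frequency of its direct neighbors."""
    scores = {}
    for gene in adj:
        neighbor_part = sum(w * freq.get(nb, 0.0) for nb, w in adj[gene].items())
        scores[gene] = freq.get(gene, 0.0) + neighbor_part
    return scores


def apply_prior(adj, scores, known_drivers, alpha=0.5):
    """Hypothetical prior correction: scale each score by the edge-weighted
    share of the gene's neighbors that are known driver genes."""
    corrected = {}
    for gene, s in scores.items():
        total = sum(adj[gene].values())
        linked = sum(w for nb, w in adj[gene].items() if nb in known_drivers)
        share = linked / total if total else 0.0
        corrected[gene] = s * (1.0 + alpha * share)
    return corrected


def top_candidates(scores, freq, k):
    """Rank mutated genes by score (ties broken by name) and keep the top k."""
    mutated = [g for g in scores if freq.get(g, 0.0) > 0]
    return sorted(mutated, key=lambda g: (-scores[g], g))[:k]


# Toy data with placeholder gene labels; not taken from the thesis.
edges = [("G1", "G2", 1.0), ("G2", "G3", 1.0), ("G3", "G4", 1.0), ("G4", "G5", 1.0)]
samples = [{"G1", "G2"}, {"G2"}, {"G4"}, {"G2", "G4"}]
known_drivers = {"G5"}

adj = build_adjacency(edges)
freq = mutation_frequency(samples)
raw = network_score(adj, freq)
corrected = apply_prior(adj, raw, known_drivers)
print(top_candidates(corrected, freq, k=2))
```

In this toy network only `G4` is adjacent to the assumed known driver `G5`, so only its score is raised by the prior correction; the scores of the other genes are unchanged.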
Keywords/Search Tags:pan-cancer, driver gene, TCGA, complex network, genetic network, prior knowledge