Research On Mining Of Gene Driven Patterns

Posted on: 2018-10-26
Degree: Master
Type: Thesis
Country: China
Candidate: J B Lu
GTID: 2334330542960085
Subject: Information and Communication Engineering
Abstract/Summary:
Ovarian cancer is one of the most common gynecologic malignancies and has the highest fatality rate among gynecologic cancers. Researchers need a comprehensive understanding of the molecular mechanisms of ovarian cancer pathogenicity before exploring clinical treatment. With the development of high-throughput genomics research, many large-scale cancer genome projects have accumulated large amounts of genomic and clinical data from cancer samples, which provides an unprecedented opportunity for a comprehensive interpretation of the molecular mechanisms of ovarian cancer development. Integrating these high-throughput genomic data, including somatic mutation, copy number variation (CNV), and gene expression profiling data, remains a very challenging task. At present, there is no robust model that can extract, from these heterogeneous and strongly related data, the relevant driver genes and mutation pathways that promote ovarian cancer proliferation. Therefore, two methods of integrating genomic data are proposed, which explore the molecular mechanisms associated with patients from the perspectives of drug response and molecular subtype, respectively. The main innovations and research achievements are as follows:

(1) A new model named DPIM (driver pattern identification model) is proposed to identify driving patterns of drug response in ovarian cancer by integrating high-throughput genomics data. First, co-expression network analysis is applied to gene expression profiles, via weighted correlation network analysis, to obtain initial gene modules. Second, a mutation network is constructed by integrating CNV and somatic mutation data, and candidate modulators are selected from the genes with significant variance by clustering the vertices of the mutation network. Third, a regression tree model is used within a module network learning algorithm, in which the obtained gene modules and candidate modulators are used to learn the regulatory mechanism of the modulators. Finally, a local multiple regression method is used to divide the regulatory genes into two categories: drug-sensitive genes and drug-resistant genes. GO and KEGG enrichment analysis yields several biologically significant enriched pathways, which makes it possible to uncover the different regulation models of the two kinds of regulatory genes and the corresponding driving pathways. The experimental results show that the method is feasible.

(2) A new model named LVPD (latent variable pattern discovery) is proposed to identify a group of common latent variables for the driving patterns of ovarian cancer clinical stages by integrating high-throughput data including gene expression, CNV, and somatic mutation. First, a clustering-based joint latent variable model is applied to the three data types: gene expression and normalized CNV data are handled by a Gaussian regression model, and somatic mutation data by a logistic regression model. The best clustering result and its corresponding driving pattern are then obtained by adjusting parameters according to a cluster separability model. Survival analysis comparing the original clinical stages with the joint latent variable clustering shows that the survival curves of the joint latent variable clustering are more consistent with practical circumstances. Finally, several biologically significant enriched pathways are obtained by GO enrichment analysis. The results are a valuable reference for the treatment and prognosis of ovarian cancer and are also meaningful for exploring the mechanisms underlying cancer occurrence and development.
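
To make the first DPIM step more concrete, the following is a minimal Python sketch of the idea behind grouping genes into co-expression modules. It is not the thesis implementation. It assumes a soft-threshold adjacency of the form |Pearson correlation|^beta, which is the usual starting point of weighted correlation network analysis, and then swaps in a much simpler grouping rule (connected components above a cutoff) in place of the hierarchical clustering normally used. The function names, the toy expression values, and the parameters `beta` and `cutoff` are all hypothetical.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length expression vectors."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy) if sx and sy else 0.0


def soft_threshold_adjacency(expr, beta=6):
    """Weighted adjacency |cor|^beta for every ordered pair of distinct genes."""
    genes = list(expr)
    return {
        (g, h): abs(pearson(expr[g], expr[h])) ** beta
        for g in genes
        for h in genes
        if g != h
    }


def modules(expr, beta=6, cutoff=0.5):
    """Group genes into modules: connected components of the graph whose
    edges are gene pairs with adjacency >= cutoff (a simplification
    chosen for illustration, not the clustering used in the thesis)."""
    adj = soft_threshold_adjacency(expr, beta)
    genes = list(expr)
    seen, result = set(), []
    for start in genes:
        if start in seen:
            continue
        stack, component = [start], []
        seen.add(start)
        while stack:
            g = stack.pop()
            component.append(g)
            for h in genes:
                if h not in seen and adj[(g, h)] >= cutoff:
                    seen.add(h)
                    stack.append(h)
        result.append(sorted(component))
    return result


# Toy expression profiles (hypothetical values): gene -> expression across 5 samples.
expr = {
    "g1": [1.0, 2.0, 3.0, 4.0, 5.0],
    "g2": [2.1, 3.9, 6.2, 8.0, 9.9],
    "g3": [5.0, 1.0, 4.0, 2.0, 3.0],
    "g4": [4.9, 1.2, 4.1, 1.8, 3.1],
}

print(modules(expr))
```

Raising the absolute correlation to a power suppresses weak correlations while keeping strong ones close to 1, so the toy genes that rise and fall together end up in the same module. In the abstract's pipeline, modules of this kind would then be paired with candidate modulators from the mutation network for the regression tree step.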
Keywords/Search Tags:Gene co-expression network module, Module network learning, Joint latent variable, Enrichment analysis