Construction And Analysis Of CCA Gene Co-expression Network Based On RNA-seq Data

Posted on: 2015-07-02
Degree: Master
Type: Thesis
Country: China
Candidate: W C Chen
Full Text: PDF
GTID: 2180330422490892
Subject: Computer Science and Technology
Abstract/Summary:
A gene co-expression network is a biological network whose nodes are genes and whose edges represent the relationship between two genes. It is used to find new oncogenes and cancer subtypes. Most existing methods use gene expression data to calculate the relationship between genes. However, with the development of second-generation sequencing, more granular data are available: expression data for the exons contained in a gene, that is, RNA-seq data. This requires a new method for building the network. Canonical Correlation Analysis (CCA) is one such method. It treats a gene as a vector in which each coordinate is an exon contained in the gene. With this advantage, it yields a more accurate network. In addition, we propose improvements to both the data preprocessing and the analysis of the network.

The data preprocessing stage involves data normalization and hypothesis testing: the t-test, the Wilcoxon rank test and the KS test. The purpose of hypothesis testing is to screen out exons whose expression does not differ significantly between normal and tumor samples. This step reduces the amount of computation and makes the result more accurate. In the analysis step, we apply the network to finding significant pathways by calculating each pathway's CPCC value from the normal and tumor networks; the smaller the value, the more significant the pathway.

To demonstrate the validity of the method, breast cancer RNA-seq data are used to construct the network, and we then examine the significance of the top 20 pathways. First, 12 of them are associated with breast cancer, as evidenced by the relevant literature. Second, hierarchical clustering of the genes in a pathway shows two distinctly separated groups. Furthermore, these CPCC values have probability values below 0.2 under the normal distribution of random CPCC values. All of these results indicate that our approach is rational.

Finally, we constructed a web site. It displays not only the gene co-expression network built by CCA but also the analysis results from CPCC. It provides a variety of interactions, such as network zoom and pan. It has a friendly interface and is easy to use.
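
The CCA step described in the abstract, where each gene is a vector of exon expression values and an edge measures the relationship between two such vectors, can be illustrated with a small sketch. It computes the first canonical correlation between two genes, each given as a samples-by-exons table. The function names, the choice of the first canonical correlation as the edge weight, and the dependency-free numerics (Gram-Schmidt plus power iteration) are assumptions made for illustration; the abstract does not say how the thesis implements this step.

```python
import math


def _centered_columns(samples):
    """Turn a samples-by-exons table into mean-centered exon columns."""
    columns = [list(col) for col in zip(*samples)]
    centered = []
    for col in columns:
        mean = sum(col) / len(col)
        centered.append([value - mean for value in col])
    return centered


def _orthonormal_basis(columns, tol=1e-10):
    """Modified Gram-Schmidt; constant or linearly dependent exons are dropped."""
    basis = []
    for col in columns:
        scale = math.sqrt(sum(value * value for value in col))
        if scale == 0.0:
            continue  # exon with no variation across samples
        v = list(col)
        for b in basis:
            proj = sum(vi * bi for vi, bi in zip(v, b))
            v = [vi - proj * bi for vi, bi in zip(v, b)]
        norm = math.sqrt(sum(vi * vi for vi in v))
        if norm > tol * scale:
            basis.append([vi / norm for vi in v])
    return basis


def _largest_singular_value(m, iterations=200):
    """Power iteration on m^T m, restarted from every standard basis vector
    so that at least one start overlaps the leading singular vector."""
    n_rows, n_cols = len(m), len(m[0])
    best = 0.0
    for start in range(n_cols):
        v = [1.0 if j == start else 0.0 for j in range(n_cols)]
        for _ in range(iterations):
            u = [sum(r * x for r, x in zip(row, v)) for row in m]  # m v
            w = [sum(m[i][j] * u[i] for i in range(n_rows)) for j in range(n_cols)]
            norm = math.sqrt(sum(x * x for x in w))
            if norm == 0.0:
                break  # v lies in the null space of m
            v = [x / norm for x in w]
        u = [sum(r * x for r, x in zip(row, v)) for row in m]
        best = max(best, math.sqrt(sum(x * x for x in u)))
    return best


def first_canonical_correlation(gene_a, gene_b):
    """First canonical correlation between two genes.

    gene_a, gene_b: lists of rows, one row per sample (same sample order),
    one value per exon. Using this value as the edge weight is an assumption.
    """
    qa = _orthonormal_basis(_centered_columns(gene_a))
    qb = _orthonormal_basis(_centered_columns(gene_b))
    if not qa or not qb:
        return 0.0  # a gene with no varying exon carries no correlation
    # The singular values of m are the canonical correlations.
    m = [[sum(x * y for x, y in zip(a, b)) for b in qb] for a in qa]
    return min(1.0, _largest_singular_value(m))
```

When each gene has a single exon, the result reduces to the absolute Pearson correlation; when one gene's only exon is an exact linear combination of the other gene's exons, the result is 1 up to rounding. With many exons and few samples the value is inflated toward 1, which is one reason screening exons beforehand, as the abstract describes, can matter.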
Keywords/Search Tags: Gene Co-Expression Network, RNA-seq, Canonical Correlation Analysis, Cophenetic Correlation Coefficient