
Gene Co-expression Network Analysis For RNA-seq Datasets

Posted on: 2014-06-17
Degree: Doctor
Type: Dissertation
Country: China
Candidate: S J Hong
GTID: 1220330434473358
Subject: Bioinformatics
Abstract/Summary:
Variation in gene expression holds a key to uncovering the mechanisms of human diseases. Transcriptome sequencing, also called RNA-seq, provides a new technique for quantifying genome-wide expression in any organism. It promises digital transcriptome profiling at high resolution and is rapidly replacing microarray technology. Gene co-expression network analysis is a systematic approach to capturing the important relationships among groups of co-expressed genes and to unraveling regulatory processes, yielding new insights into phenotypic variation. A co-expression network is defined as an undirected graph in which each node represents a gene and each edge represents the co-expression correlation between the two genes it connects. The aim of our research is to apply and develop computational methods for constructing co-expression networks from RNA-seq data, and to use network approaches to uncover the mechanisms of complex human diseases. The thesis consists of two parts.

First, we applied a common-network construction method to explore the co-expression network shared between diseases with similar etiology. Schizophrenia and bipolar disorder are recognized as two of the most severe psychiatric disorders, and a growing number of studies support the existence of shared genetic mechanisms between them. Most previous co-expression network construction methods estimate only a single Gaussian graphical model, which cannot preserve the common structure of networks derived from heterogeneous data. Here, we adapted the Joint Estimation of Multiple Graphical Models approach to our schizophrenia and bipolar disorder RNA-seq study. A shared co-expression network was identified, and the co-expressed genes were found in known disease signaling pathways. Our results suggest that the two diseases may share a common regulatory mechanism at the network level. By contrast, the single Gaussian graphical model method could detect neither the common structures nor the hub genes and related pathways shared between the diseases.

Second, we developed novel computational methods for constructing gene co-expression networks that use information at the exon level, the genomic position level, and the allele-specific expression (ASE) level, respectively. Current methods for co-expression network construction quantify gene expression only as a single overall value and overlook a large amount of variation in gene expression. Although RNA-seq provides the technical means to detect this variation, making full use of the comprehensive information contained in RNA-seq data to uncover disease mechanisms remains a great challenge. To address this challenge, we proposed new component-based methods for inferring co-expression networks.

To explore expression variation across the exons or genomic positions of genes, ordinary single-variate canonical correlation analysis (CCA) was proposed to model exon-level and position-level co-expression networks, respectively. The edges of these networks were defined as canonical correlations, which measure the strength of association between the sets of exon expression values, or position expression values, of two genes. In the examples of exon-level network construction, the non-small cell lung cancer and uterine corpus endometrioid carcinoma pathways were reconstructed, and key modules of these pathways were rediscovered. In the application of the position-level network, two psychosis-related pathways were inferred, in which hub genes and important co-expression patterns were discovered. Interestingly, the co-expression pattern of PLCB1 and PLCB4 was detected in both pathways. Moreover, this pattern differed between the disease-status and normal-status networks.

To model the ASE co-expression network, we developed a novel bivariate CCA that considers the expression of both alleles at each SNP locus. The Wnt signaling pathway was reconstructed, and hub genes with eQTL loci were identified. We also compared our new methods with traditional co-expression network construction methods. The new methods were superior in several respects: network robustness, network topological properties, similarity to known pathways, and hub gene detection.

The concepts and models of the three co-expression networks described above are proposed here for the first time. They are shown to have clear advantages in using the information on the many mRNA variants identified by RNA-seq. Although our research provides new ideas and methods for unraveling the mechanisms of complex diseases, the results are still preliminary. Further studies, which may combine different levels of mRNA variants with other next-generation sequencing techniques to construct whole-transcriptome regulatory networks, will still face great challenges.
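The abstract defines the edges of the exon-level network as canonical correlations between the exon expression sets of two genes, but it does not give the estimation details. The following is only a minimal sketch of that idea, assuming each gene is represented by a samples-by-exons matrix and that the first (largest) canonical correlation is used as the edge weight. The function names, the `threshold` parameter, the gene labels, and the simulated data are all hypothetical and are not taken from the thesis.

```python
import numpy as np


def first_canonical_correlation(x, y):
    """Largest canonical correlation between the columns of x and y.

    x: (n_samples, p) array, y: (n_samples, q) array, same samples in rows.
    Assumption: n_samples > max(p, q) and both matrices have full column
    rank after centering.
    """
    x = x - x.mean(axis=0)
    y = y - y.mean(axis=0)
    # Orthonormal bases for the column spaces of the centered matrices.
    qx, _ = np.linalg.qr(x)
    qy, _ = np.linalg.qr(y)
    # Singular values of Qx^T Qy are the canonical correlations.
    s = np.linalg.svd(qx.T @ qy, compute_uv=False)
    return float(np.clip(s[0], 0.0, 1.0))


def cca_network(exon_expr, threshold=0.0):
    """Build an undirected weighted edge list from per-gene exon matrices.

    exon_expr: dict mapping a gene label to an (n_samples, n_exons) array.
    threshold: hypothetical cutoff; pairs below it get no edge.
    """
    genes = sorted(exon_expr)
    edges = {}
    for i, a in enumerate(genes):
        for b in genes[i + 1:]:
            w = first_canonical_correlation(exon_expr[a], exon_expr[b])
            if w >= threshold:
                edges[(a, b)] = w
    return edges


# Simulated toy data (not real expression values): 50 samples, three genes.
rng = np.random.default_rng(0)
gene_a = rng.normal(size=(50, 3))                           # 3 exons
gene_b = gene_a[:, :2] + 0.1 * rng.normal(size=(50, 2))     # tracks gene_a
gene_c = rng.normal(size=(50, 4))                           # unrelated
network = cca_network({"GENE_A": gene_a, "GENE_B": gene_b, "GENE_C": gene_c})
for pair, weight in network.items():
    print(pair, round(weight, 3))
```

Using the QR-then-SVD route avoids inverting covariance matrices explicitly, which is one common way to compute canonical correlations when the number of samples exceeds the number of exons per gene. A position-level network would follow the same pattern with per-position coverage in place of exon expression.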
Keywords/Search Tags:gene co-expression network, RNA-seq, common network identification, canonical correlation analysis, allele-specific expression