Prediction Of Transcriptional Interactions Based On Diverse Data Sources

Posted on: 2010-02-25
Degree: Master
Type: Thesis
Country: China
Candidate: H Y Cheng
GTID: 2120360275970077
Subject: Biomedical engineering
Abstract/Summary:
Selective gene expression is an important strategy that cells use to adapt to environmental changes, and uncovering the underlying molecular mechanisms remains an active area of life science research. Recently, several high-throughput experimental approaches have provided researchers with large amounts of data, and a number of network models and algorithms have been proposed, each based on a single one of these data sources. While these methods have their own advantages, each provides only partial information about regulatory relationships, and the methods are often complementary. It is therefore increasingly recognized that integrating a compendium of data sources is a more effective way to reconstruct transcriptional regulatory networks.

In this study, we perform a detailed analysis and evaluation of two widely used algorithms, GRAM and MA-Networker, which combine expression data and ChIP-chip data to model regulatory networks. We focus on the selection of thresholds and point out the negative effects of stringent p-value thresholds. Building on these approaches, we propose a novel method that integrates heterogeneous data sources to infer TF-target gene relations. We applied this method to genome-wide ChIP-chip datasets and transcription factor knockout datasets and achieved fairly good predictions. The key aspect of the algorithm is its novel use of a hypergeometric test to assign an appropriate p-value cutoff to each transcription factor, and its inference of reliable regulatory relations from the non-random correlation between the two datasets. The results are validated against YEASTRACT, high-quality ChIP-chip datasets, and other published literature. We also perform GO enrichment analysis to further validate our predictions. The results show that our method reduces the rate of false negatives without substantially increasing the number of false positives.

Most of our predictions are supported by experimental or computational evidence in the published literature. Although we focused on TF-target gene relations, the method could readily be extended to discover cooperativity among transcription factors. It could also be used to combine information from multiple ChIP-chip experiments on the same TF when such data are available. In sum, our work offers a new approach to integrating available biological information in a principled fashion.
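
The abstract does not give the details of the hypergeometric procedure, so the following is only a minimal sketch of one plausible reading: for a single transcription factor, each candidate ChIP-chip p-value cutoff is scored by how unlikely the overlap between the bound genes and the knockout-affected genes would be under random sampling, and the cutoff with the most significant overlap is kept. The function names `hypergeom_sf` and `pick_cutoff`, the candidate cutoffs, and the toy gene data are all assumptions made for illustration and do not come from the thesis.

```python
from math import comb


def hypergeom_sf(k, N, K, n):
    """Upper tail P(X >= k) for X ~ Hypergeometric(N, K, n).

    N: genes in the universe, K: knockout-affected genes,
    n: genes bound at the current cutoff, k: observed overlap.
    """
    upper = min(K, n)
    tail = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1))
    return tail / comb(N, n)


def pick_cutoff(chip_pvalues, knockout_genes, candidate_cutoffs):
    """Return (cutoff, enrichment p-value, predicted targets) for one TF.

    Hypothetical selection rule: keep the cutoff whose bound set overlaps
    the knockout-affected set most significantly.
    """
    universe = set(chip_pvalues)
    affected = knockout_genes & universe
    best = None
    for cutoff in candidate_cutoffs:
        bound = {g for g, p in chip_pvalues.items() if p <= cutoff}
        if not bound:
            continue  # nothing bound at this cutoff, so nothing to test
        overlap = bound & affected
        score = hypergeom_sf(len(overlap), len(universe), len(affected), len(bound))
        if best is None or score < best[1]:
            best = (cutoff, score, overlap)
    return best


# Toy data (invented): ChIP-chip binding p-values for ten genes and the
# genes whose expression changes when the TF is knocked out.
chip_pvalues = {
    "g1": 0.0005, "g2": 0.004, "g3": 0.008, "g4": 0.03, "g5": 0.04,
    "g6": 0.2, "g7": 0.3, "g8": 0.5, "g9": 0.7, "g10": 0.9,
}
knockout_genes = {"g1", "g2", "g3", "g6"}

best = pick_cutoff(chip_pvalues, knockout_genes, [0.001, 0.01, 0.05])
print(best)
```

On this toy input the stringent cutoff of 0.001 keeps only one true target, while the relaxed cutoff of 0.01 recovers three without admitting unsupported genes, which mirrors the abstract's point that overly stringent thresholds raise the false negative rate.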
Keywords/Search Tags: transcriptional regulation, transcriptional regulatory networks, combinatorial regulation, hypergeometric distribution