Computational inference of genetic regulatory networks in human cancer cells

Posted on: 2009-04-27
Degree: Ph.D
Type: Thesis
University: Columbia University
Candidate: Margolin, Adam Arne
Full Text: PDF
GTID: 2444390002495631
Subject: Biology
Abstract/Summary:
The dysregulated activity of oncogenic transcription factors contributes to neoplastic transformation by promoting aberrant expression of target genes involved in regulating cell homeostasis. Therefore, characterization of the regulatory networks controlled by these transcription factors is a critical objective in understanding the molecular mechanisms of cell transformation. Modern high-throughput technologies are providing the first window into regulatory processes at the genome scale, foretelling the ability of computational inference algorithms to produce models of regulatory networks that will revolutionize our understanding and treatment of cancer by (1) describing how genomic alterations cause functional disruptions in the network regulating cell homeostasis, leading to aberrant cell growth and cancer, and (2) predicting therapeutic interventions, in which critical components of the network can be targeted to revert the cancer phenotype.
This thesis will develop methods that advance the current state of the art in inferring transcriptional regulatory networks from high-throughput data, with specific application to both gene expression and ChIP-on-chip data. Prior to this thesis, several methods had been proposed to infer regulatory networks from microarray data; however, these methods were applicable only to model organisms, such as yeast, due to high computational complexity. Moreover, all methods relied to some extent on various assumptions that are not biologically realistic. Here, I will develop a novel method, based on information theory, that overcomes these limitations: it has low computational complexity, allowing application to mammalian systems, and it makes minimal assumptions about the structure of the network or about the type of statistical interaction between genes (e.g. linear models).
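To make the idea of an information-theoretic dependency score concrete, the following is a minimal sketch and not the algorithm developed in the thesis. It assumes a plug-in mutual information estimate over equal-frequency bins and a fixed cutoff for calling an interaction; the function names, the bin count, the threshold, and the toy data are all hypothetical choices made for illustration.

```python
import math
import random
from collections import Counter
from itertools import combinations


def rank_bins(values, n_bins):
    """Discretize values into n_bins roughly equal-frequency bins by rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    bins = [0] * len(values)
    for rank, idx in enumerate(order):
        bins[idx] = rank * n_bins // len(values)
    return bins


def mutual_information(x, y, n_bins=4):
    """Plug-in estimate of I(X;Y) in nats from binned samples."""
    bx, by = rank_bins(x, n_bins), rank_bins(y, n_bins)
    n = len(x)
    joint = Counter(zip(bx, by))
    px, py = Counter(bx), Counter(by)
    mi = 0.0
    for (a, b), count in joint.items():
        # p(a,b) * log( p(a,b) / (p(a) * p(b)) ), written with raw counts
        mi += (count / n) * math.log(count * n / (px[a] * py[b]))
    return mi


def infer_edges(expr, threshold, n_bins=4):
    """Return gene pairs whose estimated MI exceeds the (assumed) threshold."""
    edges = {}
    for g1, g2 in combinations(sorted(expr), 2):
        mi = mutual_information(expr[g1], expr[g2], n_bins)
        if mi > threshold:
            edges[(g1, g2)] = mi
    return edges


# Toy expression profiles (synthetic, for illustration only).
rng = random.Random(0)
tf = [rng.gauss(0, 1) for _ in range(400)]
target = [v * v + rng.gauss(0, 0.1) for v in tf]   # nonlinear dependence on tf
unrelated = [rng.gauss(0, 1) for _ in range(400)]  # independent of both

expr = {"TF": tf, "target": target, "unrelated": unrelated}
edges = infer_edges(expr, threshold=0.2)
print(edges)
```

In this toy example the target depends on the regulator through a squared term, so a linear correlation would be close to zero, while a mutual information score makes no assumption about the functional form of the relationship. A practical method would also need a principled significance threshold and a way to separate direct from indirect interactions, neither of which this sketch attempts.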
I will apply this method to reconstruct the first genome-wide regulatory network inferred from microarray data for mammalian cells, and further demonstrate how this method can be used to deduce regulatory interactions between subnetworks controlled by different oncogenes, using only microarray data. I will extend this analysis, again using the tools of information theory, to consider inference of interactions involving more than two variables. To do so, I provide a rigorous definition of statistical dependency in the multivariate setting, which had not previously been done. I demonstrate that this framework effectively identifies groups of genes that interact in a pathway to jointly regulate a common set of targets. While the microarray analysis methods are motivated by issues specific to inferring gene regulatory networks, the resulting algorithmic advances are novel from a purely mathematical/computational perspective, and should be generally applicable to reverse engineering networks from measurements of the interacting variables, a general problem both in other branches of systems biology (e.g. metabolic networks, neural networks) and in scientific applications outside of systems biology (e.g. social networks, electrical networks).
In the second part of the thesis I consider the analysis of ChIP-on-chip experiments, a new technology that more directly measures transcription factor-chromatin interactions. I show that existing methods for analyzing these data cannot assign meaningful statistical significance scores (p-values) to bound promoters, due to a number of flawed assumptions. I then develop a data-driven method that accurately predicts the extent of TF/DNA binding and reveals an order of magnitude more interactions than previous methods.
I will demonstrate how this method, when combined with DNA sequence and gene expression data, can deduce regulatory networks of substantially greater complexity than previously appreciated. Moreover, I use this method to analyze the interaction between regulatory networks controlled by two important proto-oncogenes (MYC and NO...
Keywords/Search Tags: Regulatory networks, Gene, Cell, Method, Computational, Cancer, Inference, Data