
An integrative computational analysis of the human transcriptome: Genomic structure, expression patterns and regulatory controls

Posted on: 2008-11-16
Degree: Ph.D.
Type: Dissertation
University: Boston University
Candidate: Halees, Anason Shokry
GTID: 1444390005465153
Subject: Biology
Abstract/Summary:
The sequence of the human genome, once viewed as an insurmountable objective, is seen today as a mere starting point for a number of even more challenging goals. Of these, understanding the mechanisms of gene expression regulation is a top-ranking goal. It is becoming increasingly apparent that progress towards these goals will come not only from the plentiful genome-wide experimental data sets, but also from an integrative analysis of the data capable of connecting, in a meaningful way, the contrasting aspects assayed in these sets.

In this work, I describe a number of such integrative methods, employed to advance our understanding of some aspects of the human transcriptome, and describe the results obtained from their application. The foundations are laid in chapter 2, which describes PromoSer, a tool that maps the full repertoire of publicly accessible transcripts onto the human, mouse and rat genomes. This mapping reveals a complex network of highly overlapping, interconnected, and extended-length loci that are transcribed in a non-trivial way to produce a variety of products far outnumbering the loci that harbor them.

One of the most salient features of gene organization in all eukaryotes is the exon-intron structure, which implies that transcripts of similar length can have widely varying spans on the genome. In chapter 3, the possible effect of an extended span on the average expression level of a transcript is explored and is, surprisingly, found to be minimal. Another central feature of any transcript is its site of initiation, believed to be proximal to many of the more influential elements regulating its expression. The important problem of transcription start site identification is described in chapter 4.

I next discuss various methods used to analyze a large collection of experimental data generated by the ENCODE consortium, the predictions made based on these data, and the validation of these predictions. The important mechanism of RNA editing is discussed next, and methods used to identify potential editing sites based on a large experimental data set are described. Lastly, a collection of related algorithms that I designed and implemented is discussed in the appendices.
Keywords/Search Tags:Human, Experimental data, Expression, Integrative