Computational and functional analyses of splicing regulation

Posted on: 2011-06-12
Degree: Ph.D.
Type: Dissertation
University: Carnegie Mellon University
Candidate: Papasaikas, Panagiotis
GTID: 1440390002466298
Subject: Biology
Abstract/Summary:
Recursive splicing mediates alternative exon exclusion and stepwise removal of long introns via coincident 3' and 5' splice sites. I used state-of-the-art computational methods to develop an accurate, unbiased model for Drosophila Recursive Splice Sites (RSSs). The resulting model reveals a large motif with distinctive features and high information content. This model exhibits high sensitivity, especially for non-exonic RSSs, and better discrimination against regular splice sites. The inferred model is supported by a large number of experimentally verified instances. The RSS-model predictions exhibit striking enrichment in long introns and reveal strong conservation of non-exonic sites over more than 60 million years. The distinctive features of the inferred model are shared by splice sites of long introns that abut short exons, suggesting mechanisms to aid 3'ss recognition, avoid interference by closely spaced 5' splice sites, and coordinate the sequential action of RSSs. Using the same model, I identified strong RSS motifs in the non-coding strands of several retroelements, including telomeric non-LTR elements of Drosophila and Lepidoptera. I found numerous RSS predictions associated with fragments of these retroelements in repetitive portions of the Drosophila genome, as well as signs of recent non-LTR insertions carrying RSSs into single-copy regions. This suggests a history of simultaneous intron enlargement and acquisition of advantageous RSSs as a result of ancient and ongoing transpositions. Computational analysis of the landscape of candidate cis-acting elements around non-exonic RSSs revealed an underrepresentation of predicted intronic silencers and an overrepresentation of predicted intronic enhancers in the neighborhood of RSSs. In contrast, I demonstrated the presence of strong and highly conserved branch points upstream of the RSS motif.
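The notion of a motif with "high information content" can be made concrete with a small sketch. The Python below is not the dissertation's RSS model; it assumes a simple position-by-position frequency model scored against a uniform background, and the function name `information_content` and the toy sequences are hypothetical illustrations only.

```python
import math
from collections import Counter

ALPHABET = "ACGT"

def information_content(sites, background=None):
    """Return per-position information content (in bits) for aligned sites.

    Assumption: each position is modelled independently, and information is
    the relative entropy of observed base frequencies versus the background.
    """
    if not sites:
        raise ValueError("at least one site is required")
    length = len(sites[0])
    if any(len(site) != length for site in sites):
        raise ValueError("all sites must have the same length")
    if background is None:
        # Hypothetical uniform background; a real analysis would estimate it.
        background = {base: 0.25 for base in ALPHABET}

    bits = []
    for pos in range(length):
        counts = Counter(site[pos] for site in sites)
        total = 0.0
        for base in ALPHABET:
            p = counts[base] / len(sites)
            if p > 0:  # zero-frequency bases contribute nothing
                total += p * math.log2(p / background[base])
        bits.append(total)
    return bits

# Toy aligned sequences (made up, not real splice sites).
toy_sites = ["CAGGT", "CAGGT", "TAGGT", "CAGGA"]
per_position = information_content(toy_sites)
print([round(b, 3) for b in per_position], round(sum(per_position), 3))
```

Fully conserved positions reach the 2-bit maximum under a uniform background, while positions where all four bases are equally frequent contribute 0 bits, so a long motif with many conserved positions accumulates a high total.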
Finally, my analysis reveals that most non-exonic RSSs are associated with downstream 5'ss motifs at a position where they would be expected to define an exon, but where regular 5' splice sites exhibit a peak of enhancers. Bioinformatic analysis of the defined pseudoexons, along with detailed experimental analysis of one RSS example by other members of the Lopez laboratory, suggests that the downstream pseudo-5'ss does not normally define an exon and instead functions as part of a conserved enhancer module that stimulates use of the 5'ss regenerated by the RSS. This type of module might prevent inappropriate use of competing alternative and cryptic sites and/or assist the efficient and correct sequential activity of many RSSs.
The final chapter of this dissertation describes a computational framework for the reconstruction of Splicing Regulatory Networks from high-throughput transcriptome data. Relative levels of exons in gene transcripts at different developmental stages and/or physiological contexts are statistically inferred and used to identify alternative splicing events. In turn, this information is used as input for state-of-the-art methods for Graphical Model Selection in order to recover the structure of the underlying splicing regulatory network. Distinct modules within the network are identified using community structure detection methods, and Social Network Analysis is used to identify key within-module actors. The identified network modules are finally correlated to different developmental and functional categories by analyzing their time series and Gene Ontology enrichment profiles. As a proof of concept for this framework, I studied the splicing regulatory network for Drosophila development using the publicly available modENCODE genome tiling array data.
I was able to identify distinct network modules associated with major developmental hallmarks, including maternally loaded RNAs, onset of zygotic gene expression, transitions between life stages, and sex differentiation. The identified within-module key actors include well-known development-specific splicing regulators. Additional factors previously unassociated with development-specific splicing were also highlighted by this analysis.
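The network-reconstruction steps described above (infer a network from splicing profiles, split it into modules, pick key actors per module) can be sketched in a few lines of Python. This is a deliberately simplified stand-in, not the dissertation's method: marginal Pearson correlation with a fixed threshold replaces Graphical Model Selection, connected components replace community structure detection, and within-module degree replaces the Social Network Analysis measures. All names, the threshold, and the toy profiles are hypothetical.

```python
import math
from itertools import combinations

def pearson(x, y):
    """Pearson correlation of two equal-length numeric profiles."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    if sx == 0 or sy == 0:
        return 0.0  # a constant profile carries no correlation signal
    return cov / (sx * sy)

def build_network(profiles, threshold=0.9):
    """Link two events when |correlation| of their profiles >= threshold."""
    graph = {name: set() for name in profiles}
    for a, b in combinations(profiles, 2):
        if abs(pearson(profiles[a], profiles[b])) >= threshold:
            graph[a].add(b)
            graph[b].add(a)
    return graph

def find_modules(graph):
    """Return connected components as a simple proxy for network modules."""
    seen, modules = set(), []
    for start in graph:
        if start in seen:
            continue
        stack, component = [start], set()
        while stack:
            node = stack.pop()
            if node in component:
                continue
            component.add(node)
            stack.extend(graph[node] - component)
        seen |= component
        modules.append(component)
    return modules

def key_actor(graph, module):
    """Pick the node with the most within-module links (ties: name order)."""
    return max(sorted(module), key=lambda node: len(graph[node] & module))

# Toy exon-inclusion profiles across four stages (made-up values).
profiles = {
    "event_a": [0.10, 0.20, 0.80, 0.90],
    "event_b": [0.15, 0.25, 0.85, 0.95],
    "event_c": [0.20, 0.20, 0.70, 0.90],
    "event_d": [0.90, 0.10, 0.90, 0.10],
    "event_e": [0.80, 0.20, 0.80, 0.20],
}
graph = build_network(profiles)
for module in find_modules(graph):
    print(sorted(module), "key actor:", key_actor(graph, module))
```

A real analysis would replace the correlation threshold with a sparse conditional-dependence estimate, since marginal correlation cannot separate direct from indirect associations.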
Keywords/Search Tags: Splicing, Splice sites, Long introns, Computational, RSS, RSSs