RNA-RNA Interaction Problems Over Multiple Sequence Alignments

Posted on: 2014-08-29
Degree: Doctor
Type: Dissertation
Country: China
Candidate: X Li
GTID: 1260330425485767
Subject: Applied Mathematics
Abstract/Summary:
This thesis addresses two topics: the prediction of suboptimal canonical joint structures over multiple sequence alignments, and the topological properties of RNA structures.

Chapter 1 introduces the background of the RNA-RNA interaction problem (RIP). Non-coding RNAs (ncRNAs) usually bind to their targets to complete the regulation of their genes, so it is important to understand the essential characteristics of the interactions between ncRNAs and their targets. We then summarize recent progress on the RNA-RNA interaction problem and discuss the characteristics of the existing prediction algorithms. Finally, we outline the main contents of the thesis and present the corresponding results.

Chapter 2 introduces the fundamental concepts for predicting joint structures over two single RNA sequences: the representation, loop structures, combinatorial properties and energy model of RNA secondary structures. We then introduce the definition and the decomposition grammar of joint structures, the calculation of the partition function based on that grammar, and the block probabilities of joint structures over two single RNA sequences. These notions play essential roles in the prediction algorithms over multiple sequence alignments. Although the RNA-RNA interaction problem is argued to be NP-complete [2] when no restrictions are placed on RNA structure motifs, restrictions such as the exclusion of zigzag structures still allow polynomial-time dynamic programming algorithms, obtained with the method of this chapter.

Chapter 3 establishes a general framework for RNA-RNA interaction problems over multiple sequence alignments [26], from which we develop a folding package, ripalign, implemented in C. Although ripalign aims to predict canonical joint structures over multiple sequence alignments, it also incorporates all the functions of rip [19][20] and outperforms rip. In addition, ripalign allows structure constraints in the input RNA sequence alignments, which rip does not support. Building on the concept of joint structures defined in [19], and according to the lengths of stacks and hybrids, we first propose the concepts of canonical joint structures and compatible joint structures. We then study the energy model of canonical joint structures over multiple sequence alignments and present the corresponding computational method. Next we focus on the decomposition grammar, the calculation of the partition function, Boltzmann sampling of canonical joint structures, and base pairing probabilities over multiple sequence alignments. Finally, we discuss, compare and analyze the performance of several other existing prediction packages.

Chapter 4 revisits RNA structures from a topological perspective. We first model RNA structures as linear chord diagrams and classify them by genus. Next we enumerate the shadows of arbitrary genus. In particular, we construct a mapping between shadows and shapes of genus 2 and obtain a quantitative relationship between them. Finally, we propose algorithms that generate shadows of genus g+1 from shadows of genus g.
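As a small illustration of the genus classification described for Chapter 4, the sketch below computes the genus of a linear chord diagram. It is not taken from the thesis. It uses the standard fatgraph construction: the backbone is collapsed to a single vertex, the boundary components are counted as cycles of a permutation, and Euler's relation 2 - 2g - r = 1 - n (n arcs, r boundary components) is solved for g. The function name `genus` and the arc-list input format are assumptions made for this example.

```python
def genus(arcs):
    """Genus of a linear chord diagram given as a list of (i, j) arcs.

    Assumption (for this sketch): each backbone position is the endpoint
    of at most one arc. Unpaired positions do not change the genus, so
    only arc endpoints are considered.
    """
    # Relabel the arc endpoints as 0..2n-1 in backbone order.
    endpoints = sorted(p for arc in arcs for p in arc)
    if len(set(endpoints)) != len(endpoints):
        raise ValueError("each position may belong to at most one arc")
    index = {p: k for k, p in enumerate(endpoints)}
    m = len(endpoints)  # m = 2n
    n = len(arcs)
    if n == 0:
        return 0  # no arcs: the diagram is planar

    # alpha is the involution that swaps the two endpoints of each arc.
    alpha = [0] * m
    for i, j in arcs:
        alpha[index[i]] = index[j]
        alpha[index[j]] = index[i]

    # Boundary components are the cycles of k -> alpha(sigma(k)),
    # where sigma(k) = k + 1 (mod m) walks along the closed backbone.
    seen = [False] * m
    r = 0
    for start in range(m):
        if seen[start]:
            continue
        r += 1
        k = start
        while not seen[k]:
            seen[k] = True
            k = alpha[(k + 1) % m]

    # Euler's relation 2 - 2g - r = 1 - n gives g = (1 + n - r) / 2.
    return (1 + n - r) // 2
```

For instance, `genus([(1, 3), (2, 4)])` evaluates to 1 (two crossing arcs, as in an H-type pseudoknot), while the nested arcs `[(1, 4), (2, 3)]` give 0.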
Keywords/Search Tags: partition function, multiple sequence alignment, Boltzmann sampling, base pairing probabilities, decomposition grammar, joint structure, canonical joint structures, tight structures, RIP, chord diagram, shadow, shape, genus, separating arcs