Font Size: a A A

RNA Secondary Structure Graphs With Pseudoknots: Grammars And Topological Classifications

Posted on:2009-09-23Degree:DoctorType:Dissertation
Country:ChinaCandidate:S L GaoFull Text:PDF
GTID:1100360242984577Subject:Computer applications
Abstract/Summary:PDF Full Text Request
The functions of RNA moleculars in biology are mostly determined by its three-dimensional structure, which can be observed by experimental methods such as X-ray crystallography and nuclear magnetic resonance spectroscopy. However, experimental investigations often suffer from the challenge in technique, cost, and other difficulties. Therefore, theoretically predicting the space structure of RNA molecular becomes very necessary, which can improve the efficiency to realize the RNA space structures, guide the experiemtmal investigation.The secondary structure of RNA molecular is the key to predict its space structure. Up to now, minimization free energy (MFE) method is the most conventional for RNA secondary structure prediction. However, finding the MFE structure for a given RNA molecular is NP-complete for the pseudoknots which exist in RNA secondary structures, MFE methods can only control the time complexity in the polynomial level for some special kinds of pseudoknots. The time complexity, which is related to different special kinds of pseudonots, is from O(n~3) to O(n~6). The algorithms with higher time complexity can predict general target, and there do not exist a "good" algorithms with lower time complexity and general target so far. In this paper, we introduce grammars and topological classifications of Rivas and Eddy algorithm class (R&E class) based on linguistics and topology. This thesis includes:1. The concept of the grammar of arc graphs is proposed. At first, arc graph is a graph which illustrates RNA secondary structure, where the information about the nested and crossed base pairs is preserved but that of the unpaired bases is deleted. A small part of an arc graph is replaced by a larger one according to the replacement rule and a larger arc graph is obtained. These processes are continued and a class of arc graphs can be achieved. These simplest arc graphs which embody the structure characteristics are called initial arc graphs. These replacement rules are called rewriting rules. The grammar of arc graphs includes both initial arc graphs and rewriting rules. The arc graphs determined by a given grammar are called languages of arc graphs. The grammar of arc graphs provides a new method to map infinite arc graphs to finite grammar elements. The grammar of arc graphs can be easily extended to define new classes of RNA secondary structures.2. Arc graph grammars of 5 typical RNA secondary prediction algorithms classes are given. In this paper, R&E grammar is defined firstly. Taking the grammar elements of R&E as the whole class, the grammars of PKF, L&P, D&P, A&U classes are defined. Proper inclusive relations are existed between these classes. It is exhibited distinctly through the inclusive relations of the grammar elements. It is possible to grasp the content of these 5 secondary structure prediction algorithm classes through the relations among these classes and the typical elements of these classes.3. Planarity criterion of R&E arc graph is obtained. It is widely required to show RNA secondary structures as a planar. The planar graphical criterion and embedding algorithms are given based on the R&E grammar. The result of this thesis is more general than the previous works. The book embedding and the book embedding number calculation are solved completely in R&E class through the R&E grammar. Some examples are exhibited outside R&E class along with the basic exploring in this area.4. Book embedding classification of R&E arc graph is realized. Book embedding of arc graph is just a concept since it was proposed in 1999 because of its high time complexity (NP-complete). In this thesis, based on the grammar of R&E arc graph, Taking the cross relation graph of an arc graph as a bridge, the book embedding number of any arc graph is calculated through the maximum clique covering. At the same time, the book embedding number is obtained. A traditional NP-complete problem is solved with a polynomial time in R&E class.5. The method for calculating the genus of R&E arc graph is improved. Based on the grammar of R&E arc graphs, a rapid dynamic algorithm for calculating the genus of R&E arc graph is introduced. The time complexity of calculating the genus, the classification standard of directed closed sphere for RNA secondary structures, is improved from linear to constant. This algorithm can be applied to enumerate all arc graphs with small genus and calculating genus of a series of similar arc graphs. It is super to old algorithm in these two aspects.
Keywords/Search Tags:RNA Secondary Structure, Pseudoknots, Grammars, Topological classification, Embedding
PDF Full Text Request
Related items