Font Size: a A A

Completeness And Algorithms For RNA Secondary Structure Prediction Problem Restricted On Planar Graph

Posted on:2023-04-16Degree:MasterType:Thesis
Country:ChinaCandidate:J LiuFull Text:PDF
GTID:2530306617455204Subject:Computer Science and Technology
Abstract/Summary:PDF Full Text Request
When an RNA sequence bends,several hydrogen bonds generated between non-adjacent bases constitute the secondary structure of RNA,which has an important impact on its function,so the problem of RNA secondary structure prediction is one of the hot topics in the field of computational biology,which aims to calculate the most likely secondary structure of RNA strands according to the natural rules and principles in biology.The rule most frequently used is the lowest energy rule,i.e.,the lower the energy,the more stable the RNA secondary structure.Non-adjacent bases in the RNA sequence can form base pairs,and when two base pairs are parallel and adjacent,they will form a stack where their energy is lowest.This led molecular biologists to propose the maximum stacking base pairs(MSBP)problem with the number of stacking base pairs as the optimization goal.A pseudoknot occurs when two sets of base stacks cross over.Pseudoknots do not exist in the simplest RNA secondary structure,and the optimal secondary structure is polynomial time computable in this situation,but pseudoknots exist in the majority of real RNA secondary structures,which make the problem complex.A planar RNA secondary structure is one of which all base pairs do not cross over when embedded in a planar graph.Pseudoknots may exist in such planar graph,but no crossover of more than three sets of stacks occurs.This thesis investigates the MSBP problem with planar RNA inputs,where the connection between probable base pairs in the input do not cross each other and those arcs and bases form a planar graph,and has proven that it is NP-hard through a reduction from the maximum independent set problem on a 3-regular planar bridgeless connected graph.Then we propose an O(1.755")exact algorithm for this problem.We also propose a heuristic algorithm 2s-DP for this problem with a time and space complexity both of O(n·16s)in this thesis,where s is the length of the longest spanning arc in the sequence.We simulate the performance of this heuristic algorithm on computer for a randomly generated RNA sequence input,and find that the larger the difference between the length n and the maximum span s of the arc is,the faster its computation speed is compared to the exact algorithm.This thesis complements the study of the RNA secondary structure prediction problem,and provides innovative ideas for design of related algorithms.
Keywords/Search Tags:computational biology, RNA secondary structure prediction, NP-hard, heuristic algorithm
PDF Full Text Request
Related items