The Study on Methods of RNA Secondary Structure Prediction

Posted on: 2012-10-29  Degree: Doctor  Type: Dissertation
Country: China  Candidate: H Dong
GTID: 1100330332499398  Subject: Computer Science and Technology
Abstract/Summary:
As the study of RNA (ribonucleic acid) has deepened, the important role of RNA in the genetic process has become increasingly evident. RNA molecules serve not only as carriers of genetic information in living cells but also have a number of other important functions, such as catalyzing RNA splicing, processing and modifying RNA precursors, and regulating gene expression. This has encouraged in-depth study of RNA function. Because the function and structure of RNA are closely related, studying the structure of an RNA molecule in order to find and describe its function has become an important field of research in molecular biology. Traditional experimental methods for determining RNA structure (such as X-ray crystallography and NMR) are relatively accurate and reliable, but they are expensive and time-consuming. Predicting RNA structure computationally, by means of algorithms, is therefore recognized both in China and abroad as the main approach.

Methods of RNA secondary structure prediction have been studied for nearly 30 years, and many mature algorithms now exist. Some achieve high accuracy: the minimum free energy algorithm, for example, can sometimes reach a prediction accuracy of over 90%, but it cannot predict RNA pseudoknots. Most other prediction algorithms currently have problems of their own, such as high time complexity or limits on the length of the sequence. The study of RNA secondary structure prediction methods therefore remains an important subject in RNA research.

Against this background, the paper studies methods of RNA secondary structure prediction in depth.
The paper analyzed and summarized the current methods of RNA secondary structure prediction and grouped them into four categories: (1) comparative sequence analysis, (2) dynamic programming algorithms, (3) combinatorial optimization algorithms, and (4) heuristic algorithms. By researching, analyzing, and comparing these four categories, the paper arrived at the research ideas behind its new prediction methods, which laid a solid theoretical foundation for the rest of the work.

Firstly, the paper studied the application of Markov chains to RNA secondary structure prediction and proposed a new prediction method based on a Markov chain. Using the free energy, the paper built the transition probability matrix of the Markov chain and then built RNA-ML, which was used to find the RNA secondary structure with minimum free energy. The paper selected six tRNA sequences from a public database (the Genomic tRNA Database) for prediction and compared the results with those of the well-known software packages Mfold and RNAstructure. Experimental results showed that RNA-ML performed better than Mfold and gave results close to those of RNAstructure for a single sequence. In addition, this approach reduced the time complexity, improved the sensitivity and specificity, and executed faster for tRNA sequences. It could also be used for longer RNA sequences, which addresses a defect of most prediction methods, whose running time grows cubically or quartically with sequence length.

Secondly, the paper studied the application of the Hidden Markov Model to RNA secondary structure prediction and proposed a new prediction method based on the Hidden Markov Model.
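The abstract does not say how RNA-ML derives transition probabilities from free energy. As a minimal sketch of one plausible construction, the Python below assumes that each Markov chain state is a candidate structural element with a known free energy and that the probability of moving to a state is proportional to its Boltzmann weight. The function name `transition_matrix`, the Boltzmann weighting, and the value of `rt` are illustrative assumptions, not the dissertation's actual method.

```python
import math

# Assumption: RT in kcal/mol at 37 degrees C (R = 1.987e-3 kcal/(mol*K), T = 310.15 K).
RT_37C = 0.616


def transition_matrix(energies, rt=RT_37C):
    """Build a row-stochastic transition matrix from state free energies.

    Hypothetical rule: P[i][j] is proportional to exp(-energies[j] / rt)
    for j != i, so lower-energy (more stable) states are more likely targets.
    Self-transitions are given probability zero.
    """
    n = len(energies)
    if n < 2:
        raise ValueError("need at least two states")
    matrix = []
    for i in range(n):
        weights = [
            0.0 if j == i else math.exp(-energies[j] / rt)
            for j in range(n)
        ]
        total = sum(weights)
        matrix.append([w / total for w in weights])
    return matrix


if __name__ == "__main__":
    # Made-up free energies (kcal/mol) for three candidate states.
    for row in transition_matrix([-3.0, -1.0, -5.0]):
        print([round(p, 4) for p in row])
```

Under this weighting, each row sums to one and the state with the lowest free energy receives the largest share of the probability mass from every other state.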
Based on the minimum free energy, the paper established the transition probability matrix of the stems and the probability matrix of the observations, and then constructed RNA-HMM, which was used to find the RNA secondary structure with minimum free energy. The paper selected six RNA sequences with relatively complex structures from PseudoBase for prediction and compared the results with those of the pknotsRG software. Experimental results showed that this method was more accurate and more versatile than pknotsRG. In addition, the method cut down the prediction time and improved the sensitivity and specificity.

Finally, the paper studied the application of the Particle Swarm Optimization (PSO) algorithm to RNA secondary structure prediction and proposed a new prediction method based on it. The paper designed a new fitness function and established IPSO, which combines PSO with the minimum free energy, the number of selected stems, and the average length of the selected stems. The paper used RnaPredict, H-Helix PSO, and IPSO to predict RNA secondary structures and compared the free energies of the resulting structures. The results showed that the free energy of the optimal stem combination predicted by IPSO was lower than that predicted by the other methods, so IPSO could find a more stable secondary structure. The performance advantage of IPSO was more significant for long sequences, and because it converged faster, it could find a better secondary structure in fewer iterations. The paper also compared the prediction results of IPSO with those of the standard PSO (SPSO), the standard genetic algorithm (SGA), and ant colony optimization (ACO). The results showed that IPSO performed significantly better than the other three methods because of its highly efficient objective function.
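The abstract names the three ingredients of the IPSO fitness function (free energy, number of selected stems, average stem length) but not how they are combined. The Python sketch below assumes a simple weighted sum in which lower values are better. The `Stem` type, the `fitness` function, and all three weights are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Stem:
    """A candidate stem (helix); both fields are illustrative."""
    length: int    # number of stacked base pairs
    energy: float  # free energy contribution in kcal/mol (negative = stabilizing)


def fitness(stems, w_energy=1.0, w_count=0.1, w_length=0.1):
    """Score a stem combination; lower is better.

    Hypothetical form: the total free energy, minus small bonuses for
    selecting more stems and for a longer average stem length.
    """
    if not stems:
        return 0.0
    total_energy = sum(s.energy for s in stems)
    average_length = sum(s.length for s in stems) / len(stems)
    return (
        w_energy * total_energy
        - w_count * len(stems)
        - w_length * average_length
    )


if __name__ == "__main__":
    candidate = [Stem(length=4, energy=-3.2), Stem(length=6, energy=-5.0)]
    print(fitness(candidate))
```

In a PSO setting, each particle would encode a selection of mutually compatible stems, and a function of this kind would rank the particles at every iteration. The compatibility check and the particle encoding are omitted here.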
To verify the effectiveness of IPSO in predicting RNA secondary structure, the paper compared its prediction results with those of Mfold and RnaPredict. The results showed that the sensitivity and specificity of IPSO were higher than those of Mfold for three of the sequences but lower for the other two, and higher than those of RnaPredict for all of the sequences. This also indicates that the objective function designed in the paper is feasible and more effective.
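The abstract does not define sensitivity and specificity. The sketch below assumes the convention common in RNA structure prediction, where both are computed over base pairs: sensitivity is the fraction of reference pairs that were predicted, and "specificity" is the fraction of predicted pairs that are correct (that is, the positive predictive value). The function name and the example pairs are illustrative.

```python
def sensitivity_specificity(predicted_pairs, reference_pairs):
    """Compare predicted base pairs with a reference structure.

    Each pair is an (i, j) tuple of sequence positions.
    Assumed definitions:
      sensitivity = TP / (TP + FN)
      specificity = TP / (TP + FP)   (positive predictive value)
    """
    predicted = set(predicted_pairs)
    reference = set(reference_pairs)
    true_pos = len(predicted & reference)
    false_neg = len(reference - predicted)
    false_pos = len(predicted - reference)
    sensitivity = true_pos / (true_pos + false_neg) if reference else 0.0
    specificity = true_pos / (true_pos + false_pos) if predicted else 0.0
    return sensitivity, specificity


if __name__ == "__main__":
    reference = [(1, 10), (2, 9), (3, 8), (4, 7)]
    predicted = [(1, 10), (2, 9), (3, 8), (12, 20), (13, 19)]
    print(sensitivity_specificity(predicted, reference))
```

In this made-up example, three of the four reference pairs are recovered and three of the five predicted pairs are correct.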
Keywords/Search Tags: RNA Secondary Structure Prediction, Markov Chains, Hidden Markov Model, Particle Swarm Optimization