Studies On Algorithms And Complexity Of RNA Tertiary Structure Prediction Based On Machine Learning

Posted on: 2022-12-18
Degree: Master
Type: Thesis
Country: China
Candidate: Y R Yang
Full Text: PDF
GTID: 2480306770467874
Subject: Automation Technology
Abstract/Summary:
Ribonucleic acid (RNA) is a carrier of genetic information found mainly in some viruses, viroids, and biological cells. RNA performs many complex biological functions in organisms, such as sensing changes in metabolite concentration, catalyzing reactions, and regulating gene expression. These functions depend on its tertiary structure, so RNA tertiary structure has become a significant research topic. The number of possible RNA conformations grows exponentially with the number of nucleotides, and structure determination by NMR, cryo-electron microscopy, and X-ray diffraction is inefficient and costly, so relatively few RNA structures have been determined experimentally. A high-precision RNA tertiary structure prediction algorithm based on bio-computing is therefore a necessary choice. Currently popular RNA tertiary structure prediction algorithms fall into knowledge-based and physics-based algorithms. Each kind has its own advantages and disadvantages, but neither has achieved high-precision and high-integrity RNA tertiary structure modeling. The research direction of this thesis is therefore to further optimize and improve these prediction algorithms.

The main difficulty of RNA tertiary structure prediction lies in conformation sampling and in the construction of the scoring function. For conformation sampling, the Rosetta framework provides a new approach: RNA tertiary structure prediction algorithms based on enumeration sampling and random sampling schemes under the Rosetta framework effectively improve conformation sampling ability. For scoring functions, machine learning methods overcome the inaccurate scoring of traditional scoring functions. An RNA structure scoring function based on a 3D convolutional neural network not only improves the quality of structure scoring but also improves the accuracy of RNA tertiary structure prediction to a certain extent.

In summary, the main work of this thesis on RNA tertiary structure prediction algorithms is as follows:

1. Because traditional sampling methods struggle to reach a good compromise between sampling ability and sampling cost, this thesis proposes and designs an RNA tertiary structure prediction algorithm based on a random sampling strategy and a parallel mechanism (Stepwise Monte Carlo Parallelization, SMCP). The SMCP algorithm uses a random sampling scheme to search the conformation space, which effectively reduces the sampling cost. At the same time, SMCP searches for conformations in parallel, which improves sampling breadth and sampling efficiency. In addition, two rounds of different potential energy evaluations are used to improve the quality of structural evaluation and reduce the construction cost. Finally, to solve the problem of incomplete modeling caused by random sampling, the SMCP algorithm further checks and optimizes the modeling results, and ultimately achieves high-precision and high-integrity RNA tertiary structure modeling.

2. Because scoring functions based on minimum free energy are not suitable for scoring large RNAs, this thesis proposes and designs a ResNet-based RNA tertiary structure prediction (Res3DScore) algorithm. The Res3DScore algorithm takes the RNA 3D grid structure as input and uses 3D convolution to learn the 3D structure information of the natural conformation and other candidate conformations. The optimized Res3DScore algorithm can score all candidate structures of an RNA, and finally outputs the current structure together with its surrounding nucleotide environment. The Res3DScore algorithm uses a 3D convolutional neural network to further improve RNA tertiary structure scoring ability and the accuracy of RNA tertiary structure modeling.

3. Computational complexity analysis of the prediction algorithms. An important indicator in algorithm evaluation is the time and space complexity of the algorithm, especially its time complexity. This thesis analyzes the time complexity of each of the two algorithms it designs. The time complexity of the SMCP algorithm is O(n^4), and the time complexity of the Res3DScore algorithm is O(n^2). The complexity analysis further shows that the algorithms designed in this thesis not only achieve high-precision RNA tertiary structure modeling but also have time complexity within an acceptable range.

4. Overview of RNA tertiary structure prediction algorithms. This thesis analyzes typical RNA tertiary structure prediction algorithms and summarizes the advantages and disadvantages of each, in order to provide ideas for improving and optimizing such algorithms. In addition, this thesis proposes a detailed classification of physics-based RNA tertiary structure prediction algorithms according to their RNA conformation sampling method.

This thesis studies in depth the conformation sampling method and the scoring function used in RNA tertiary structure prediction. First, it improves the traditional sampling method, increasing sampling breadth and efficiency while reducing sampling cost. Second, it improves on the traditional scoring function, making the structural potential energy evaluation more rigorous. Experiments verify that the SMCP algorithm and the Res3DScore algorithm improve the accuracy and completeness of RNA tertiary structure modeling and provide a basis for deeper research on RNA structure prediction.
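
The sampling idea described for SMCP (random Monte Carlo search, several searches run in parallel, and two rounds of energy evaluation) can be illustrated with a small toy sketch. This is not the thesis's implementation and does not use Rosetta: the torsion-angle model, the `TARGET` angles, both energy functions, and every function name below are hypothetical and chosen only for illustration. Threads are used to keep the example simple; a CPU-bound search would more likely use separate processes.

```python
import math
import random
from concurrent.futures import ThreadPoolExecutor

# Hypothetical toy model: a "conformation" is a list of torsion angles (radians).
TARGET = [0.5, -1.0, 2.0, 0.0]  # made-up low-energy angles, illustration only


def coarse_energy(conf):
    """Cheap first-round score: angular deviation from TARGET."""
    return sum(1.0 - math.cos(a - t) for a, t in zip(conf, TARGET))


def fine_energy(conf):
    """Costlier second-round score: coarse term plus a neighbor-coupling term."""
    coupling = 0.0
    for i in range(len(conf) - 1):
        expected_gap = TARGET[i + 1] - TARGET[i]
        coupling += 1.0 - math.cos(conf[i + 1] - conf[i] - expected_gap)
    return coarse_energy(conf) + 0.5 * coupling


def run_chain(seed, steps=2000, temperature=0.3):
    """One Metropolis Monte Carlo chain; returns its lowest-energy conformation."""
    rng = random.Random(seed)  # per-chain generator keeps chains independent
    conf = [rng.uniform(-math.pi, math.pi) for _ in TARGET]
    energy = coarse_energy(conf)
    best_conf, best_energy = list(conf), energy
    for _ in range(steps):
        trial = list(conf)
        i = rng.randrange(len(trial))
        trial[i] += rng.gauss(0.0, 0.3)  # random local move on one angle
        trial_energy = coarse_energy(trial)
        delta = trial_energy - energy
        # Metropolis criterion: always accept downhill, sometimes accept uphill.
        if delta <= 0 or rng.random() < math.exp(-delta / temperature):
            conf, energy = trial, trial_energy
            if energy < best_energy:
                best_conf, best_energy = list(conf), energy
    return best_conf


def parallel_sample(n_chains=8, keep=3):
    """Run chains concurrently, prefilter by coarse score, re-rank by fine score."""
    with ThreadPoolExecutor(max_workers=4) as pool:
        candidates = list(pool.map(run_chain, range(n_chains)))
    shortlist = sorted(candidates, key=coarse_energy)[:keep]  # round one
    return min(shortlist, key=fine_energy)                    # round two


best = parallel_sample()
print(round(fine_energy(best), 3))
```

Applying the cheap score to every candidate and the costlier score only to a shortlist is one plausible reading of how two evaluation rounds can reduce cost; the thesis abstract does not specify the actual potentials or the selection rule.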
Keywords/Search Tags: RNA tertiary structure, prediction algorithms and complexity, conformation sampling method, Rosetta framework, 3D convolutional neural network