
RNA Tertiary Structure Prediction

Posted on: 2018-12-28
Degree: Doctor
Type: Dissertation
Country: China
Candidate: J Li
GTID: 1360330572968840
Subject: Physics
Abstract:
RNA is a versatile biological macromolecule of great importance. Besides the well-known role of coding for proteins, about 97% of RNAs in eukaryotic cells do not code for proteins and are called non-coding RNAs. These non-coding RNAs play an active role in controlling gene expression, catalyzing biological reactions, transmitting signals between cells, and so on. Because structure determines function, their three-dimensional structures are generally needed to understand their functions. At present, X-ray crystallography and nuclear magnetic resonance spectroscopy are used to determine RNA 3D structures. Such experiments, however, are costly, time-consuming and technically challenging. Moreover, with the development of RNA sequencing methods, RNAs with known sequences far outnumber RNAs with known structures. These facts call for the development of computational methods. In this work, we addressed tertiary structure prediction of RNA loops and developed a new scoring function, hoping to help address difficult issues in RNA tertiary structure prediction.

The prediction of tertiary structures of RNA loops, as part of the overall effort of tertiary structure prediction, merits specific attention. First, RNA functions often reside in loop regions, and about 46% of the nucleotides in an RNA chain longer than 30 nucleotides are estimated to remain unpaired and lie in loop regions. Second, the high flexibility of loops and the inaccuracy of energy functions make the prediction accuracy of loops much lower than that of helical regions. Third, some approaches focus on the relative position and orientation of helices and do not take loops into consideration. We therefore developed a method named RNApps, which can help rebuild the structures of loop regions. It combines a probabilistic coarse-grained RNA model, a sequential Monte Carlo growth algorithm, a simulated annealing strategy and an all-atom statistical potential. The probabilistic nature of the model allows continuous sampling of the conformational space and is therefore able to cover all the relevant conformations, which is difficult for fragment assembly methods or discrete-state models. Moreover, the coarse-grained model further increases efficiency. Our method can handle all kinds of loops, including hairpin loops, internal loops, multi-way junction loops, and other types of missing fragments.

The scoring function is an important part of structure prediction and determines its efficiency and accuracy. The second part of this work is therefore a new scoring function based on an artificial neural network that learns from the structures compiled in a training dataset. As an important branch of machine learning, artificial neural networks have developed rapidly and achieved many breakthroughs. We expect this powerful tool to help with the difficult issues in structure prediction. Two scoring functions were built, differing in the input features of the neural network: one is coarse-grained and mainly related to the distance probability distribution of two bases, and the other is atomistic and mainly related to the distance probability distribution of two atoms. The atomistic one performed better than the coarse-grained one, but the latter saves a lot of computation time.

This dissertation brings forth several innovations:
i) We introduced a new method for predicting the tertiary structures of RNA loops, which can help the prediction of the whole RNA structure.
ii) The new method samples conformations in continuous 3D space using a probabilistic sampling strategy, removing the limits that the discrete nature of fragment assembly methods imposes.
iii) The framework of the sequential Monte Carlo growth algorithm can easily incorporate information from experiments or from users' experience, which can improve prediction efficiency and accuracy.
iv) We also built new scoring functions based on artificial neural networks.
v) The new scoring function can make full use of the structural information of native and non-native RNAs.
vi) The artificial neural network learns its parameters from the training dataset automatically, so the reference state problem of traditional statistical potentials can be avoided.

This dissertation is organized as follows:
Chapter I is a general introduction to the background and importance of our research topics, as well as the basic concepts and knowledge.
In Chapter II, we introduce a new method for predicting the tertiary structures of RNA loops.
In Chapter III, we introduce a new scoring function for RNA tertiary structure prediction and assessment based on an artificial neural network.
Chapter IV is a summary of this dissertation.
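
As a rough illustration of the sequential Monte Carlo growth idea named in the abstract, the following Python sketch grows a loop bead by bead between two fixed anchor points, drawing each new direction from a continuous distribution rather than from a fragment library, and resampling a population of partial chains by importance weight. It is not the RNApps implementation: the one-bead-per-nucleotide model, the constants `BOND` and `CLASH`, the toy weight in `step_weight`, and the function names are all assumptions made for this example.

```python
import math
import random

BOND = 6.0   # hypothetical bead-to-bead distance (angstrom), illustrative only
CLASH = 4.0  # hypothetical minimum distance allowed between any two beads


def random_unit_vector(rng):
    """Draw a direction uniformly on the unit sphere (continuous sampling)."""
    z = rng.uniform(-1.0, 1.0)
    phi = rng.uniform(0.0, 2.0 * math.pi)
    r = math.sqrt(1.0 - z * z)
    return (r * math.cos(phi), r * math.sin(phi), z)


def step_weight(chain, new_bead, target, remaining, temperature):
    """Toy importance weight for one growth step.

    Returns 0 on a steric clash; otherwise a Boltzmann-like factor that
    penalises positions from which the remaining bonds cannot reach `target`.
    """
    for bead in chain:
        if math.dist(bead, new_bead) < CLASH:
            return 0.0
    gap = math.dist(new_bead, target)
    excess = max(0.0, gap - remaining * BOND)
    return math.exp(-excess * excess / temperature)


def grow_loop(start, target, n_beads, n_particles=200, temperature=4.0, seed=0):
    """Grow `n_beads` loop beads from `start` toward `target`.

    Every particle is a partial chain. At each step all particles are
    extended by one bead, weighted, and resampled in proportion to weight.
    """
    rng = random.Random(seed)
    particles = [[start] for _ in range(n_particles)]
    for i in range(n_beads):
        remaining = n_beads - i  # bonds still needed to reach the target anchor
        candidates, weights = [], []
        for chain in particles:
            d = random_unit_vector(rng)
            last = chain[-1]
            new_bead = (last[0] + BOND * d[0],
                        last[1] + BOND * d[1],
                        last[2] + BOND * d[2])
            candidates.append(chain + [new_bead])
            weights.append(step_weight(chain, new_bead, target, remaining, temperature))
        if sum(weights) == 0.0:
            raise RuntimeError("all particles died; increase n_particles")
        # Resampling: chains with higher weight are duplicated, poor ones dropped.
        particles = rng.choices(candidates, weights=weights, k=n_particles)
    return particles


if __name__ == "__main__":
    chains = grow_loop((0.0, 0.0, 0.0), (10.0, 0.0, 0.0), n_beads=4)
    # Pick the chain whose last bead is closest to one bond length from the target.
    best = min(chains, key=lambda c: abs(math.dist(c[-1], (10.0, 0.0, 0.0)) - BOND))
    print(len(best), "beads in the selected chain (including the start anchor)")
```

In this sketch, external information such as an experimental distance restraint could enter as an extra factor in `step_weight`, which is one way to read the abstract's remark that the growth framework can incorporate experimental data or user knowledge. A simulated annealing stage and an all-atom potential, both mentioned in the abstract, are left out.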
Keywords: RNA loop, tertiary structure prediction, probabilistic sampling, sequential Monte Carlo, artificial neural network, scoring function