
Method For Predicting RNA Secondary Structure With Pseudoknot Based On GoogLeNet Model

Posted on: 2021-05-31 | Degree: Master | Type: Thesis
Country: China | Candidate: C Li | Full Text: PDF
GTID: 2370330620472175 | Subject: Computer technology
Abstract/Summary:
RNA participates in many biological processes, including the expression of genetic information, protein translation, and gene regulation. Because the structure of RNA is closely tied to its function, the function can be studied in depth only once the structure is known, which makes RNA secondary structure an important research target. RNA structures are obtained either by biological experiments or by computational prediction. Experiments are costly and time-consuming, so computational methods have become the main line of research. The principal approaches to secondary structure prediction are comparative sequence analysis, dynamic programming, and heuristic algorithms. These methods achieve good results to some extent, but they have shortcomings. In particular, RNA structures that contain pseudoknots are more complex and harder to predict, and prediction quality on them is often poor. A pseudoknot is a special RNA structural unit that also affects RNA function, so pseudoknot prediction has long been a difficult problem in RNA secondary structure research.

Conventional deep learning methods have achieved good results in predicting RNA secondary structure, but as the number of network layers grows, the parameter count increases and overfitting becomes a problem. The GoogLeNet model improves on the basic convolutional neural network in both depth and width, which lets it extract more feature information while improving computational efficiency. This thesis therefore combines the GoogLeNet model with the idea of dynamic programming to predict RNA secondary structure with pseudoknots. Existing real RNA data are first preprocessed. The GoogLeNet network then extracts features from a large amount of RNA sequence and structure data, and a prediction over those features gives the pairing probability of each base. Based on these per-base predictions and the definition of RNA secondary structure, a dynamic programming procedure selects the structure whose base pairings have the maximum probability sum. That structure is taken as the optimal RNA secondary structure.

The GoogLeNet model is first evaluated on 5S rRNA and tRNA data and compared with other common prediction algorithms; its prediction accuracy is about 16% higher than the best result among the other algorithms. The model is then evaluated on tmRNA data, where its result is about 9% higher than the best result among the other algorithms. Because pseudoknot structures are complex, prediction accuracy on the tmRNA data is lower, but the method lays a foundation for further study of RNA secondary structure. In addition, the performance of deep learning algorithms depends on the size of the data set, so the accuracy of deep learning methods for RNA secondary structure prediction can be expected to improve as the amount of available RNA data increases.
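The step of choosing the structure with the maximum probability sum can be illustrated with a small sketch. The abstract does not give the thesis's actual recurrence, so the code below is not the author's method: it is a Nussinov-style dynamic program that allows only nested (pseudoknot-free) pairs, and it assumes the network output has already been arranged into a matrix `prob` where `prob[i][j]` (with `i < j`) is the predicted probability that bases `i` and `j` pair. The function name `max_prob_structure` and the `min_loop` parameter are hypothetical.

```python
from typing import List, Tuple


def max_prob_structure(
    prob: List[List[float]], min_loop: int = 3
) -> Tuple[float, List[Tuple[int, int]]]:
    """Return (best score, base pairs) for the nested structure that
    maximizes the sum of pairing probabilities.

    Assumptions (not from the thesis): prob[i][j] is read only for i < j,
    and a pair (i, j) needs at least `min_loop` unpaired bases between
    its two ends. Crossing pairs (pseudoknots) are NOT considered.
    """
    n = len(prob)
    if n == 0:
        return 0.0, []

    # best[i][j] = maximum probability sum achievable on the segment i..j
    best = [[0.0] * n for _ in range(n)]

    for span in range(min_loop + 1, n):
        for i in range(n - span):
            j = i + span
            score = best[i][j - 1]  # case 1: base j stays unpaired
            for k in range(i, j - min_loop):  # case 2: base j pairs with k
                left = best[i][k - 1] if k > i else 0.0
                cand = left + prob[k][j] + best[k + 1][j - 1]
                if cand > score:
                    score = cand
            best[i][j] = score

    # Traceback: repeat the same case analysis to recover the chosen pairs.
    pairs: List[Tuple[int, int]] = []
    stack = [(0, n - 1)]
    while stack:
        i, j = stack.pop()
        if j - i <= min_loop:
            continue  # segment too short to contain any pair
        if best[i][j] == best[i][j - 1]:
            stack.append((i, j - 1))
            continue
        for k in range(i, j - min_loop):
            left = best[i][k - 1] if k > i else 0.0
            if best[i][j] == left + prob[k][j] + best[k + 1][j - 1]:
                pairs.append((k, j))
                if k > i:
                    stack.append((i, k - 1))
                stack.append((k + 1, j - 1))
                break

    return best[0][n - 1], sorted(pairs)
```

Calling `max_prob_structure(prob)` on an `n` by `n` matrix returns the best score together with the list of `(i, j)` pairs. Handling pseudoknots would require a recurrence that also admits crossing pairs, which this simplified sketch deliberately leaves out.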
Keywords: RNA secondary structure, pseudoknot, GoogLeNet model