| There are many kinds of RNA, most of which are not involved in protein synthesis but still perform biological functions in organisms. The study of RNA function has therefore become an important research topic in bioinformatics. The function of an RNA is determined by its structure, and RNAs with different functions fold into different secondary structures, so studying secondary structure supports the study of function. RNA structure is unstable yet conserved. Determining it directly through biological experiments is both expensive and inefficient, which makes RNA structure information difficult to obtain. Bioinformatics methods are therefore needed to predict secondary structure. Secondary structure prediction is carried out by aligning the RNA to be predicted with RNAs whose secondary structure is known. Current RNA structure alignment methods operate on a representation of the secondary structure, so the choice of representation affects alignment performance. In this context, this thesis studies RNA secondary structure alignment algorithms. (1) RNA secondary structure alignment based on digital sequence representation. Because traditional RNA structure alignment algorithms have shortcomings, such as easily losing secondary structure information, we propose a new RNA structure representation called the digital sequence representation. The secondary structure is converted into a digital sequence according to this definition, and on it we propose a new alignment algorithm named DSARna. An alignment matrix is constructed by dynamic programming, a binary path matrix is then constructed and used to find the traceback path, and the optimal alignment is obtained from that path. Specificity, sensitivity and the Matthews correlation coefficient (MCC) were used as evaluation criteria, and data from the PDB database were used as test data. Compared with the existing SimTree algorithm, the experimental results showed that this method has higher accuracy and ensures that structural information is not easily lost. (2) Structural alignment of long RNAs with highly complex structures. A new structural representation is proposed for the structural features of long-sequence RNA. Following the definition of long-sequence RNA, the complex structure is converted into an ordered tree model. This is convenient for computation, and it also represents the RNA structure completely, so that structural information is not easily lost. A heuristic algorithm is used to calculate the optimal alignment between multi-branch loops. Finally, the weights are adjusted using a neural network algorithm to predict the optimal alignment score and alignment structure. Sensitivity, specificity and MCC were used as evaluation criteria, and data from the PDB and NDB databases were used as test sets. Compared with the SimTree algorithm, the algorithm in this thesis achieved higher accuracy and is especially suitable for aligning long-sequence RNA structures with many structural units. |
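
The three evaluation criteria named in the abstract can be computed from confusion-matrix counts, as in the minimal sketch below. The abstract does not say how true and false positives are counted (per base pair, per aligned position, or otherwise), so the function name `evaluation_metrics` and its count arguments are hypothetical. The sketch also assumes the textbook definition of specificity, TN / (TN + FP); some RNA structure studies instead report TP / (TP + FP) under that name.

```python
import math


def evaluation_metrics(tp, tn, fp, fn):
    """Return (sensitivity, specificity, mcc) from confusion-matrix counts.

    Assumption: tp, tn, fp, fn are non-negative counts obtained by comparing
    a predicted alignment or structure with a reference; how they are counted
    is not specified in the abstract.
    """
    # Sensitivity: fraction of true elements that were recovered.
    sensitivity = tp / (tp + fn) if (tp + fn) else 0.0
    # Specificity (textbook form): fraction of negatives correctly rejected.
    specificity = tn / (tn + fp) if (tn + fp) else 0.0
    # Matthews correlation coefficient; defined as 0 when the denominator is 0.
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    return sensitivity, specificity, mcc
```

MCC ranges from -1 to 1 and uses all four counts, which makes it less sensitive to class imbalance than sensitivity or specificity taken alone.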