| Protein-RNA complexes are involved in many essential life processes, such as gene replication and expression, signal transduction, regulation, and metabolism. Many lethal cellular dysfunctions and diseases are caused by abnormal protein-RNA interactions. Studying the interaction between proteins and RNA is therefore of great significance for understanding the functions of RNA-binding proteins and for designing new drugs. Many experimental methods, such as X-ray crystallography and nuclear magnetic resonance, have been developed to study the structure of protein-RNA complexes and to determine RNA binding sites on proteins. However, experimental methods are usually time-consuming and expensive, so accurate and efficient computational methods for predicting these binding sites are needed. As neural network technology has developed, it has gradually been applied to many fields of bioinformatics. In recent years, with the increasing demand for analyzing graph-structured data in biological problems, many studies have applied graph neural networks to the prediction of protein functional sites. This paper applies the graph attention network and the graph convolutional neural network to the prediction of RNA binding sites on proteins, fusing the two models to extract the structural information of proteins and predict their RNA binding sites. The test set used in this study is drawn from the original structures of proteins that have not bound to RNA, which is more realistic than the test sets used in previous similar studies and therefore has more reference value. This paper proposes a new deep learning framework, called GAT-GCNII, for predicting RNA binding sites on proteins. The algorithm constructs a graph based on the spatial relationships between amino acid residues and builds its feature matrix by extracting sequence and structural information from the corresponding protein. It combines the graph attention network and the graph convolutional neural network, according to a certain weight, into a single module, referred to in this paper as the GAT-GCNII module. The module aggregates feature information around each node while using the attention mechanism to strengthen critical information, and it effectively alleviates the over-smoothing problem. To verify the advantages of the GAT-GCNII module, this paper compares its performance on the same dataset with models that use only the graph attention network or only the graph convolutional neural network as a module. The experimental results show that the GAT-GCNII module has significant advantages on multiple evaluation metrics. Furthermore, this paper explores the influence of the hyperparameters of the GAT-GCNII module on predictive performance and selects the best-performing model based on the evaluation metrics. In a comparison with previous methods on the test set, the GAT-GCNII model surpasses previous prediction tools on all selected evaluation metrics. In addition, this paper evaluates the importance of each type of feature information used. Specifically, one feature is removed from the feature matrix at a time, the remaining features are used as the input for testing, and the impact of the removed feature on the evaluation metrics is analyzed. The results show that all of the features selected in this paper play an important role in the model's predictions. |
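
The graph construction step described above, in which edges follow the spatial relationships between amino acid residues, can be illustrated with a minimal sketch. It assumes one representative 3D coordinate per residue and connects two residues whenever their Euclidean distance is within a cutoff. The function name `build_contact_graph`, the 8.0 Å cutoff, and the toy coordinates are all hypothetical choices made for illustration; the abstract does not state the criterion actually used.

```python
import math

def build_contact_graph(coords, cutoff=8.0):
    """Return a symmetric 0/1 adjacency matrix for a list of residue coordinates.

    coords: list of (x, y, z) tuples, one representative point per residue.
    cutoff: distance threshold in angstroms (an assumed value, not taken from the paper).
    """
    n = len(coords)
    adj = [[0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            # Connect residues i and j when they lie within the cutoff distance.
            if math.dist(coords[i], coords[j]) <= cutoff:
                adj[i][j] = 1
                adj[j][i] = 1
    return adj

# Toy example: three residues close together and one far away.
coords = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (7.6, 0.0, 0.0), (30.0, 0.0, 0.0)]
adjacency = build_contact_graph(coords)
for row in adjacency:
    print(row)
```

In this toy input the first three residues end up mutually connected and the fourth is isolated. An adjacency matrix of this kind, together with the per-residue feature matrix, is the usual input to graph attention and graph convolutional layers.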