| Recent studies have shown that the interaction between proteins and ATP is closely related to many human diseases. Accurate identification of protein-ATP binding sites has therefore become an important basis for new drug design. To improve prediction accuracy, this paper proposes two classification frameworks for predicting protein-ATP binding sites by improving and combining machine learning methods such as deep convolutional neural networks and support vector machines. First, to reduce the negative effect of class imbalance during training, a prediction method based on a word-vector convolutional neural network is proposed. The position-specific scoring matrix, secondary structure, solvent-accessible surface area, amino acid encoding, and physicochemical property features of the residues in each protein sequence are extracted. The data are then cleaned with the Repeated Edited Nearest Neighbours method, and the sample imbalance is addressed by random down-sampling. The input data are encoded as word vectors. Finally, the improved word-vector deep convolutional neural network model is applied, and the performance of the proposed method is compared with that of related prediction methods. The experimental results show that the proposed method predicts protein-ATP binding sites more accurately. Second, to optimize the classifier and further improve the prediction accuracy for protein-ATP binding sites, a prediction method (Inception V3_SVM) based on a deep convolutional neural network and a support vector machine is proposed. Features are first extracted from the protein sequence to obtain a standard input tensor, and the deep convolutional neural network model (Inception V3) is used to expand the receptive field over the input data. The deeper features extracted by the convolutional neural network are then used to train a support vector machine (SVM) classifier, which produces the final prediction result. |
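
The data-cleaning and balancing steps named in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the majority-vote removal rule, the choice of `k = 3`, and the toy 2-D points standing in for residue feature vectors are all assumptions made for illustration. It uses only the Python standard library, with non-binding residues playing the role of the majority class.

```python
import random
from collections import Counter


def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length feature vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def edited_nearest_neighbours(X, y, majority_label, k=3):
    """One editing pass: drop majority-class samples that disagree with the
    majority vote of their k nearest neighbours (assumed removal rule)."""
    keep = []
    for i, (xi, yi) in enumerate(zip(X, y)):
        if yi != majority_label:
            keep.append(i)  # minority-class samples are never removed
            continue
        # Indices of the k nearest other samples.
        neighbours = sorted(
            (j for j in range(len(X)) if j != i),
            key=lambda j: sq_dist(xi, X[j]),
        )[:k]
        votes = Counter(y[j] for j in neighbours)
        if not votes or votes.most_common(1)[0][0] == yi:
            keep.append(i)
    return [X[i] for i in keep], [y[i] for i in keep]


def repeated_enn(X, y, majority_label, k=3, max_iter=100):
    """Repeat the editing pass until no sample is removed (or max_iter is hit)."""
    for _ in range(max_iter):
        X_new, y_new = edited_nearest_neighbours(X, y, majority_label, k)
        if len(y_new) == len(y):
            break
        X, y = X_new, y_new
    return X, y


def random_down_sample(X, y, majority_label, seed=0):
    """Randomly keep as many majority samples as there are minority samples."""
    rng = random.Random(seed)
    majority = [i for i, label in enumerate(y) if label == majority_label]
    minority = [i for i, label in enumerate(y) if label != majority_label]
    n_keep = min(len(majority), len(minority))
    kept = sorted(rng.sample(majority, n_keep) + minority)
    return [X[i] for i in kept], [y[i] for i in kept]


# Toy data: label 0 = non-binding (majority), label 1 = binding (minority).
# The point (10, 10) is a majority sample sitting inside the minority cluster.
X = [(0, 0), (0, 1), (1, 0), (1, 1), (0.5, 0.5), (10, 10),
     (10, 11), (11, 10), (11, 11)]
y = [0, 0, 0, 0, 0, 0, 1, 1, 1]

X_clean, y_clean = repeated_enn(X, y, majority_label=0)
X_bal, y_bal = random_down_sample(X_clean, y_clean, majority_label=0)
```

In this sketch the cleaning pass removes only majority-class samples that sit among minority-class neighbours, and down-sampling then equalizes the class counts. Library implementations of Repeated Edited Nearest Neighbours may use a stricter rule (for example, requiring all neighbours to agree), so results can differ from this simplified version.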