Drugs are one of the most effective means of fighting human disease, but drug discovery is a labor-intensive and resource-intensive undertaking characterized by long development cycles, high investment, and a high failure rate. According to statistics, developing a small-molecule drug costs $1.8 billion on average and takes 12 years. Identifying drug-target interactions is key to developing drugs with high specificity and low toxicity. However, limited by human and material resources, traditional biological experiments struggle to achieve large-scale, high-throughput screening. With the development of information technology, computer-aided prediction of drug-target interactions has become feasible and is now widely used. It guides the discovery and modification of lead compounds, making drug development less blind and faster. In addition, drug-target prediction can identify potential protein targets for existing drugs and thereby reveal new indications for them. It is therefore very important to develop efficient and accurate prediction algorithms.

The key to deep-learning-based drug-target interaction prediction lies in accurate and complete representation of, and feature extraction from, drugs and targets. To address the shortcomings of drug and protein representation in existing algorithms, this paper makes two main contributions.

(1) To characterize the structural information of drugs and proteins, this paper proposes a drug-target affinity prediction algorithm based on deep learning and multi-level information fusion. First, the drug is represented as a molecular graph, and a graph convolutional neural network is used to learn its topological structure. At the same time, the extended-connectivity fingerprint is extracted to supplement the drug structure information. Second, the protein sequence and K-mer features are input into the
convolutional neural network module and the fully connected layers, respectively, to learn the latent features of proteins. Finally, the features learned from these four channels are fused, and fully connected layers are used to predict the affinity. The method is evaluated against other models on two benchmark datasets. The results show that the proposed method has better learning ability, demonstrating that integrating multi-level information about drugs and proteins is effective.

(2) A new small-molecule secondary structure representation network is proposed, which improves on the above graph neural network method to characterize drugs more accurately. First, the network builds a new graph representation of drug molecules and uses a more accurate graph convolution to learn their topological structure. Second, convolutional neural networks are used to learn drug sequence features. The network thus integrates the topological and sequence information of the drug, which complement each other to achieve a better characterization. Finally, inspired by the attention mechanism, a spatial importance weight is computed in the convolution channel for the drug sequence and then transferred to the corresponding nodes of the molecular graph in the graph convolution channel, to guide the learning of the drug structure. Comparisons with other characterization methods and baseline models verify the validity of the proposed method.
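
As an illustration of the K-mer protein features mentioned in contribution (1), the following is a minimal sketch of one common way to compute them: counting overlapping length-k substrings of the amino-acid sequence and normalizing the counts into a fixed-length vector that fully connected layers could consume. The function name `kmer_features`, the choice of k, the 20-letter alphabet, and the frequency normalization are assumptions made for illustration; the abstract does not specify how the features are actually computed.

```python
from collections import Counter
from itertools import product

# Assumption: only the 20 standard amino acids form the vocabulary.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def kmer_features(sequence, k=3):
    """Return normalized k-mer frequencies over a fixed vocabulary.

    Hypothetical helper: the vector has 20**k entries, ordered
    lexicographically by k-mer over AMINO_ACIDS.
    """
    vocab = ["".join(p) for p in product(AMINO_ACIDS, repeat=k)]
    # Count every overlapping window of length k in the sequence.
    counts = Counter(sequence[i:i + k] for i in range(len(sequence) - k + 1))
    # Windows containing non-standard residues are ignored.
    total = sum(counts[kmer] for kmer in vocab)
    if total == 0:
        return [0.0] * len(vocab)
    return [counts[kmer] / total for kmer in vocab]


if __name__ == "__main__":
    features = kmer_features("MKTAYIAKQR", k=2)
    print(len(features))  # 400 entries for k = 2
```

A fixed vocabulary keeps the feature length independent of the protein length, which is what allows the vector to be passed directly to fully connected layers alongside the sequence channel.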