| Drug development is a time-consuming and costly process. To develop safe and effective new drugs, thousands of compounds must be tested repeatedly and taken through a large number of clinical trials. Development generally passes through a long series of regulated stages, including preclinical research, clinical research, the FDA drug review process, and post-marketing safety monitoring, which shows how long the process takes. On average, the drug development cycle is 10 to 15 years and costs about 800 million US dollars, and about 90% of new drugs fail to pass the first phase of clinical validation. High pollution and other problems arising from the research and development process pose further challenges and setbacks for new drug development. With the continuous development of big data technology and the growth of computing power, more and more research uses big data to study new uses for old drugs, namely drug relocation. Its aim is to use algorithms such as machine learning to accelerate individual steps (such as drug discovery and preclinical research), reduce costs, and enable better drug screening and prediction of molecular properties in the early stages of drug development, which not only saves substantial research and development costs but also significantly reduces the workload of later-stage experiments. This paper studies drug relocation techniques based on a multi-source heterogeneous drug information network. The specific research content includes:
(1) A deep-learning-based drug-disease prediction model, Graph DDF, is proposed. The model first preprocesses the multi-source heterogeneous drug information network with a random walk algorithm, integrates the node embedding representations of the multi-source drug information network through a Graph CNN model, and uses them as input to a collective variational autoencoder, which reconstructs the embedded representations of the drug nodes and the known drug-disease relationships to complete training. This approach avoids the human error introduced by traditional feature extraction. Using the initial features generated by the random walk algorithm as input removes the parameter burden caused by sparse initial vectors, while the Graph CNN model better integrates the node feature information and the complex nonlinear topology of the multi-source network. The low-dimensional network embedding obtained from training serves as an effective node representation and is combined with the collective variational autoencoder to make the final drug-disease prediction. Experimental results show that, across different positive-to-negative sample scenarios, Graph DDF achieves higher prediction performance than the comparison models, with AUC reaching 93.48% and AUPR reaching 94.82%. In addition, by averaging the prediction scores over ten experiments, the top 100 predicted drug-disease associations are provided, and the most promising of them are discussed with reference to DrugBank and the literature; some of these drug-disease relationships are already recorded in DrugBank.
(2) A drug-target prediction model based on graph neural networks, Graph DT, is proposed. The nodes of the drug information network are preprocessed with the random walk with restart algorithm. Once the feature representations of the preprocessed nodes are obtained, the homogeneous networks in the data and these feature representations are used as input to train a variety of graph neural network models, and the optimal drug-target prediction model is selected by minimizing the loss. AUC and AUPR are used as evaluation metrics, and ten-fold cross-validation comparisons are performed under both ten-times negative sampling and full negative sampling. Experiments show that, in predicting drug-target associations under the different scenarios, Graph DT outperforms the comparison models on both AUC and AUPR, reaching an AUC of 91.97% and an AUPR of 82.19%. In addition, to reduce experimental error, the drug-target association prediction scores were averaged over ten experiments, some of the most promising drug-target associations were selected for brief discussion and analysis, and the top 100 predicted drug-target relationships are provided; these relationships are more likely to be applicable to drug relocation. |
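
The random walk with restart preprocessing named in the abstract can be illustrated with a minimal sketch. This is not the implementation used for Graph DDF or Graph DT; the function name `random_walk_with_restart`, the restart probability of 0.5, the convergence tolerance, and the four-node toy network are all assumptions chosen for illustration. The sketch iterates p ← (1 − r)·W·p + r·e, where W is the column-normalised adjacency matrix and e is the one-hot vector of the seed node, until the visiting probabilities stop changing.

```python
def random_walk_with_restart(adj, seed, restart=0.5, tol=1e-10, max_iter=1000):
    """Return the steady-state visiting probabilities of a walk restarting at `seed`.

    `adj` is a square adjacency matrix given as a list of lists
    (hypothetical input format, chosen to keep the sketch dependency-free).
    """
    n = len(adj)
    # Column-normalise: trans[i][j] is the probability of stepping from node j to node i.
    col_sums = [sum(adj[i][j] for i in range(n)) for j in range(n)]
    trans = [
        [adj[i][j] / col_sums[j] if col_sums[j] else 0.0 for j in range(n)]
        for i in range(n)
    ]
    # One-hot restart vector for the seed node.
    e = [0.0] * n
    e[seed] = 1.0
    p = e[:]
    for _ in range(max_iter):
        nxt = [
            (1 - restart) * sum(trans[i][j] * p[j] for j in range(n)) + restart * e[i]
            for i in range(n)
        ]
        # Stop once the L1 change between successive iterations is negligible.
        if sum(abs(a - b) for a, b in zip(nxt, p)) < tol:
            return nxt
        p = nxt
    return p


# Toy network (assumption): a path 0 - 1 - 2 - 3, with the walk seeded at node 0.
adjacency = [
    [0, 1, 0, 0],
    [1, 0, 1, 0],
    [0, 1, 0, 1],
    [0, 0, 1, 0],
]
scores = random_walk_with_restart(adjacency, seed=0)
```

Each entry of `scores` can be read as the proximity of a node to the seed, so stacking the vectors for every seed node gives one dense feature row per node. This is one plausible way such features could replace sparse initial vectors; the actual feature construction is not specified in the abstract.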