| Drug research and development is laborious, time-consuming and expensive. Drug repositioning aims to reduce the time, cost and failure rate of drug development. Advances in technology now allow researchers to generate large amounts of omics data as needed. With this data foundation, drug repositioning has entered the era of big data: computational drug repositioning combines omics data with algorithmic models to predict drug-disease relationships. Continued research has also deepened our understanding of the mechanisms of drug action. Drugs act on diseases mainly through targets, genes and proteins. This data structure differs from traditional Euclidean data. Traditional neural networks handle Euclidean data well but are of limited use for node-based data of this kind. Graph neural networks are currently the most suitable framework for analyzing node data and can effectively address this problem. This dissertation therefore develops drug repositioning models based on graph neural networks. The main work is as follows: (1) Organizing omics data. Data sets of drug-disease, drug-target, disease-gene, protein-protein and protein-biological function interactions were collected from DrugBank, Gene Ontology, DisGeNET and other omics databases and preprocessed for subsequent experiments. Among these, biological function data are important in bioinformatics research and are used here for the first time in computational drug repositioning. (2) To address the problem of processing drug and disease features after the data have been structured as a graph, a heterogeneous network information fusion algorithm based on graph convolutional networks, HEGCN (Heterogeneous network information fusion algorithm based on graph convolution network), was proposed. Data on 
drug-disease, drug-target and disease-gene interactions were used as input. First, the drug-drug similarity matrix, the disease-disease similarity matrix and the drug-disease adjacency matrix were calculated, and a heterogeneous network between drugs and diseases was built from them. The heterogeneous network was then fed into a weighted graph convolutional network for feature extraction, and the drug-disease interaction adjacency matrix was reconstructed using a linear decoder. HEGCN thus relies mainly on the weighted graph convolutional network to handle feature processing for graph-structured drug and disease data. Compared with current advanced algorithms, HEGCN generally improves the AUPR evaluation metric by more than 10%, showing that it extracts features from graph-structured drug and disease data effectively. (3) To address multi-information fusion and the automatic extraction of important information from drug and disease feature data, a bipartite graph multi-information fusion algorithm based on the graph attention mechanism, BGAT (Bipartite graph multi-information fusion algorithm based on graph attention mechanism), was proposed. Data on drug-disease, drug-target, disease-gene, protein-protein and protein-biological function interactions were used as input. First, the drug-disease bipartite graph, drug-target bipartite graph, disease-gene bipartite graph, protein-protein homogeneous graph and protein-biological function bipartite graph were built, and the bipartite graph information fusion module was then combined with a graph attention network (GAT) for feature extraction. Finally, an improved multi-layer perceptron, BMLP, was used to predict drug-disease links. Compared with advanced algorithms, the performance of BGAT improves by more than 10%, and in drug prediction experiments the BGAT model correctly identifies the drug megestrol acetate for breast cancer. This shows that the BGAT model can effectively solve the problem of 
multi-information fusion and automatic extraction of important information from drug and disease feature data, and has excellent predictive ability. In summary, after compiling a new data set, this dissertation proposed the HEGCN algorithm in light of the state of the field and the characteristics of the data. The reliability of HEGCN was verified through five-fold cross-validation. Building on HEGCN, the BGAT algorithm was proposed to address HEGCN's uninterpretable feature extraction and the difficulty of optimizing its prediction module. The reliability of BGAT was verified by five-fold cross-validation, and its practical usefulness was demonstrated by the successful prediction of megestrol acetate as a breast cancer drug. The results of this study can guide biological drug experiments and help speed up drug development. |
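
The HEGCN steps summarized in item (2) of the abstract (similarity matrices, heterogeneous network, weighted graph convolution, linear decoder) can be illustrated with a minimal NumPy sketch. This is not the dissertation's implementation. The abstract does not say how the similarities are computed or how the decoder is parameterized, so the cosine similarity over interaction profiles, the single untrained convolution layer, the bilinear form of the decoder, the toy matrices and all function names below are assumptions made purely for illustration.

```python
import numpy as np


def cosine_similarity(profiles):
    """Row-wise cosine similarity (assumed similarity measure, not from the source)."""
    profiles = np.asarray(profiles, dtype=float)
    norms = np.linalg.norm(profiles, axis=1, keepdims=True)
    unit = np.divide(profiles, norms, out=np.zeros_like(profiles), where=norms > 0)
    return unit @ unit.T


def build_heterogeneous_network(drug_sim, disease_sim, drug_disease):
    """Stack the three matrices into one block adjacency [[S_drug, A], [A^T, S_disease]]."""
    return np.block([[drug_sim, drug_disease], [drug_disease.T, disease_sim]])


def normalize(adj):
    """Symmetric normalization D^{-1/2} A D^{-1/2} of a weighted adjacency matrix."""
    deg = adj.sum(axis=1)
    inv_sqrt = np.zeros_like(deg)
    nonzero = deg > 0
    inv_sqrt[nonzero] = deg[nonzero] ** -0.5
    return adj * inv_sqrt[:, None] * inv_sqrt[None, :]


def gcn_layer(adj_norm, features, weight):
    """One graph convolution: propagate over weighted edges, project, apply ReLU."""
    return np.maximum(adj_norm @ features @ weight, 0.0)


def linear_decoder(embeddings, n_drugs, w_dec):
    """Score every drug-disease pair with a bilinear form (assumed decoder shape)."""
    drug_emb, disease_emb = embeddings[:n_drugs], embeddings[n_drugs:]
    return drug_emb @ w_dec @ disease_emb.T


# Toy data (hypothetical): 3 drugs x 4 targets, 2 diseases x 3 genes.
drug_target = np.array([[1, 0, 1, 0], [1, 1, 0, 0], [0, 0, 1, 1]], dtype=float)
disease_gene = np.array([[1, 1, 0], [0, 1, 1]], dtype=float)
drug_disease = np.array([[1, 0], [0, 1], [0, 0]], dtype=float)

rng = np.random.default_rng(0)
network = build_heterogeneous_network(
    cosine_similarity(drug_target), cosine_similarity(disease_gene), drug_disease
)
adj_norm = normalize(network)
# Each node's row in the network serves as its input feature vector; weights are random.
hidden = gcn_layer(adj_norm, network, rng.normal(size=(5, 4)))
scores = linear_decoder(hidden, n_drugs=3, w_dec=rng.normal(size=(4, 4)))
print(scores.shape)  # (3, 2): one score per drug-disease pair
```

In a trained model the two weight matrices would be fitted so that `scores` reconstructs the known drug-disease adjacency matrix, and high scores on unobserved pairs would be read as repositioning candidates. Here the weights are random, so the scores carry no predictive meaning.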