Drug combination synergistic therapy can enhance drug efficacy and prolong drug action time in complex diseases such as cancer, human immunodeficiency virus infection, and cardiovascular disease. However, the space of candidate drug combinations across the various types of cancer is enormous: screening is difficult, experiments are costly, and some combinations even produce antagonistic effects. Therefore, effectively identifying potential synergistic drug combinations for specific cancers can maximize synergistic benefit while reducing toxic side effects. In addition, during drug treatment the biochemical properties of a drug participate in the regulation of various physiological functions in vivo and affect its absorption, targeting, and cytotoxicity. Accurate prediction of drug-related biochemical properties will therefore improve treatment efficacy and reduce the risk of toxic side effects. This thesis applies graph neural network algorithms and self-supervised learning to predict drug combination synergy and drug properties. The main research contents of this thesis are as follows:

1. An attention-based graph neural network deep learning model (DeepDDS) is proposed to identify drug combinations that effectively inhibit the viability of specific cancer cells. First, a graph neural network and a multilayer perceptron are used to extract latent features from the molecular graph (topological structure) of each drug and from the mRNA expression data of landmark genes of the cancer cell line, respectively. The drug and cancer cell line features are then concatenated and used as input to a multilayer feed-forward neural network to predict drug combination synergy. Experiments with the leave-one-out method on benchmark datasets show that DeepDDS achieves better performance than classical machine learning methods and other deep learning-based methods. Moreover, on an independent test set released by the well-known pharmaceutical company AstraZeneca, DeepDDS outperforms competing methods by more than 16% in prediction accuracy. This study also explores the interpretability of the multi-head graph attention network, revealing important chemical substructures of drugs through correlation-matrix analysis of atomic features. Finally, the prediction model was applied to FDA-approved small-molecule drug combinations, and the combination with the highest prediction score was subjected to wet-lab experiments to verify the reliability of the model.

2. Owing to the limitations of biochemical experiments, labeled drug data are scarce and precious relative to the huge drug space, and are insufficient for direct model training. In contrast, the large number of unlabeled drugs contains rich information about potential drug properties. This study proposes a self-supervised drug property prediction model based on contrastive learning with adaptive augmentation (ARCL), which learns latent representations of drugs from large-scale drug datasets and applies them to various downstream tasks. First, an adaptive augmentation module based on the self-attention mechanism generates the new views required by the contrastive learning task: feeding an original sample view into the augmentation module automatically produces the corresponding augmented view. Then, after the views are embedded, a transformer encoder suited to sequence data extracts latent vectors. Finally, guided by a contrastive loss function, the view embedding and the encoder are trained by distinguishing positive and negative sample pairs, and are subsequently used for downstream tasks, where they are fine-tuned for optimal performance. Experimental results on downstream tasks show that the method achieves the best performance among current state-of-the-art unsupervised models on most benchmark datasets. This study also visualizes the clustering of the drug data in the downstream task datasets after embedding and encoding.
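The fusion step described in part 1 — concatenating the drug embeddings with the cell-line embedding and passing the result through a feed-forward predictor — can be sketched as follows. All dimensions, layer sizes, and function names here are illustrative assumptions for a minimal sketch, not the thesis's actual DeepDDS implementation:

```python
import numpy as np

rng = np.random.default_rng(0)

def mlp(x, weights):
    """Feed-forward network: ReLU hidden layers, sigmoid output."""
    for W, b in weights[:-1]:
        x = np.maximum(0.0, x @ W + b)          # ReLU hidden layer
    W, b = weights[-1]
    return 1.0 / (1.0 + np.exp(-(x @ W + b)))   # synergy probability

def init_weights(sizes):
    """Random weights for a stack of fully connected layers."""
    return [(rng.normal(0, 0.1, (m, n)), np.zeros(n))
            for m, n in zip(sizes[:-1], sizes[1:])]

# Hypothetical embedding sizes: two GNN drug embeddings (128-d each)
# and one MLP cell-line embedding (256-d) are concatenated.
drug_a = rng.normal(size=128)   # GNN embedding of drug A
drug_b = rng.normal(size=128)   # GNN embedding of drug B
cell   = rng.normal(size=256)   # MLP embedding of the cancer cell line

fused = np.concatenate([drug_a, drug_b, cell])   # 512-d fused input
predictor = init_weights([512, 256, 64, 1])
score = mlp(fused, predictor)
print(score.item())  # value in (0, 1): predicted synergy probability
```

The concatenation makes the predictor condition jointly on both drugs and the cellular context, which is what lets the same model rank a combination differently for different cell lines.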
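The contrastive objective described in part 2 — pulling the two views of the same drug together while pushing views of different drugs apart — is commonly realized as an NT-Xent-style loss. The sketch below illustrates that loss on made-up embeddings; the batch size, embedding dimension, and temperature are assumptions, not values from the thesis:

```python
import numpy as np

def nt_xent(z1, z2, tau=0.5):
    """NT-Xent contrastive loss for a batch of paired views.

    z1[i] and z2[i] are embeddings of two views of the same drug
    (the positive pair); all other rows act as negatives.
    """
    z = np.concatenate([z1, z2], axis=0)
    z = z / np.linalg.norm(z, axis=1, keepdims=True)   # cosine similarity
    sim = z @ z.T / tau
    n = len(z1)
    np.fill_diagonal(sim, -np.inf)                     # exclude self-pairs
    # Index of the positive partner for each row.
    pos = np.concatenate([np.arange(n, 2 * n), np.arange(n)])
    log_prob = sim[np.arange(2 * n), pos] - np.log(np.exp(sim).sum(axis=1))
    return -log_prob.mean()

rng = np.random.default_rng(1)
anchor = rng.normal(size=(8, 32))                   # 8 drugs, 32-d embeddings
view_a = anchor + 0.05 * rng.normal(size=(8, 32))   # two mildly augmented views
view_b = anchor + 0.05 * rng.normal(size=(8, 32))
loss_aligned = nt_xent(view_a, view_b)
loss_random = nt_xent(rng.normal(size=(8, 32)), rng.normal(size=(8, 32)))
print(loss_aligned < loss_random)  # aligned views should give a lower loss
```

Minimizing this loss is what trains the view embedding and encoder jointly: good augmentations keep positive pairs close in the embedding space without collapsing all drugs to one point.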