| In recent years, developing new computational methods to accelerate drug discovery and raise success rates has become an important research focus in drug development. Previous studies have widely applied deep learning to modeling and prediction across several core topics in drug development, achieving competitive performance. However, drug virtual screening currently lacks large-scale benchmark datasets that include molecular 3D conformation information, and most deep learning-based methods ignore the contribution of that information to molecular representation and bioactivity prediction. In drug repositioning, current deep learning-based methods do not consider the contribution of biological relationships to drug and disease representations or to drug-disease association prediction. Furthermore, these models lack interpretability, which limits their use in practical drug repositioning. Additionally, no integrated deep learning-based prototype system for drug virtual screening and repositioning has yet been designed and built. To address these issues from the perspectives of data resource integration, method innovation, and system design, this study constructed a series of benchmark datasets, models, and a prototype system for drug virtual screening and repositioning. The main research content is as follows: 1. Design of a drug virtual screening benchmark dataset and method. Molecular bioactivity data were collected and integrated from multiple public drug datasets to construct a large-scale drug virtual screening benchmark dataset containing over two thousand targets and nearly ten million samples. A drug virtual screening method based on an equivariant graph neural network was then proposed. This method learns high-order molecular representations from molecular 2D topology and 3D structure information, and uses a deep multiple instance learning method for multi-scale 
biological activity prediction. In addition, this study designed an uncertainty estimation module and two interpretability modules, enabling prediction confidence estimation, optimal conformation discovery, and core substructure discovery with this method. 2. Design of drug repositioning benchmark datasets and methods. Biological association information was collected from multiple public biomedical databases and integrated into drug repositioning benchmark datasets covering multiple biomedical entities and associations. Two drug repositioning methods based on heterogeneous graph neural networks were then proposed. The first method constructs a biomedical network by introducing multiple biological relationships, and uses methods such as topological subnetwork embedding, graph attention, and layer attention to learn representations of drugs and diseases, which are then used for drug-disease association prediction and drug repositioning. The second method combines meta-path-based deep multiple instance learning with heterogeneous graph neural networks to predict drug-disease associations. It designs pseudo meta-path generation modules, bidirectional instance encoding modules, and multi-scale interpretable predictors to achieve path-based interpretable drug repositioning. 3. Design of an integrated deep learning-based prototype system for drug virtual screening and repositioning, which supports target-based and disease-based drug screening and discovery and has potential for practical application. Overall, this study proposed a set of deep learning-based datasets, methods, and a prototype system for drug virtual screening and repositioning. From the data perspective, the results address the lack of molecular 3D information and biological relationship information, and support current modeling and method evaluation research in drug virtual screening and repositioning. From the methodological perspective, the study verifies the 
contribution of molecular 3D information, biological relationships, and drug-disease meta-paths to the construction of drug virtual screening and repositioning methods, and further improves the performance of current prediction models. From the application perspective, it enhances the accessibility, interpretability, and usability of current deep learning-based drug virtual screening and repositioning prediction tools. Taken together, the research provides a reference for data resource construction, method design, and practical application in drug virtual screening and repositioning, and offers multi-dimensional support for efficient drug development. |