| Drug concentration in a tissue is the main determinant of a drug's efficacy and side effects, and the transporters distributed on the cell membranes of each tissue play a very important role in how the drug enters that tissue. Transporters are also closely associated with cancer, metabolic diseases, and neurological diseases. However, the relationship between drugs and transporters is asymmetric (one-to-many and many-to-one), and large-scale in vitro transporter experiments are difficult to perform. Identifying the transporters of drugs and generating potential new small-molecule drugs based on transporters are therefore important and challenging scientific problems. With the development of biological big data, artificial intelligence methods have come to play an important role in biological research and drug development. Among them, the knowledge graph (KG) can integrate multiple data sources rather than considering only a single one, yet no knowledge graph has been reported for transporter research. In this study, data on 423 transporters and their related diseases and drugs were collected from databases such as VARIDT and preprocessed to construct a transporter-based knowledge graph containing 20,137 nodes and 527,888 edges in total. Five commonly used KG embedding methods were then applied to the knowledge graph, and the heterogeneous information generated by the best-performing embedding model was used in the following two studies. First, this study built a predictive model of drug transporters, which can be used to predict the transporter types of small drug molecules. Three machine learning models (Logistic Regression, SVM, Random Forest) and one deep learning model (AutoInt) were compared, and the best-performing model was adopted as the final predictive model. The results show that the AutoInt model performed best (mean AUC of 0.91 and mean PROC of 0.91 on balanced datasets; mean AUC of 0.94 and mean PROC of 0.78 on unbalanced datasets). To verify the reliability of the predictive model in practice, the transporter of the natural product luteolin was predicted, and the results showed that the model had good predictive ability. Second, because most transporters are membrane proteins, their structures are difficult to resolve, which makes it difficult to design small drug molecules based on transporter structures. This study therefore built generative models based on transporter sequences. To validate the performance of the generative model, new molecules were generated for three targeted transporters (UniProt IDs P11166, Q01650, and P08183). The analysis shows that the generative model built in this study can generate novel and valid small molecules using only transporter sequence features and KG embedding features, and that the generated small molecules can form important interactions with key amino acid sites in the transporter pocket. In summary, the transporter predictive model built in this study can be applied to predict the potential transporters of drugs, so that their pharmacokinetics after entering the human body can be understood in advance. The generative model based on transporter sequences can generate novel, effective small molecules that bind to important amino acid sites in the active pocket of transporters, which provides guidance for the development of transporter-related drugs. |
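
The model comparison above is reported as a mean AUC over datasets. As a rough illustration of that metric only, the sketch below computes ROC AUC as the fraction of positive/negative pairs ranked correctly and averages it over several evaluation splits. It is not the study's implementation: the function names `roc_auc` and `mean_auc`, the two-split layout, and the toy labels and scores are all hypothetical.

```python
from statistics import mean


def roc_auc(labels, scores):
    """ROC AUC as the probability that a random positive example is
    scored above a random negative one; tied scores count as 0.5."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def mean_auc(splits):
    """Average the per-split AUC over a list of (labels, scores) pairs."""
    return mean([roc_auc(labels, scores) for labels, scores in splits])


# Hypothetical toy data: 1 = drug is a substrate of the transporter, 0 = not.
splits = [
    ([1, 0, 1, 0], [0.9, 0.2, 0.7, 0.4]),  # every positive outranks every negative
    ([1, 0, 1, 0], [0.8, 0.6, 0.5, 0.3]),  # one of four pairs is misordered
]

if __name__ == "__main__":
    print(mean_auc(splits))
```

The pairwise form is quadratic in the number of examples, which is acceptable for a small illustration; a rank-based computation or an established library routine would be the usual choice at the scale of a real benchmark.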