De novo drug discovery is an expensive, time-consuming, and high-risk process. The emergence of computational drug research has given traditional drug discovery a tremendous boost, and interest in computational applications of pharmacogenomics is growing in particular. Pharmacogenomics is essential for revealing the latent relationships between drugs and their potential indications, pharmacokinetic properties, and mechanisms of action (MOA). In this paper, we use bioinformatics techniques and deep learning methods to predict the anti-cancer potential and toxicity of drugs from large-scale pharmacogenomics data. The main research contents and innovations are as follows:

(1) To address the problem of anti-cancer drug screening, we propose a connectivity score method, the Anti-Cancer Score of Reversal Potency (ASRP), which measures the anti-cancer reversal potency of a drug from gene expression profiles by combining WGCNA with CRISPR gene knockdown effects. Because regulation among downstream genes in tumor tissue is extensive and different genes contribute differently to apoptosis, ASRP screens highly correlated co-expressed gene clusters from tumor profiles, identifies their hub genes, and weights those genes according to whether they promote or suppress cell proliferation. The final anti-cancer score of reversal potency is then calculated from the treatment-control differential expression profile. By integrating five available pharmacogenomics databases with two cancer genomics databases, we obtained ASRPs for nearly 220,000 treatment-control groups covering 21,389 compounds. ASRP was validated against 6 drug sensitivity indicators and against evidence of cancer therapeutic use, and it outperforms 11 other connectivity score methods. Validation on widely studied compounds shows that ASRP can effectively indicate the anti-cancer potential of undiscovered compounds, complemented by MOA interpretation.

(2) For pharmacogenomics data derived from multi-dose, multi-time experimental conditions, we developed a deep learning model, Cross-RNN, to predict drug toxicity. Unlike a conventional recurrent neural network (RNN), which propagates memory across hidden-layer nodes along a one-dimensional sequence, Cross-RNN propagates memory simultaneously along two sequence dimensions, dose and time, using cross-structured hidden layers. Based on a publicly accessible toxicogenomics dataset, Cross-RNN uses gene expression profiles of mouse liver tissue to predict the severity of several drug-induced pathological findings. Comparative experiments show that Cross-RNN achieves better prediction performance than a range of machine learning and deep learning models, including one-dimensional RNNs and neural networks with attention mechanisms. In multi-dose, multi-time drug experiments, this adapted RNN model can exploit the inferential relationships among gene expression patterns across samples ordered along more than one sequence dimension.
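The weighting idea behind ASRP in (1) can be illustrated with a small sketch. The abstract does not give the ASRP formula, so everything below is an assumption made for illustration: the function name `reversal_score`, the signed-weight convention, the normalization, and the placeholder gene names and values. It only shows the general shape of a hub-gene-weighted reversal score computed from a treatment-control differential expression profile.

```python
def reversal_score(diff_expr, hub_weights):
    """Hypothetical weighted reversal score (not the published ASRP formula).

    diff_expr:   gene -> treatment-minus-control expression change
    hub_weights: gene -> signed weight (assumed convention: positive if the
                 gene promotes proliferation, negative if it suppresses it)

    The score is positive when the drug pushes expression against the
    weights, i.e. lowers proliferation-promoting genes and raises
    proliferation-suppressing ones.
    """
    shared = [gene for gene in hub_weights if gene in diff_expr]
    total_weight = sum(abs(hub_weights[gene]) for gene in shared)
    if total_weight == 0:
        return 0.0  # no usable hub genes in this profile
    weighted_change = sum(hub_weights[gene] * diff_expr[gene] for gene in shared)
    return -weighted_change / total_weight


# Placeholder hub genes and values, invented purely for illustration.
hub_weights = {"GENE_A": 1.0, "GENE_B": 0.5, "GENE_C": -0.8}
diff_expr = {"GENE_A": -2.0, "GENE_B": -1.0, "GENE_C": 1.5, "GENE_D": 0.1}

score = reversal_score(diff_expr, hub_weights)
print(f"reversal score: {score:.3f}")
```

Genes outside the hub set (`GENE_D` here) are ignored, which mirrors the idea of scoring only the weighted hub genes rather than the whole profile.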
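The two-dimensional memory propagation described in (2) can be sketched as a forward pass over a dose-by-time grid of expression profiles. This is a minimal sketch under stated assumptions, not the Cross-RNN architecture itself: the abstract gives no equations, so the plain tanh cell, the weight names (`w_in`, `w_dose`, `w_time`), the grid sizes, and the random untrained weights are all hypothetical. What it does show is each hidden state receiving memory from both its lower-dose neighbor and its earlier-time neighbor.

```python
import math
import random


def matvec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector (list of floats)."""
    return [sum(w * x for w, x in zip(row, vector)) for row in matrix]


def rand_matrix(rows, cols, rng, scale=0.1):
    """Small random weights; a real model would learn these by training."""
    return [[rng.uniform(-scale, scale) for _ in range(cols)] for _ in range(rows)]


def cross_rnn_forward(grid, w_in, w_dose, w_time, bias):
    """Hidden states for a grid indexed as grid[dose][time] -> expression vector.

    Assumed update rule:
    h[d][t] = tanh(w_in @ x[d][t] + w_dose @ h[d-1][t] + w_time @ h[d][t-1] + bias)
    """
    n_dose, n_time = len(grid), len(grid[0])
    zero = [0.0] * len(bias)
    h = [[None] * n_time for _ in range(n_dose)]
    for d in range(n_dose):
        for t in range(n_time):
            prev_dose = h[d - 1][t] if d > 0 else zero  # memory along dose
            prev_time = h[d][t - 1] if t > 0 else zero  # memory along time
            from_input = matvec(w_in, grid[d][t])
            from_dose = matvec(w_dose, prev_dose)
            from_time = matvec(w_time, prev_time)
            h[d][t] = [
                math.tanh(a + b + c + k)
                for a, b, c, k in zip(from_input, from_dose, from_time, bias)
            ]
    return h


# Toy sizes chosen for illustration only: 3 doses, 4 time points, 5 genes.
rng = random.Random(0)
n_dose, n_time, n_genes, n_hidden = 3, 4, 5, 2
grid = [
    [[rng.gauss(0.0, 1.0) for _ in range(n_genes)] for _ in range(n_time)]
    for _ in range(n_dose)
]
w_in = rand_matrix(n_hidden, n_genes, rng)
w_dose = rand_matrix(n_hidden, n_hidden, rng)
w_time = rand_matrix(n_hidden, n_hidden, rng)
bias = [0.0] * n_hidden

states = cross_rnn_forward(grid, w_in, w_dose, w_time, bias)
print(len(states), len(states[0]), len(states[0][0]))
```

In this sketch the state at the highest dose and latest time point depends on every sample at a lower-or-equal dose and earlier-or-equal time, which is the property a one-dimensional RNN over time alone cannot provide. A prediction head for pathological findings would sit on top of these states and is omitted here.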