Predicting Drug-Drug Interactions Based On Multiple Data Sources

Posted on: 2018-12-31
Degree: Master
Type: Thesis
Country: China
Candidate: X F Liu
Full Text: PDF
GTID: 2334330542487345
Subject: Engineering
Abstract/Summary:
A drug-drug interaction (DDI) is defined as any drug effect that is greater or less than expected in the presence of another drug. DDIs can be roughly divided into two categories according to their effect on the human body. One category, harmful interactions, can easily induce adverse drug reactions (ADRs), which is why this project focuses on DDI prediction. In the past, experienced doctors could readily avoid ADRs. However, with the development of the pharmaceutical industry, the number of medications is increasing quickly, and no one can be thoroughly familiar with all the DDIs among these drugs. To address this problem, our project presents a computational framework for large-scale prediction covering all kinds of DDIs.

First, we extracted data from several data sources and normalized their format. The data cover chemical structure, ATC code, GO, pathway, indication, target sequence, side effect, and enzyme.

Second, using the feature data from the first step, we computed the similarity between drugs with cosine similarity and Jaccard similarity. Some features required special handling because of their nature: SMILES data are converted into hash codes with a BFS algorithm, GO similarity is computed with the Resnik algorithm, and target sequence similarity is calculated with the Smith-Waterman algorithm.

Third, we chose a logistic regression (LR) model to build a pharmacokinetic (PK) prediction model and a pharmacodynamic (PD) prediction model. The positive training data come from DrugBank, yielding a data set of 290 thousand records. These data are then divided into PK interactions and PD interactions by a Naïve Bayesian model. The negative set consists of randomly generated drug pairs, excluding any pair that appears in the positive set. We also excluded pairs in which both drugs share the same ATC code, pairs lacking structure data or an ATC code, and pairs of drugs that are not used in clinical treatment. After training, we obtained high specificity and sensitivity (AUC > 0.9). The classifier also achieved a high F-measure (above 0.7). Based on these results, we conclude that the classifier performs well.

Fourth, we designed several experiments to verify the classifier's performance. These experiments showed that data bias does not notably affect performance, that no single feature alone yields a satisfactory classifier, and that removing any one feature does not notably decrease the AUC. A further experiment on data sources confirmed that multiple sources perform better than a single source.

Finally, we used the classifier to predict new DDIs and compared the results with the list of DDIs from the FAERS database, finding a 30-40% overlap. We also ran an experiment based on a hypothesis about PD, and the large value it produced indicated that the predictions are credible. In addition, we analyzed one predicted drug pair as a case study, which again confirmed that the classifier performs well.
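
The two similarity measures named in the abstract can be illustrated with a minimal sketch. It is not the thesis's implementation: the function names, the side-effect profiles, and the convention of returning 0.0 for empty or all-zero inputs are assumptions made here for illustration only.

```python
import math


def jaccard_similarity(a, b):
    """Jaccard similarity of two feature sets: |A & B| / |A | B|."""
    a, b = set(a), set(b)
    if not a and not b:
        return 0.0  # assumed convention for two empty profiles
    return len(a & b) / len(a | b)


def cosine_similarity(u, v):
    """Cosine similarity of two equal-length numeric vectors."""
    if len(u) != len(v):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(u, v))
    norm_u = math.sqrt(sum(x * x for x in u))
    norm_v = math.sqrt(sum(y * y for y in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0  # assumed convention for an all-zero vector
    return dot / (norm_u * norm_v)


# Hypothetical side-effect profiles for two drugs (not data from the thesis).
drug_a = {"nausea", "headache", "dizziness"}
drug_b = {"nausea", "dizziness", "rash", "fatigue"}

# The same profiles as binary vectors over a fixed, sorted vocabulary.
vocabulary = sorted(drug_a | drug_b)
vec_a = [1 if term in drug_a else 0 for term in vocabulary]
vec_b = [1 if term in drug_b else 0 for term in vocabulary]

print(jaccard_similarity(drug_a, drug_b))  # 2 shared / 5 total = 0.4
print(cosine_similarity(vec_a, vec_b))
```

Jaccard similarity suits set-valued features such as side effects or indications, while cosine similarity applies once a feature is encoded as a numeric vector. Which measure the thesis applies to which feature is not stated in the abstract.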
Keywords/Search Tags: Drug-Drug Interaction, Pattern Recognition, Logistic Regression, Drug Similarity