Data Mining In Drug Reposition

Posted on:2016-07-16

Degree:Doctor

Type:Dissertation

Country:China

Candidate:C H Wang

Full Text:PDF

GTID:1314330512486005

Subject:Computer software and theory

Abstract/Summary:

Computational drug repositioning follows two basic strategies. 1) For some diseases, the pathogenesis is understood and the key targets have been identified. Drug repositioning then amounts to finding an approved drug that activates or inhibits those targets, so the task reduces to predicting drug-target interactions. 2) For other diseases, both the pathogenesis and the key targets remain unclear. Drug repositioning then requires integrating a variety of information to improve the accuracy of drug-disease interaction prediction. In this dissertation, we follow both strategies and study drug-target interaction prediction and drug-disease interaction prediction as routes to drug repositioning.

Many computer scientists and biologists currently work on drug repositioning, but their focuses differ. Computer scientists are more concerned with prediction accuracy, time and space complexity, and the convergence and scalability of prediction algorithms. Their models therefore excel in training efficiency, prediction accuracy and cost, but they are hard for biologists to interpret and have seldom been adopted in experiments. Biologists, on the other hand, pursue simple, direct and biologically interpretable approaches to repositioning approved drugs. Their models are usually costly and time-consuming, but the results are straightforward and more readily adopted by biological researchers. This dissertation aims to reconcile these differences and combine the advantages of both, developing algorithms that are efficient to train, accurate and low-cost, yet also simple, direct and biologically interpretable. The main work is as follows:

Ligand/target feature extraction. Traditional drug-target interaction prediction methods rest on two assumptions: a target is likely to interact with compounds that are similar to its known ligands, and a ligand is likely to interact with proteins that are similar to its known targets. Under these assumptions it is unnecessary to describe targets and ligands directly; suitable similarity measures are enough to achieve good ligand-target interaction prediction performance. Although the similarity assumptions can yield good predictions, they leave the mechanisms of ligand-target interaction unexplained, which hinders adoption in biological experiments. In addition, a target is usually much larger than a ligand, and the ligand interacts with only a small portion of the target, so the local structure of a target may be more predictive. In this dissertation, we describe targets and ligands directly for better biological interpretability. We extract the binding site from a target and define a dictionary of target fragments, according to which the binding site is broken down into fragments. Likewise, we define a ligand dictionary, according to which a ligand is decomposed into fragments. Both the binding-site fragments and the ligand fragments are treated as features. To evaluate the extracted target and ligand features and to explore the relative importance of the global and local information of a target, we construct a simple classifier and monitor its prediction accuracy. Traditional classification algorithms cannot handle inputs that consist of two objects (a target and a ligand). To address this, we introduce the Kronecker (Kron) kernel into the support vector machine (SVM) so that it can accept such paired input. With the kernel trick, we combine the global and local information of a target and analyze their relative importance by varying the combination weights. The results show that the proposed feature extraction method is effective. The local information of the target dominates ligand-target interaction prediction, and the global information is negligible.

Multi-field interaction model for ligand-target interaction prediction. Inspired by the way matter interacts in physics, we propose a multi-field interaction model to capture the mechanism of fragment interaction. We suppose that molecular fragments are surrounded by a variety of fields, such as electrostatic and hydrophobic fields, and that interactions between molecular fragments are mediated by these fields. We first learn two mapping matrices whose elements represent the field intensities of molecular fragments. Using the mapping matrices, the molecular fragments are mapped into the field space. Fields of the same kind interact with each other to form a pairwise field space, in which a classifier is constructed to predict ligand-target interactions. To learn the mapping matrices, we introduce a new sparse regularization for singular value decomposition. The algorithm makes it easy to control the sparseness of the singular vectors and runs 8-10 times faster than traditional algorithms. Moreover, from the multi-field interaction model we can derive a fragment interaction matrix whose elements represent the interaction intensity between molecular fragments. After analyzing the stable and significant values in the fragment interaction matrix, we find that the corresponding molecular fragments may interact chemically; that is, the molecular fragment interactions are chemically interpretable. Hence, the fragment interaction matrix can be used to optimize lead compounds and to study drug-target interaction mechanisms.

Drug-disease interaction model. After collecting structural, genetic, disease phenotype and drug-disease interaction information, we first organize it into three layers and calculate similarity scores in each layer. The similarity scores of the layers are then combined to generate comprehensive similarity matrices for both drugs and diseases. Next, we construct a network whose vertices are drug-disease pairs and apply the Learning with Local and Global Consistency (LLGC) algorithm to predict drug-disease interactions. To make the model more interpretable, we calculate a prior for each unknown drug-disease pair and modify the traditional LLGC algorithm so that it can use the prior in prediction. In addition, because the traditional LLGC algorithm does not scale to large problems, we provide a low-memory computation scheme to address this. The results show that the modified LLGC algorithm predicts drug-disease interactions well and that introducing the prior makes the model more understandable.
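
As an illustration of the paired-input kernel described in the feature-extraction part, the following is a minimal sketch of a Kronecker product kernel on (target, ligand) pairs, in which the target kernel is a weighted sum of a local (binding-site) kernel and a global kernel. The fragment vectors, the linear base kernel, the dictionary keys "local" and "global", and all function names are assumptions made for this example; the abstract does not specify the dissertation's actual feature encoding or kernel choices.

```python
def linear_kernel(a, b):
    """Dot product of two equal-length feature vectors."""
    return sum(x * y for x, y in zip(a, b))


def target_kernel(t1, t2, weight_local):
    """Weighted sum of a local (binding-site) and a global target kernel.

    The split into "local" and "global" vectors is a hypothetical encoding.
    """
    k_local = linear_kernel(t1["local"], t2["local"])
    k_global = linear_kernel(t1["global"], t2["global"])
    return weight_local * k_local + (1.0 - weight_local) * k_global


def pair_kernel(pair1, pair2, weight_local=0.8):
    """Kronecker product kernel: K((t, l), (t', l')) = K_t(t, t') * K_l(l, l')."""
    (t1, l1), (t2, l2) = pair1, pair2
    return target_kernel(t1, t2, weight_local) * linear_kernel(l1, l2)


def gram_matrix(pairs, weight_local=0.8):
    """Gram matrix over a list of (target, ligand) pairs."""
    return [[pair_kernel(p, q, weight_local) for q in pairs] for p in pairs]


# Toy binary fragment-occurrence vectors (invented for illustration only).
target_a = {"local": [1, 0, 1], "global": [1, 1]}
target_b = {"local": [0, 1, 1], "global": [1, 0]}
ligand_x = [1, 1, 0]
ligand_y = [0, 1, 1]

pairs = [(target_a, ligand_x), (target_a, ligand_y), (target_b, ligand_x)]
gram = gram_matrix(pairs, weight_local=1.0)  # local information only
```

A Gram matrix of this form could be passed to any SVM implementation that accepts a precomputed kernel, and varying weight_local between 0 and 1 is one way to compare the contribution of local and global target information.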
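
The label-propagation step in the drug-disease model can be sketched with the standard iterative form of LLGC, F <- alpha * S * F + (1 - alpha) * Y, where S is the symmetrically normalized affinity matrix. This is only a sketch under stated assumptions: placing the prior of an unknown drug-disease pair directly in the initial label vector Y is an assumption about how a prior might be used, not the dissertation's documented modification, and the low-memory scheme mentioned in the abstract is not reproduced here. The toy graph and all names are hypothetical.

```python
import math


def normalize_affinity(W):
    """Return S = D^(-1/2) W D^(-1/2) for a symmetric affinity matrix W."""
    n = len(W)
    degree = [sum(row) for row in W]
    S = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if degree[i] > 0 and degree[j] > 0:
                S[i][j] = W[i][j] / math.sqrt(degree[i] * degree[j])
    return S


def llgc_scores(W, y0, alpha=0.5, iterations=200):
    """Iterate F <- alpha * S * F + (1 - alpha) * Y starting from F = Y.

    y0 holds 1.0 for known drug-disease pairs and, as an assumption of this
    sketch, a prior value in [0, 1) for unknown pairs.
    """
    n = len(W)
    S = normalize_affinity(W)
    f = list(y0)
    for _ in range(iterations):
        f = [
            alpha * sum(S[i][j] * f[j] for j in range(n)) + (1 - alpha) * y0[i]
            for i in range(n)
        ]
    return f


# Toy network: four drug-disease pairs connected in a chain (zero diagonal).
W = [
    [0, 1, 0, 0],
    [1, 0, 1, 0],
    [0, 1, 0, 1],
    [0, 0, 1, 0],
]

known_only = llgc_scores(W, [1.0, 0.0, 0.0, 0.0])   # one known pair, no priors
with_prior = llgc_scores(W, [1.0, 0.0, 0.0, 0.3])   # prior of 0.3 on the last pair
```

In this toy setting the score decays with distance from the known pair, and supplying a prior for an unknown pair raises its final score, which makes the origin of each prediction easier to trace.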

Keywords/Search Tags:

Drug Reposition, Target-Ligand Interaction, Drug-Disease Interactions, Pairwise SVM, Sparse Matrix Decomposition, LLGC Algorithm, Data Mining

Related items

1	Study On Predictions Of Drug-target Interactions And Drug Combinations
2	The Construction Of Disease Protein-ligand Database And The Prediction Of Drug-target Interactions
3	Research On Drug Mining Method Based On Integrated Matrix Factorization Algorithm
4	Prediction Of Drug-Target Interactions Based On Matrix Completion
5	Machine Learning Approaches For Drug-Target Interaction Prediction
6	Research On Data Mining And Prediction Of Drug Interactions
7	Data Mining Method Research For Gene Screening Of Breast Cancer And Drug Repositioning
8	Research On Prediction Method Of Drug-Target Interaction Based On Network Model
9	Predicting Drug-Drug Interactions Based On Multiple Data Sources
10	Prediction Of Drug-Target Interactions Via Feature Selection