
Research On Drug-Disease Knowledge Discovery Based On Network Link Prediction And Path Discovery

Posted on: 2024-07-25
Degree: Master
Type: Thesis
Country: China
Candidate: J Sun
Full Text: PDF
GTID: 2544307088984179
Subject: Information Science
Abstract/Summary:
Objective: To construct networks from the PubMed literature database, apply link prediction and path discovery separately to achieve knowledge discovery, and conduct an empirical study in the field of colorectal cancer to verify the feasibility of using these networks for drug-disease knowledge discovery.

Methods: (1) PubTator Central was used to annotate the titles and abstracts of the literature, retaining three entity types (drug, disease, and gene), and a drug-disease-gene co-occurrence network was constructed. A g-index-based threshold was used to screen high-frequency word pairs. Cytoscape was used to visualize the network and to analyze its structure and composition. Links were then predicted using weighted local similarity indices such as Weighted Adamic-Adar (WAA), Weighted Common Neighbors (WCN), Weighted Resource Allocation (WRA), and Weighted Preferential Attachment (WPA). (2) The semantic extraction tool SemRep was used to obtain Subject-Predicate-Object (SPO) triples. The results were filtered using the 127 semantic types and 54 semantic relations of the UMLS Semantic Network, retaining 14 semantic types (such as disease, gene/protein, chemical drug, and cell function) and 14 semantic relations (such as TREATS, PREVENTS, ISA, LOCATION_OF, and PROCESS_OF), and an SPO semantic network was constructed. Cytoscape was used to visualize this network and to analyze its structure and composition. The semantic types and semantic relations in the SPO semantic network were analyzed to summarize potential semantic patterns in the biomedical literature. Based on the canonical semantic model, paths that start at drug nodes and end at disease nodes were searched for intermediate targets (genes/proteins) that link a drug to a disease, and a 2-hop path discovery method was used for knowledge discovery. The drug-disease knowledge discovery model was then validated using the colorectal cancer domain as an example: the semantic network and the prediction network were stored in Cytoscape; the mechanisms of action of the drugs were analyzed; and the intermediate nodes (i.e., targets, comprising genes or proteins) found during path inference were summarized. The drug-disease associations predicted by the two methods were checked against knowledge bases such as DrugBank and PharmGKB to verify the accuracy of the prediction network.

Results: First, 47 new associations were predicted using the WAA, WCN, WRA, and WPA weighted link predictors, yielding 12 new drug-disease associations. Analysis of the mechanisms and pharmacological effects of the drugs indicated that cholate, NSC 659687, podophyllotoxin, and toremifene can be considered candidates for the treatment of colorectal cancer. Second, five tumor-related semantic patterns were summarized: tumor treatment, related diseases, disease characteristics, pharmacological effects, and influencing factors. Five semantic relations were identified in the semantic network, namely the positive relations AUGMENTS, CAUSES, and STIMULATES and the negative relations INHIBITS and DISRUPTS, and seven 2-hop drug-disease path discovery rules were proposed. For example: IF Drug X1-INHIBITS-Gene Y1 and Gene Y1-AUGMENTS-Disease Z1 THEN Drug X1-"May TREATS"-Disease Z1; IF Drug X2-DISRUPTS-Gene Y2 and Gene Y2-CAUSES-Disease Z2 THEN Drug X2-"May TREATS"-Disease Z2. A total of 1,436 new drug-disease associations were predicted. Validated against existing knowledge bases, the accuracy of the predicted "drug can treat disease" associations reached 64.8%. Summarizing the intermediate nodes (targets, comprising genes or proteins) found during path inference further supported the credibility of the drug-disease prediction network. The predicted therapeutic candidates for colorectal cancer include quercetin, oxaliplatin, calcium, vitamin D, genistein, pyruvate, tazarotene, sulforaphane, etc.

Conclusion: This study achieves drug-disease knowledge discovery based on network link prediction and path discovery methods. By studying the weights between entities in the network and the semantic relationships among multiple entities, potential semantic patterns in the field of cancer are summarized. New drug-disease associations are predicted using weighted local similarity link prediction algorithms and the path discovery method. In the future, multi-source data should be integrated to build networks from multiple perspectives, improve network-based knowledge discovery methods, and realize knowledge discovery in multiple fields of biomedicine.
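The two methods summarized in the abstract can be illustrated with a minimal Python sketch. It is not the thesis's implementation. The abstract does not give formulas for the weighted indices, so the sketch assumes one common formulation in which each common neighbor z contributes w(x,z) + w(z,y), divided by log(1 + s(z)) for WAA and by s(z) for WRA, where s is node strength (the sum of incident edge weights), and WPA is s(x) * s(y). Only the two 2-hop rules quoted in the abstract are encoded; the other five are not stated and are left out. All node names (drug_a, gene_g, disease_d, and so on), edge weights, and function names are hypothetical toy values.

```python
import math
from collections import defaultdict


def build_graph(weighted_edges):
    """Undirected weighted co-occurrence graph stored as a dict of dicts."""
    graph = defaultdict(dict)
    for u, v, w in weighted_edges:
        graph[u][v] = w
        graph[v][u] = w
    return graph


def strength(graph, node):
    """Node strength: sum of the weights of all edges incident to the node."""
    return sum(graph[node].values())


def weighted_scores(graph, x, y):
    """Weighted local similarity scores for a candidate link (x, y).

    Assumed formulation (not taken from the thesis); weights must be positive.
    """
    common = set(graph[x]) & set(graph[y])
    wcn = waa = wra = 0.0
    for z in common:
        pair_weight = graph[x][z] + graph[z][y]
        s_z = strength(graph, z)
        wcn += pair_weight
        waa += pair_weight / math.log(1 + s_z)
        wra += pair_weight / s_z
    wpa = strength(graph, x) * strength(graph, y)
    return {"WCN": wcn, "WAA": waa, "WRA": wra, "WPA": wpa}


# The two example rules quoted in the abstract: (drug->gene, gene->disease).
TWO_HOP_RULES = {("INHIBITS", "AUGMENTS"), ("DISRUPTS", "CAUSES")}


def may_treat(triples, drugs, genes, diseases):
    """Return (drug, gene, disease) paths that match a 2-hop rule."""
    by_subject = defaultdict(list)
    for subj, pred, obj in triples:
        by_subject[subj].append((pred, obj))
    found = set()
    for drug in drugs:
        for pred1, gene in by_subject.get(drug, []):
            if gene not in genes:
                continue
            for pred2, disease in by_subject.get(gene, []):
                if disease in diseases and (pred1, pred2) in TWO_HOP_RULES:
                    found.add((drug, gene, disease))
    return found


# Hypothetical toy co-occurrence network (weights = co-occurrence counts).
toy_edges = [
    ("drug_a", "gene_g", 3),
    ("gene_g", "disease_d", 2),
    ("drug_a", "gene_h", 1),
    ("gene_h", "disease_d", 1),
]
toy_graph = build_graph(toy_edges)
toy_scores = weighted_scores(toy_graph, "drug_a", "disease_d")

# Hypothetical toy SPO triples in (subject, predicate, object) form.
toy_triples = [
    ("drug_a", "INHIBITS", "gene_g"),
    ("gene_g", "AUGMENTS", "disease_d"),
    ("drug_b", "DISRUPTS", "gene_h"),
    ("gene_h", "CAUSES", "disease_d"),
    ("drug_c", "STIMULATES", "gene_g"),  # matches no encoded rule
]
toy_paths = may_treat(
    toy_triples,
    drugs={"drug_a", "drug_b", "drug_c"},
    genes={"gene_g", "gene_h"},
    diseases={"disease_d"},
)

print(toy_scores)
print(sorted(toy_paths))
```

In this sketch the link prediction scores rank unconnected drug-disease pairs by how strongly they share neighbors, while the path search keeps the intermediate gene in each result so that the proposed target can be inspected, in the spirit of the target summary described in the abstract.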
Keywords/Search Tags: Drug-Disease Knowledge Discovery, Path Discovery, Link Prediction, Network Analysis