| Drugs are among the most extensively studied biomedical entities, and a large number of them are used in clinical treatment. Drugs can cure diseases, but they can also cause adverse reactions that harm the patient and, in severe cases, induce other diseases. It is therefore important to study the relationships between drugs and phenotypes such as symptoms and diseases. Today, the biomedical literature is the most up-to-date and comprehensive source of drug knowledge. However, obtaining clinically useful drug knowledge from this massive literature, and putting it to use, still faces the following challenges: (1) Biomedical literature is recorded as unstructured text, and manually processing it and extracting the relevant knowledge is time-consuming and laborious; (2) Existing research on drug information extraction focuses mainly on drug-drug interactions, and few studies address the relationships between drugs and phenotypes; (3) How to apply the extracted data to assist clinical practice is also a problem that urgently needs to be solved. To address these challenges, this paper covers the following three aspects: (1) Joint extraction of drug entities and relations based on deep learning. Starting from the Semmed database, this paper first selects the entities and relations related to drugs and phenotypes to construct a standard data set for drug-phenotype entity and relation extraction, then manually reviews and corrects the annotation problems in the data set, and finally obtains a drug-phenotype relation extraction corpus containing 21,751 relation instances. On this data set and on the NYT, DDI, and CPI data sets, a joint relation extraction model based on BioBERT+BiLSTM and a pipeline relation extraction model based on BioBERT+BiLSTM are each evaluated. The joint extraction model proposed in this paper achieves F1 scores of 73.80% on Semmed, 75.35% on NYT, 69.62% on DDI, and 37.23% on CPI, and it also extracts entity categories, which addresses the shortcoming that some joint learning methods cannot extract entity categories. By contrast, the relation extraction F1 score of the pipeline model is lower than that of the joint extraction model, which shows the effectiveness of the decomposition strategy. In addition, in the experimental results of the pipeline model, recall is generally higher than precision, which is consistent with the redundant entity problem in pipeline learning. (2) Construction of a drug-phenotype knowledge graph. Based on the drug-phenotype relation data extracted from Semmed, this paper combines openFDA adverse reaction data and DrugBank drug data to construct a knowledge graph of drugs and phenotypes. To handle the synonym problem across data sources, in which different terms refer to the same entity, a dictionary-based method is used to align entities. The knowledge graph contains 229,608 entities and 3,756,234 relations before alignment, and 185,584 entities and 3,421,286 relations after alignment. (3) A question answering application based on the drug-phenotype knowledge graph. This paper builds a template-based intelligent question answering system on top of the drug-phenotype knowledge graph. The system relies entirely on rules to convert user questions into Cypher graph queries and supports 21 question types in total. This template-based method is highly interpretable, easy to implement, and requires no labeled training data, which makes it suitable for building question answering systems over domain knowledge bases. |
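
The rule-based conversion of user questions into Cypher queries described in (3) can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration and is not taken from the paper: the three question patterns, the node labels `Drug` and `Phenotype`, the relation types `CAUSES` and `TREATS`, the `name` property, the lowercasing of entity names, and the function name `question_to_cypher`. The actual system supports 21 question types, and its templates are not given in this abstract.

```python
import re
from typing import Dict, Optional, Tuple

# Hypothetical templates: each pairs a question pattern with a parameterized
# Cypher query. Labels, relation types, and properties are illustrative
# assumptions, not the schema used in the paper.
TEMPLATES = [
    (
        re.compile(
            r"^what (?:are|is) the (?:side effects?|adverse reactions?) of (?P<drug>.+?)\??$",
            re.IGNORECASE,
        ),
        "MATCH (d:Drug {name: $drug})-[:CAUSES]->(p:Phenotype) RETURN p.name",
    ),
    (
        re.compile(
            r"^which drugs? (?:can )?treats? (?P<phenotype>.+?)\??$",
            re.IGNORECASE,
        ),
        "MATCH (d:Drug)-[:TREATS]->(p:Phenotype {name: $phenotype}) RETURN d.name",
    ),
    (
        re.compile(
            r"^does (?P<drug>.+?) cause (?P<phenotype>.+?)\??$",
            re.IGNORECASE,
        ),
        "MATCH (d:Drug {name: $drug})-[r:CAUSES]->(p:Phenotype {name: $phenotype}) "
        "RETURN count(r) > 0 AS answer",
    ),
]


def question_to_cypher(question: str) -> Optional[Tuple[str, Dict[str, str]]]:
    """Return (cypher, params) for the first matching template, else None."""
    text = question.strip()
    for pattern, cypher in TEMPLATES:
        match = pattern.match(text)
        if match:
            # Assumption: entity names are stored lowercased in the graph.
            params = {key: value.strip().lower() for key, value in match.groupdict().items()}
            return cypher, params
    return None


if __name__ == "__main__":
    print(question_to_cypher("What are the side effects of Aspirin?"))
    print(question_to_cypher("Which drugs treat headache?"))
```

Returning the query together with a parameter dictionary, rather than interpolating entity names into the query string, keeps user input out of the Cypher text. A question that matches no template yields `None`, which reflects the limited coverage that comes with the interpretability of a purely rule-based approach.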