Construction Of Gastric Cancer Knowledge Graph And Drug Discovery Application

Posted on:2024-03-03

Degree:Master

Type:Thesis

Country:China

Candidate:Y W Lu

GTID:2544306941963679

Subject:Computer technology

Abstract/Summary:

Gastric cancer has high incidence and mortality rates, a problem that is particularly prominent in China. To make full use of the expanding medical literature, this thesis systematically investigates how to automatically extract biomedical knowledge from literature and related databases and construct a gastric cancer knowledge graph, named GCKG, which can be used for drug discovery applications. The main contents include:

(1) According to the characteristics of biomedical texts, models for entity recognition, entity normalization, and relation classification were designed, and GCKG was constructed on that basis. Because medical entity names are long and irregular, the entity recognition model applies a Bidirectional Gated Recurrent Unit and an Interactive Pointer Network and focuses on identifying entity boundaries, which improves recognition accuracy; its average F1 value on 8 entity recognition datasets reached 84.5%. For the entity normalization task, a model based on Term Frequency-Inverse Document Frequency and a Gated Attention Unit is proposed. It combines the semantic features and characteristic features of entities, and its average Hits@1 on 5 entity normalization datasets reached 95%. To handle the complexity of medical knowledge expressed in text, the relation classification model integrates cross-text features, entity features, and context features to predict semantic relationships more accurately. Its average F1 value on 11 relation classification datasets is 86.9%. In addition, these models use a multi-task learning method based on hard parameter sharing, which effectively improves model performance and computation speed. The final GCKG defines 5 entity types and 5 relationship categories and contains 9129 entities and 88482 triples.

(2) Drug discovery was studied based on the constructed GCKG. First, a biomedical knowledge embedding pre-trained language model called BioKGE-BERT was constructed to transform the knowledge graph into knowledge embedding vectors. Then a drug-disease discriminant model based on CNN-BiLSTM was built, which uses the knowledge embedding vectors to predict whether a drug could treat gastric cancer. The final results show that 9 of the top 10 predicted drugs have been reported to be useful in the treatment of gastric cancer, which supports the medical value of GCKG.

(3) An online platform for GCKG was developed to assist research on disease mechanisms. The platform consists of a subsystem for biomedical knowledge extraction and a subsystem for gastric cancer knowledge graph retrieval. The former is based on the constructed biomedical knowledge extraction models and provides general entity recognition, entity normalization, and relation extraction functions. The latter is used to retrieve specific knowledge from GCKG and to visualize the results.
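
To make the entity normalization metric concrete, the following is a minimal sketch of how Hits@1 could be computed over a TF-IDF ranking of dictionary names. It is not the thesis model: it covers only a character n-gram TF-IDF lookup and omits the Gated Attention Unit entirely. The toy dictionary, the mentions, the trigram setting, and all function names (`char_ngrams`, `build_tfidf`, `rank`, `hits_at_k`) are assumptions made for illustration.

```python
import math
from collections import Counter


def char_ngrams(text, n=3):
    """Return character n-grams of a lowercased, space-padded string."""
    padded = f" {text.lower()} "
    return [padded[i:i + n] for i in range(len(padded) - n + 1)]


def normalise(vec):
    """Scale a sparse vector to unit L2 length (left unchanged if empty)."""
    norm = math.sqrt(sum(w * w for w in vec.values()))
    return {g: w / norm for g, w in vec.items()} if norm else vec


def build_tfidf(names, n=3):
    """Compute smoothed IDF weights and unit TF-IDF vectors for dictionary names."""
    docs = [Counter(char_ngrams(name, n)) for name in names]
    df = Counter(g for doc in docs for g in doc)  # document frequency per n-gram
    idf = {g: math.log((1 + len(docs)) / (1 + c)) + 1.0 for g, c in df.items()}
    vectors = [normalise({g: tf * idf[g] for g, tf in doc.items()}) for doc in docs]
    return idf, vectors


def rank(mention, names, idf, vectors, n=3):
    """Return dictionary names sorted by cosine similarity to the mention."""
    counts = Counter(char_ngrams(mention, n))
    # n-grams that never occur in the dictionary carry no weight and are dropped
    query = normalise({g: tf * idf[g] for g, tf in counts.items() if g in idf})
    scores = [sum(w * vec.get(g, 0.0) for g, w in query.items()) for vec in vectors]
    order = sorted(range(len(names)), key=lambda i: scores[i], reverse=True)
    return [names[i] for i in order]


def hits_at_k(ranked_lists, gold, k=1):
    """Fraction of mentions whose gold name appears in the top-k candidates."""
    hits = sum(1 for ranked, g in zip(ranked_lists, gold) if g in ranked[:k])
    return hits / len(gold)


# Hypothetical toy dictionary and mentions, not data from the thesis.
names = ["gastric cancer", "gastric ulcer", "cisplatin", "fluorouracil"]
mentions = ["gastric cancers", "cis-platin", "5-fluorouracil"]
gold = ["gastric cancer", "cisplatin", "fluorouracil"]

idf, vectors = build_tfidf(names)
ranked = [rank(m, names, idf, vectors) for m in mentions]
print(hits_at_k(ranked, gold, k=1))
```

Hits@1 counts a mention as correct only when the top-ranked dictionary entry is the gold concept, so it is the strictest setting of Hits@k. A surface-form lookup like this one cannot link synonyms that share no characters, which is presumably why the thesis model also uses semantic features.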

Keywords/Search Tags:

Knowledge Graph, Drug Discovery, Knowledge Extraction, Pretrained Language Model, Knowledge Embedding

Related items

1	Design And Implementation Of Intelligent Medical Assistant Based On Knowledge Graph
2	Knowledge Graph Construction And Knowledge Discovery Of Alzheimer’s Disease
3	Construction Of Ethnomedicine Knowledge Graph Based On BERT-BiLSTM-CRF Knowledge Extraction Model
4	Research On The Construction, Knowledge Mining And Application Of Infertility Knowledge Graph In Ancient Books Of Traditional Chinese Medicine
5	Knowledge Mining And Knowledge Discovering For Biomedical Text And Graph
6	Research On The Construction Of Traditional Chinese Medicine Diagnosis And Treatment Knowledge Graph And Knowledge Discovery Of Chronic Gastritis Based On Clinical Medical Records
7	Design And Implementation Of Medical Guidance Question Answering System Based On Knowledge Map
8	Construction And Analysis Of Medical Knowledge Graph Of Lung Cancer
9	Research On The Construction Method Of Chinese Medical Knowledge Graph Based On Multi Resources
10	Design And Implementation Of Text-based Medical Knowledge Graph