Named Entity Recognition And Relationship Extraction Based On Biomedical Domain Knowledge Enhancement

Posted on: 2023-08-01
Degree: Master
Type: Thesis
Country: China
Candidate: G Yang
Full Text: PDF
GTID: 2530307031950639
Subject: Engineering
Abstract/Summary:
The volume of medical literature, including clinical records and research papers, has grown sharply. In the PubMed database, for example, the number of indexed articles increases exponentially every year. Researchers need to keep up with trends in their own fields, and clinicians have access to more and more physiological and genetic data on patients. How to quickly obtain structured, easily queried, and interlinked data from unstructured text has become both a focus and a difficulty in life science research and precision medicine. Natural language processing has long been applied to biomedical literature, but current research on entity recognition and relation extraction in the biomedical field still faces several problems: (1) there is no connected knowledge base covering multiple entity and relation types; (2) biomedical terms have repeated and ambiguous abbreviations, and prior knowledge is lacking for identifying entity pairs with no obvious semantic relationship; (3) there is no one-stop biomedical knowledge mining platform. To address these problems, this thesis studies the construction of a knowledge graph and its use for knowledge-enhanced pre-training. The main work and contributions are summarized as follows.

First, to address the lack of a connected knowledge graph with multiple entity and relation types, this thesis integrates and normalizes 17 high-quality life science databases. It also proposes MAGNN, an entity alignment model based on the characteristics of the biomedical knowledge graph, and verifies its effectiveness on an entity alignment dataset constructed in this work. The resulting knowledge graph contains 8 types of entity nodes and 17 types of interaction edges, with 108,047 nodes and 4,469,414 edges in total. The entity types are genes, phenotypes, diseases, drugs, pathways, cellular components, biochemical reactions, and molecular functions.

Second, to address repeated and ambiguous abbreviations of biomedical terms and entity pairs without obvious semantic relations, this work uses knowledge enhancement to improve the context encoding ability of pre-trained language models, and thereby the performance of named entity recognition and relation extraction. The proposed method, BMKG-BERT, generates subgraphs from the knowledge graph based on the pre-training corpus and injects the graph context into pre-training. In experiments, BMKG-BERT outperforms all current baseline models on all eight named entity recognition datasets, and on the relation extraction task its results on all three datasets exceed those of all current benchmark models. These results demonstrate the strong potential of knowledge graph-enhanced language models for named entity recognition and relation extraction.

Third, to address the lack of a one-stop biomedical knowledge mining platform, this thesis designs and implements a biomedical literature structuring system, which applies the constructed biomedical knowledge graph and the proposed knowledge-enhanced pre-trained language model. A platform that integrates multiple life science databases and updates biomedical literature in real time provides a reference for biomedical research and a foundation for clinical precision medicine.

In summary, this thesis addresses problems in biomedical knowledge mining, explores the construction of knowledge graphs and knowledge-enhanced pre-trained language models, and validates the effectiveness of its methods on several publicly available datasets.
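
The abstract states that BMKG-BERT generates subgraphs from the knowledge graph based on the pre-training corpus, but it does not say how. The Python sketch below is only an illustration of the general idea, not the thesis's method: it assumes a graph stored as (head, relation, tail) triples and uses naive case-insensitive substring matching to find entities in a sentence. The triples, relation names, and the functions `build_index` and `extract_subgraph` are all hypothetical.

```python
from collections import defaultdict

# Hypothetical toy triples (head, relation, tail); not taken from the thesis graph.
TRIPLES = [
    ("TP53", "associated_with", "Li-Fraumeni syndrome"),
    ("TP53", "participates_in", "apoptosis pathway"),
    ("BRCA1", "associated_with", "breast cancer"),
    ("olaparib", "treats", "breast cancer"),
]


def build_index(triples):
    """Map every entity name to the triples in which it is head or tail."""
    index = defaultdict(list)
    for head, relation, tail in triples:
        index[head].append((head, relation, tail))
        index[tail].append((head, relation, tail))
    return index


def extract_subgraph(sentence, index):
    """Return the one-hop triples around entities mentioned in the sentence.

    Assumption: mentions are found by case-insensitive substring matching,
    which stands in for a real entity linking step.
    """
    text = sentence.lower()
    mentioned = [entity for entity in index if entity.lower() in text]
    seen = set()
    subgraph = []
    for entity in mentioned:
        for triple in index[entity]:
            if triple not in seen:  # keep each triple once, in first-seen order
                seen.add(triple)
                subgraph.append(triple)
    return subgraph


if __name__ == "__main__":
    index = build_index(TRIPLES)
    for triple in extract_subgraph("BRCA1 mutations raise breast cancer risk.", index):
        print(triple)
```

In a knowledge-enhanced pre-training setup, triples retrieved this way would then be encoded alongside the sentence; how that graph context is injected is specific to each model and is not shown here.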
Keywords/Search Tags:Biomedicine, Named entity recognition, Relation extraction, Knowledge graph, Pre-training