Research On Medical Text Named Entity Recognition And Entity Relation Extraction Based On Machine Reading Comprehension Framework

Posted on:2022-02-28

Degree:Master

Type:Thesis

Country:China

Candidate:X Tao

GTID:2494306332474104

Subject:Computer Software and Application of Computer

Abstract/Summary:

Electronic medical records carry a large amount of patients’ clinical information, and the data they record can serve as a reference and basis for subsequent diagnosis, treatment, and research. Because electronic medical records are unstructured or semi-structured texts written in natural language, however, their effective use is greatly limited. Research on information extraction from clinical medical text is therefore of great significance: using natural language processing to extract useful information from clinical texts is an effective way to improve the utilization of electronic medical records. This thesis studies two tasks in clinical medical text information extraction, named entity recognition and entity relation extraction, which extract the medical entities and the relations between entities contained in the text. The work is of value to related tasks in the medical field such as automatic question answering, knowledge graph construction, and information retrieval.

Previous methods rely mainly on sequence labeling models, which have two problems: the model lacks external knowledge, and it has difficulty with nested entities. This thesis proposes transforming named entity recognition and entity relation extraction into machine reading comprehension tasks, taking advantage of the similarity between span-extraction reading comprehension and information extraction. The two tasks are arranged in an upstream-downstream pattern, and a deep learning model is built for each. Within the machine reading comprehension framework, task-related prior knowledge is integrated into the model through manually designed questions, which is the main improvement of this thesis. This addresses the limitation that a sequence labeling model can only be built on the text itself and cannot take advantage of external knowledge. In addition, to deal with nested entities, the answer prediction module of the named entity recognition model uses a boundary model to decode the positions of entity mentions. In the embedding module of the entity relation extraction model, entity category tags and cross-sentence information are added to strengthen the model's relation extraction capability.

The proposed models were tested on two clinical datasets. Compared with the sequence labeling model, the F1 scores of the named entity recognition model on the CANTEMIST and N2C2 datasets improve by 14 and 12 percentage points, respectively. On the N2C2 dataset, the F1 score of the entity relation extraction model improves by 7 percentage points over that of the BERT model.
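The reading-comprehension formulation of named entity recognition described in the abstract can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration: the abstract does not give the thesis's actual questions, model, or decoding rule. The entity types and question texts in `QUERIES`, the helper names `build_mrc_input` and `decode_spans`, and the threshold-based pairing of start and end scores are all hypothetical. The pairing is one common way to decode boundaries and stands in for the thesis's boundary model, whose details are not stated here.

```python
from typing import Dict, List, Tuple

# Hypothetical hand-written questions, one per entity type. They carry the
# task-related prior knowledge; the thesis's real questions are not given.
QUERIES: Dict[str, str] = {
    "Drug": "Find all medications mentioned in the text.",
    "Tumor": "Find all tumor morphology mentions in the text.",
}


def build_mrc_input(query: str, tokens: List[str]) -> List[str]:
    """Pair a question with the passage in BERT style:
    [CLS] question [SEP] passage [SEP]."""
    return ["[CLS]"] + query.split() + ["[SEP]"] + tokens + ["[SEP]"]


def decode_spans(
    start_probs: List[float],
    end_probs: List[float],
    threshold: float = 0.5,
    max_len: int = 10,
) -> List[Tuple[int, int]]:
    """Return (start, end) token index pairs, end inclusive.

    Start and end positions are scored independently, so one passage can
    yield several spans, including spans nested inside each other. A single
    tag per token, as in sequence labeling, cannot express that.
    """
    starts = [i for i, p in enumerate(start_probs) if p >= threshold]
    ends = [j for j, p in enumerate(end_probs) if p >= threshold]
    return [(i, j) for i in starts for j in ends if i <= j < i + max_len]


# Made-up passage and made-up scores standing in for model output.
tokens = ["left", "lung", "adenocarcinoma"]
start_probs = [0.9, 0.1, 0.8]
end_probs = [0.1, 0.2, 0.95]

model_input = build_mrc_input(QUERIES["Tumor"], tokens)
spans = decode_spans(start_probs, end_probs)
mentions = [" ".join(tokens[i : j + 1]) for i, j in spans]
```

With these made-up scores, `mentions` contains both "left lung adenocarcinoma" and the nested mention "adenocarcinoma". Pairing every candidate start with every candidate end can over-generate spans on longer passages, so a real system would usually add a span-level scoring or matching step.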

Keywords/Search Tags:

Named Entity Recognition, Entity Relation Extraction, Machine Reading Comprehension, BERT

Related items

1	Research On Named Entity Recognition And Relation Extraction For Medical Texts
2	Research On Named Entity Recognition And Entity Relationship Extraction Of Medical Data Text Based On Attention
3	Research On Named Entity Recognition In TCM Medical Records Based On BERT Pre-training Mode
4	Research On Chinese Named Entity Recognition In Medical Field
5	Research On Method Of Medical Named Entity Recognition Based On Pre-trained Model
6	Research And Implementation Of Medical Entity Recognition System Based On Double BiLSTM
7	Research On Named Entity Recognition Of Biological Pathogens Based On Neural Networks
8	Research On Biomedical Named Entity Recognition And Relation Extraction Based On Neural Network
9	Research On Named Entity Recognition Of Xinjiang Local Medicine Based On Pre-training Model
10	GAN-based Named Entity Recognition For TCM Text