Research On Named Entity Recognition Of Chinese Electronic Medical Records

Posted on: 2024-09-23  Degree: Master  Type: Thesis
Country: China  Candidate: D D Zhou
GTID: 2544307055974729  Subject: Information and Communication Engineering
Abstract/Summary:
With the rapid development of hospital information systems, Chinese electronic medical record texts are accumulating daily and carry increasingly rich semantic information. However, most of this clinical text is unstructured, which limits the large-scale development and application of electronic medical records, so the most urgent task is to transform unstructured text into structured text that is easy to understand and use. Named entity recognition, a text mining technique, is well suited to this problem.

Existing pre-trained models for named entity recognition do not consider the phonetic information of Chinese characters, so they struggle to capture all the information those characters carry. This thesis therefore proposes a ChineseBERT-based pre-trained model that integrates the phonetic information of Chinese characters. Because named entity recognition methods based on character representations cannot use lexical information, an adaptive lexical enhancement method is also proposed: character vectors and word vectors are concatenated, which introduces lexical information into the model and improves its ability to recognize entity boundaries. Experiments show that this model improves named entity recognition performance.

To address nested entities, disordered entity boundaries, and missing annotations in the Chinese electronic medical record named entity recognition task, this thesis proposes a decoding method based on a global pointer network together with an adversarial training method. Unlike the CRF model, whose sequence-labeling classification conditions are too strict, the global pointer network predicts nested and non-nested entities in the same way. It shares a parameter matrix to reduce the number of trainable parameters and speed up training. Adversarial training adds perturbations to the model, which improves its robustness and generalization ability. Experiments verify that both the global pointer network and adversarial training improve the model's entity recognition ability. In summary, the named entity recognition methods proposed in this thesis address several problems in named entity recognition for Chinese electronic medical records.
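
The span-based decoding that the abstract attributes to the global pointer network can be illustrated with a minimal sketch. It rests on assumptions the abstract does not confirm: each character position has a "start" vector and an "end" vector, a candidate span is scored by their dot product, and every span scoring above a threshold of 0 is kept. The names `decode_spans`, `QUERIES`, and `KEYS` are hypothetical, the vectors are hand-picked toy values rather than model outputs, and the shared parameter matrix mentioned above is not modeled.

```python
from typing import List, Tuple

Vector = List[float]


def dot(a: Vector, b: Vector) -> float:
    """Plain dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def decode_spans(
    queries: List[Vector],
    keys: List[Vector],
    threshold: float = 0.0,
) -> List[Tuple[int, int]]:
    """Return every (start, end) span whose score exceeds the threshold.

    Assumption: score(start, end) = queries[start] . keys[end].
    Each span is judged independently, so a span nested inside
    another span can be returned alongside it.
    """
    spans = []
    n = len(queries)
    for start in range(n):
        for end in range(start, n):  # only spans with start <= end
            if dot(queries[start], keys[end]) > threshold:
                spans.append((start, end))
    return spans


# Toy vectors for a 4-character sentence (hypothetical values).
QUERIES = [[1.0, 0.0], [0.0, 0.0], [0.0, 1.0], [0.0, 0.0]]
KEYS = [[-1.0, -1.0], [-1.0, -1.0], [-1.0, -1.0], [1.0, 1.0]]

print(decode_spans(QUERIES, KEYS))
```

With these toy values the function keeps both the span covering positions 0 to 3 and the span covering positions 2 to 3, which lies inside it. A tag-per-character sequence labeler such as a CRF assigns one label to each character, so it cannot directly represent both spans at once.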
Keywords/Search Tags: electronic medical record, named entity recognition, ChineseBERT, global pointer network, vocabulary enhancement