With the development of medicine and the continuous growth of medical data, extracting medical knowledge from large volumes of unstructured Chinese electronic medical records has become an active research topic. As key technologies for structuring medical text, medical named entity recognition and event extraction identify and extract the entities and events in electronic medical records, laying the foundation for building medical knowledge graphs and supporting intelligent medical decision-making. This paper studies medical named entity recognition and event extraction. The main results are as follows. First, a medical named entity recognition method based on the BERT pre-trained language model is proposed. In this model, BERT models the contextual semantics of electronic medical records; IDCNN performs more accurate convolutional encoding of local medical entity information; multi-head attention increases the weight of related characters by computing the attention probability between each character and all characters in the medical text; and a CRF layer decodes the optimal label sequence for the medical text. Experiments show that the proposed method achieves better extraction results on the CCKS2019 medical named entity recognition dataset. Second, a medical named entity recognition method based on multi-model deep learning fusion is proposed. In this method, a weighted voting algorithm based on the coefficient of variation is first proposed to build a fusion model from BERT, IDCNN and GAT. BERT obtains the contextual semantic representation of the electronic medical record, IDCNN efficiently extracts semantic information from electronic medical records, and GAT makes full use of word boundaries and semantic information by constructing three word-character graphs. An entity error correction algorithm based on historical information is then proposed to optimize the fusion results. Experiments show that the precision, recall and F1 value of the proposed method reach 89.56%, 82.77% and 86.03%, respectively. Third, a semi-supervised medical event extraction method based on pseudo-label confidence selection is proposed. This method constructs a joint medical event extraction model based on a Transformer encoder, BiLSTM and an attention mechanism, and proposes a pseudo-label confidence selection algorithm for selecting high-confidence data. By calculating the coincidence probability of pseudo-labels, high-confidence pseudo-labeled data are selected and used to expand the training data and update the joint medical event extraction model. The updated model is then used to extract tumor primary site, lesion size and metastatic site events from electronic medical records, and majority voting is used to optimize the final extraction results. Experiments show that the proposed method achieves excellent performance on the CCKS2020 medical event extraction dataset.
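
As a quick consistency check on the reported fusion results, the F1 value is the harmonic mean of precision and recall. The sketch below recomputes it from the two reported percentages; the function name `f1_score` is a hypothetical helper introduced here for illustration, not code from the paper.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (same units as the inputs)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Precision and recall reported for the multi-model fusion method, in percent.
reported_f1 = round(f1_score(89.56, 82.77), 2)
print(reported_f1)  # 86.03
```

The recomputed value agrees with the reported F1 of 86.03%.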
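
The pseudo-label selection and majority voting steps of the third method can be illustrated with a minimal sketch. The abstract does not define the coincidence probability, so this sketch assumes it is the fraction of several predictions for the same record that agree on one label; the function names, the threshold of 0.6 and the example records are all hypothetical and are not taken from the paper.

```python
from collections import Counter
from typing import Dict, Hashable, Sequence, Tuple


def majority_vote(labels: Sequence[Hashable]) -> Tuple[Hashable, float]:
    """Return the most common label and the fraction of votes it received."""
    label, votes = Counter(labels).most_common(1)[0]
    return label, votes / len(labels)


def select_pseudo_labels(
    predictions: Dict[str, Sequence[Hashable]], threshold: float = 0.6
) -> Dict[str, Hashable]:
    """Keep records whose predictions agree at least `threshold` of the time.

    ASSUMPTION: agreement among several predictions stands in for the
    paper's coincidence probability, whose exact definition is not given.
    """
    selected = {}
    for record_id, labels in predictions.items():
        label, agreement = majority_vote(labels)
        if agreement >= threshold:
            selected[record_id] = label
    return selected


# Hypothetical predictions of the tumor primary site from three models or runs.
predictions = {
    "rec1": ["lung", "lung", "lung"],    # full agreement, kept
    "rec2": ["liver", "lung", "bone"],   # no agreement, discarded
    "rec3": ["liver", "liver", "lung"],  # two of three agree, kept
}
selected = select_pseudo_labels(predictions, threshold=0.6)
print(selected)
```

Under these assumptions, the selected records and their voted labels would be added to the labeled data before the extraction model is retrained.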