Chinese Nested Named Entity Recognition Based On Sub-optimal Paths

Posted on: 2024-04-02; Degree: Master; Type: Thesis
Country: China; Candidate: Y F Zhu; Full Text: PDF
GTID: 2568307106967789; Subject: Computer technology
Abstract/Summary:
Nested named entity recognition, a subtask of named entity recognition, aims to identify named entities with nested structures in text. Most researchers have favoured sequence models, which suit the sequential nature of text, for this problem. However, datasets of Chinese nested text are relatively sparse, and the structure of the Chinese language and the lack of obvious entity boundaries in Chinese text greatly increase the difficulty of Chinese nested named entity recognition. How to strengthen a sequence model's attention to the positional relationships between Chinese characters and words, how to determine the actual boundaries of multi-level cascading Chinese entities, and how to let the model learn text features from limited datasets and annotations are active problems that many scholars are trying to solve. This paper examines two aspects of these issues.

(1) A nested named entity recognition method based on location embedding and sub-optimal paths, with multi-level boundary prediction, is proposed. First, in the embedding layer, the position information of the nested entities is encoded together with the position information of the text to generate an absolute position sequence. By attending to the position information in the Chinese text itself, this further explores the relationship between nested entities and characters and strengthens the connection between the nested entities and the original text. Then, the nested entities are initially identified through multi-level prediction, using the hidden matrix from which the optimal path has been excluded. Finally, the offsets of the entity boundaries are calculated at the multi-level prediction layer to redefine the boundaries and thus improve the accuracy of Chinese entity prediction.

(2) A nested named entity recognition method based on a graph attention network fused with external knowledge is proposed. First, initial candidate nested entities are obtained from the original text using jieba word segmentation, and the candidates are expanded using a knowledge graph to obtain more entity nodes. Then, the span set of nested entities and their corresponding relationships are passed through a language model to generate embedding representations, which serve as the node and relationship inputs of the graph attention neural network and yield entity and relationship embeddings, respectively. Finally, the entity embeddings incorporating external knowledge are fused with the character embeddings to enhance the model's ability to mine deep textual information.

In the experiments, the proposed model is compared with common sequence models, graph structure models and span enumeration models on datasets from both the medical and everyday domains. The results show that the proposed model outperforms the selected baseline models on both domain datasets. An analysis and comparison of the results for sample texts from the different datasets further supports the model's strong performance in recognising Chinese nested named entities and mining deeper features of text.
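The abstract does not say how the sub-optimal path is computed, so the following is only an illustrative sketch of the general idea: decoding the second-best tag sequence alongside the best one over a sequence model's scores. It is a generic 2-best Viterbi decoder, not the thesis's method. The function name `two_best_paths`, the score layout (`emissions[t][tag]`, `transitions[prev][tag]`) and the toy values are all assumptions made for the example.

```python
from typing import List, Tuple

Candidate = Tuple[float, List[int]]  # (path score, tag sequence)


def two_best_paths(
    emissions: List[List[float]],
    transitions: List[List[float]],
) -> Tuple[Candidate, Candidate]:
    """Return the best and second-best tag paths (hypothetical helper).

    emissions[t][tag]       : score of `tag` at position t (assumed layout)
    transitions[prev][tag]  : score of moving from `prev` to `tag`
    Requires at least two tags so that a second-best path exists.
    """
    n_tags = len(transitions)
    # For every tag, keep up to two best partial paths ending in that tag.
    beams: List[List[Candidate]] = [
        [(emissions[0][tag], [tag])] for tag in range(n_tags)
    ]
    for t in range(1, len(emissions)):
        new_beams: List[List[Candidate]] = []
        for tag in range(n_tags):
            candidates = [
                (score + transitions[prev][tag] + emissions[t][tag], path + [tag])
                for prev in range(n_tags)
                for score, path in beams[prev]
            ]
            candidates.sort(key=lambda c: c[0], reverse=True)
            new_beams.append(candidates[:2])  # top two per end tag
        beams = new_beams
    finals = sorted(
        (c for beam in beams for c in beam), key=lambda c: c[0], reverse=True
    )
    return finals[0], finals[1]


# Toy usage: 3 characters, 2 tags (0 = outside, 1 = entity); values are made up.
best, second = two_best_paths(
    emissions=[[1.0, 0.0], [0.0, 2.0], [0.5, 0.0]],
    transitions=[[0.0, 0.0], [0.0, 0.0]],
)
print(best, second)
```

In a nested setting, the second-best path is one plausible way to surface an entity reading that the best path hides. How the thesis derives and uses its sub-optimal paths is described only in the full text.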
Keywords/Search Tags: nested named entity recognition, knowledge graphs, sequence models, graph attention neural networks, sub-optimal paths