| With the steady progress of natural language processing and the rise of research fields such as knowledge graphs and information retrieval, Chinese natural language processing has developed rapidly in recent years as a foundational research area, yet very little work addresses ancient literature. China's ancient texts are unique and record a wealth of historical information, so studying them is of great significance to the study of Chinese history and culture. Named entity recognition (NER) on ancient texts can extract a large amount of entity information, which helps researchers quickly grasp the knowledge these texts contain. "The Romance of the Three Kingdoms" is China’s first full-length chapter-style historical novel. It reflects the transformation of various social struggles and contradictions in the Three Kingdoms era, and it is by far the highest artistic achievement among novels of this type. "Dream of Red Mansions" takes a tragedy of love and marriage as its starting point. It deeply reflects the social reality of late feudal society in China and criticizes the decaying feudal ruling class; it is the pinnacle of classical Chinese novels and an encyclopedia of Chinese feudal society. This paper selects "The Romance of the Three Kingdoms" and "Dream of Red Mansions" as the corpus for manual annotation, constructs an ancient Chinese named entity recognition dataset, and carries out the following work. For entity recognition, we improve the LatticeCRF model, which performs well on the standard Chinese NER dataset, and propose the LatticeLAN model. This model replaces the CRF module with a LAN module based on the attention mechanism, which integrates the label information of the text effectively. Experiments and analysis on the ancient-text dataset and the standard Chinese NER dataset show that it identifies the entities in the text accurately and in less time. To further explore the impact of named entity recognition on downstream tasks, we study entity linking for ancient texts. Entity linking maps each identified entity to the corresponding entity in a knowledge base, and it is the basis of tasks such as relation extraction and knowledge graph construction. We randomly selected 131 entities from those annotated for NER to build a knowledge base, and selected some texts containing these entities for entity linking annotation. We then improved a baseline joint training model for entity recognition and linking, and verified that the proposed LatticeLAN model provides more entity information for downstream tasks and improves the overall performance of the entity recognition and linking system. |