| With the development of digital humanities, using computers to assist research in the humanities has gradually become a trend. In the field of ancient Chinese, literary works such as poetry, prose, opera, and novels, and historical works such as "Sanguozhi", "Shiji", and "Hanshu", record a Chinese civilization that has lasted 5000 years. Automatic and methodical research on the large number of ancient books accumulated by the world's only uninterrupted ancient civilization is of great significance for the development of human civilization as a whole. With the development of computational linguistics and knowledge graphs, natural language processing technologies for modern Chinese are becoming more mature, and knowledge-assisted semantic understanding of modern Chinese has become a research hotspot. However, research on ancient Chinese is rare. If computational linguistics techniques can be used to perform automatic word segmentation, part-of-speech tagging, named entity recognition, and syntactic analysis, thereby achieving grammatical analysis of ancient Chinese, and knowledge graph techniques can be used to achieve semantic analysis of ancient Chinese, then electronic ancient books can be studied with a standardized and unified process. This avoids a large amount of repetitive manual labor and assists research in the field of ancient Chinese from a new perspective. Therefore, this paper studies natural language processing and knowledge graph technology for ancient Chinese, so as to assist the semantic understanding of ancient Chinese corpora. Specifically, the work mainly includes the following aspects: (1) This paper proposes an efficient two-step new word detection algorithm for ancient Chinese corpora (denoted by AP-LSTM-CRF), which integrates a parallel Apriori algorithm and a Bi-LSTM-CRF segmentation probability model. It uses association rules from data mining together with a deep learning method to effectively mine new words in ancient Chinese corpora. The experimental
results show that AP-LSTM-CRF is superior to several current mainstream algorithms on the Song Poetry dataset and the History of the Song Dynasty dataset. (2) This paper proposes a method for constructing a classical Chinese poetry knowledge graph, and uses it to obtain a knowledge graph (denoted by CCP-KG) that covers every aspect of classical Chinese poetry and contains multi-layer semantic links between words. CCP-KG can be used to analyze classical Chinese poems from a semantic perspective. In addition, CCP-KG can be applied to tasks such as reasoning and analysis in classical Chinese poetry. (3) This paper builds a system for data mining of classical Chinese poetry and for displaying CCP-KG. Its main functions include displaying the results of classical Chinese poetry segmentation; displaying CCP-KG; analyzing the poetry of each dynasty based on classical Chinese poetry segmentation and CCP-KG and displaying the analysis results as interactive charts; and automatically classifying the theme and emotion of poems based on CCP-KG and displaying the classification results. |