Research On Automatic Recognition Of Prepositional Object Structures In Classical Books Based On Deep Learning

Posted on:2021-08-23

Degree:Master

Type:Thesis

Country:China

Candidate:M C Zuo

Full Text:PDF

GTID:2518306608461634

Subject:Master of Library and Information

Abstract/Summary:

Ancient Chinese classics carry and bear witness to the culture and history of the Chinese nation. Through them we can reconstruct and understand ancient Chinese society, so they hold rich material that researchers have yet to mine. With advances in science and technology, computers are now used to process natural language. Chinese information processing has developed rapidly and has been especially fruitful for modern Chinese text, but research on ancient texts is comparatively scarce. Research on ancient Chinese text processing in China currently lags behind: most work focuses on digitizing classical books, automatic word segmentation, part-of-speech tagging and similar tasks, and little of it addresses the syntactic level.

To mine the knowledge in classical books, syntactic parsing is a natural first step. Syntactic parsing analyzes the grammatical functions of words in sentences and can be divided into syntactic structure analysis and dependency analysis. However, complete syntactic parsing, which can be used to deconstruct the composition of sentences, is difficult to achieve. Automatic identification of prepositional object structures is a part of shallow parsing. The grammar of ancient Chinese differs from that of modern Chinese in many ways, which is one reason the classics are hard to read, yet the two are very similar with respect to the prepositional object structure. A prepositional object structure can introduce time, place, participants, reason, purpose, manner and so on, so identifying and analyzing it bears directly on how a sentence is understood. Identifying the prepositional object structures in ancient Chinese classics would therefore greatly help readers understand them.

This thesis studies the automatic recognition of prepositional object structures in classical books by means of deep learning. Two corpora are used: the Tsinghua treebank corpus and the Records of History. The thesis first compares preposition statistics across the two corpora and analyzes the part-of-speech combinations within prepositional object structures in the Tsinghua treebank. The statistics show that the distribution and usage of prepositions in modern Chinese are similar to those in ancient Chinese. The thesis also raises problems specific to ancient Chinese, such as object fronting and object ellipsis within the prepositional object structure.

In order to construct the corpus of classical books, the thesis uses a conditional random field (CRF) model, an LSTM model and a BERT model to recognize prepositional object structures automatically in the Tsinghua treebank, runs experiments with the corpus processed in different ways, analyzes the results, and explores the factors that influence recognition performance. The unit into which the corpus is divided is found to affect recognition performance. Because the model is weak at determining Chinese word boundaries, recognition performance is generally higher for one of the two units (character or word) than for the other. With the help of the automatic recognition model built on the Tsinghua treebank and manual proofreading, the author constructs a corpus of classical books and statistically analyzes the parts of speech in its prepositional object structures.

Finally, the thesis studies the identification of prepositional object structures in classical books under different corpus processing schemes with the LSTM and BERT models, and achieves automatic identification of these structures. The best harmonic mean (F-score) of the model reaches 93.23%. Based on the constructed model, an automatic identification platform for prepositional object structures is built.
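
The abstract reports its result as a harmonic mean, which for recognition tasks of this kind is normally the harmonic mean of precision and recall. The sketch below shows one common way such a score could be computed for span recognition. It assumes a B/I/O tagging scheme and exact-match span scoring; the abstract does not state the scheme or scoring rule the thesis actually used, and the function names and toy tag sequences are hypothetical illustrations, not data from the thesis.

```python
from typing import List, Optional, Set, Tuple

# (start, end) token indices of one recognized structure, end exclusive.
Span = Tuple[int, int]


def bio_to_spans(tags: List[str]) -> Set[Span]:
    """Collect spans from a tag sequence using the assumed B/I/O scheme."""
    spans: Set[Span] = set()
    start: Optional[int] = None
    for i, tag in enumerate(tags):
        if tag == "B":
            if start is not None:      # a new span begins; close the open one
                spans.add((start, i))
            start = i
        elif tag == "I":
            if start is None:          # tolerate an I with no preceding B
                start = i
        else:                          # "O" closes any open span
            if start is not None:
                spans.add((start, i))
                start = None
    if start is not None:              # span running to the end of the sentence
        spans.add((start, len(tags)))
    return spans


def precision_recall_f1(gold: List[str], pred: List[str]) -> Tuple[float, float, float]:
    """Exact-match span precision, recall, and their harmonic mean."""
    gold_spans = bio_to_spans(gold)
    pred_spans = bio_to_spans(pred)
    correct = len(gold_spans & pred_spans)
    precision = correct / len(pred_spans) if pred_spans else 0.0
    recall = correct / len(gold_spans) if gold_spans else 0.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return precision, recall, f1


if __name__ == "__main__":
    # Hypothetical eight-token sentence with two gold structures.
    gold = ["O", "B", "I", "O", "O", "B", "I", "O"]
    # The prediction gets the first span right and overshoots the second.
    pred = ["O", "B", "I", "O", "O", "B", "I", "I"]
    print(precision_recall_f1(gold, pred))
```

Under exact-match scoring a span with a wrong boundary counts as an error for both precision and recall, which is one reason the choice of segmentation unit can move the score.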

Keywords/Search Tags:

deep learning, the Records of History, ancient text information processing, prepositional object structure identification

Related items

1	Automatic Identification Of Chinese Prepositional Phrase Based On CRF
2	Image Sensitive Text Information Identification Based On Emotional Polarity Discrimination
3	Studies On Prepositional Phrase Boundary Identification Based On Usage Attribute
4	Research On Deep Learning Based Script Identification Method Of Korean History Documents
5	Automatic Identification Of Chinese Prepositional Phrase Based On Maximum Entropy
6	History Of China Since Ancient Archives. 1980 Review And Outlook
7	Natural Language Processing Of Ancient Books Of Chinese Traditional Medicine Based On Deep Learning
8	Research On The Recognition Of Multilingual Ancient Characters In Natural Scenes Based On Deep Learning
9	Research On Chinese Prepositional Phrase Identification Based On Multi-layer Conditional Random Fields
10	Design And Implementation Of Text Processing System Based On Deep Learning