| Chinese Grammatical Error Correction is the task of identifying and correcting grammatical errors in Chinese text. In recent years, research on this task has developed rapidly, with a wide range of applications in assisted proofreading, assisted text editing, search engines, language recognition, image-to-text, etc. Many natural language processing tasks now benefit from methods based on pre-trained language models, which has also advanced research on Chinese grammatical error correction and improved the semantic extraction capacity of domain models. However, many studies do not make full use of the ability of pre-trained language models to learn domain knowledge, so their models lack the capacity to represent prior knowledge relevant to Chinese grammatical error correction. Therefore, according to the characteristics of the Chinese grammatical error correction task, this thesis proposes a new method for constructing a pre-trained language model based on a syntactic knowledge graph. In addition, existing Chinese grammatical error correction models based on sequence tagging handle correction labels too coarsely, which often leads to labeling problems such as insufficient generalization of the label set and multi-label conflicts. To resolve these labeling problems, this thesis builds on the pre-trained model and proposes a Chinese grammatical error correction model based on an improved sequence tagging method. The main research contributions of this thesis are as follows. 1. Constructing a pre-trained language model based on a syntactic knowledge graph. We first construct a grammatical knowledge graph from the syntactic relations between sentence components, and then use the TuckER model to train on the knowledge triples in the knowledge graph, enhancing the ability of the entity representations to express grammatical information. Finally, the entity representation containing external syntactic knowledge is deeply fused with the 
semantic representation through the knowledge fusion encoder. In addition, a masking method based on the characteristics of the Chinese language is used to predict selected characters and words, so that the model further learns semantic and grammatical knowledge. The experimental results show that the pre-trained language model, which is specific to the domain of Chinese grammatical error correction, is effective at modeling grammatical knowledge. 2. Constructing, on the basis of this pre-trained language model, a Chinese grammatical error correction model based on the improved sequence tagging method. We first feed domain-specific inputs into the pre-trained language model described above and fine-tune all parameters end-to-end. Then, a SWAP tag is added to the existing grammatical error correction tags to simplify the tag set for word order errors, and a multi-tag merging scheme is proposed to solve the problem of label conflicts. Finally, two networks specific to grammatical error correction are introduced into the output layer of the model; they are trained jointly and coordinate with each other so that the model learns the correct context and performs corrections at the appropriate positions. The experimental results demonstrate the effect of the improved sequence tagging method on the overall model and show the suitability of the resulting Chinese grammatical error correction model. 3. Designing and developing a Chinese grammatical error correction platform. The trained model is exposed as a Chinese grammatical error correction API using the Flask framework, and a front-end Web page for grammatical error correction is designed and developed with the Vue framework, providing automatic proofreading of Chinese grammatical errors. |
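
To make the role of a SWAP tag in sequence tagging concrete, the following is a minimal sketch of how per-token edit tags might be applied to a tokenized sentence. The tag names (`KEEP`, `DELETE`, `APPEND_x`, `REPLACE_x`, `SWAP`), the function `apply_tags`, and the convention that `SWAP` exchanges a token with its right neighbour are all assumptions made for illustration; the abstract does not specify the actual tag set, and the multi-tag merging scheme is not modeled here.

```python
from typing import List


def apply_tags(tokens: List[str], tags: List[str]) -> List[str]:
    """Apply one hypothetical edit tag per token and return the corrected tokens."""
    if len(tokens) != len(tags):
        raise ValueError("tokens and tags must have the same length")
    out: List[str] = []
    i = 0
    while i < len(tokens):
        tok, tag = tokens[i], tags[i]
        if tag == "KEEP":
            out.append(tok)
        elif tag == "DELETE":
            pass  # drop the token
        elif tag.startswith("APPEND_"):
            # keep the token, then insert the tag's payload after it
            out.append(tok)
            out.append(tag[len("APPEND_"):])
        elif tag.startswith("REPLACE_"):
            # substitute the token with the tag's payload
            out.append(tag[len("REPLACE_"):])
        elif tag == "SWAP":
            # Assumed convention: exchange this token with its right neighbour.
            # The neighbour's own tag is ignored in this sketch.
            if i + 1 >= len(tokens):
                raise ValueError("SWAP on the last token has no right neighbour")
            out.append(tokens[i + 1])
            out.append(tok)
            i += 1  # the neighbour has been consumed
        else:
            raise ValueError(f"unknown tag: {tag}")
        i += 1
    return out


# Word order error: "我 学校 去 了" should read "我 去 学校 了".
tokens = ["我", "学校", "去", "了"]
tags = ["KEEP", "SWAP", "KEEP", "KEEP"]
print(apply_tags(tokens, tags))  # → ['我', '去', '学校', '了']
```

Under these assumptions, a single vocabulary-independent `SWAP` tag repairs the word order error, whereas a tag set without it would need a `DELETE` plus a word-specific `APPEND_学校`, which is one plausible reading of how a SWAP tag can simplify the tag set.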