
Research On Chinese Spelling Error Correction Model Based On Deep Learning

Posted on: 2024-02-20
Degree: Master
Type: Thesis
Country: China
Candidate: X Y Guo
GTID: 2568307079991329
Subject: Applied statistics
Abstract/Summary:
Chinese spelling error correction, the task of detecting and correcting spelling errors in text, is an active research topic in natural language processing. Common spelling errors fall into three types: phonetic errors, visual errors, and other errors. Correcting them improves the correctness of written text and reduces the cost of manual proofreading. This thesis studies Chinese spelling error correction based on deep learning, using open data sets. The data are first preprocessed to remove interfering content and then augmented by adding more spelling error information to the data set (data enhancement). The model is an end-to-end Chinese spelling error correction model composed of three networks. A bidirectional LSTM serves as the detection network and predicts the probability that each Chinese character is an error. A unidirectional LSTM serves as the feature extraction network and extracts the pinyin and glyph information of each character. A pre-trained BERT model serves as the correction network and corrects spelling errors by combining the information provided by the detection network and the feature extraction network. The experimental results show that the sentence-level error correction accuracy of the model improves by 3.19% after data augmentation, which demonstrates the effectiveness of the augmentation method. Compared with other deep learning models, the model achieves better accuracy, recall, and F1 score at both character and sentence granularity, improving performance on the spelling error correction task.
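The abstract reports scores at character and sentence granularity without defining them. The sketch below shows one common way such scores are computed for spelling correction; it is not taken from the thesis. The function names, the (source, prediction, gold) triple format, the assumption that all three strings have equal length, and the example sentences are illustrative assumptions, and the first score is computed here as precision.

```python
from typing import List, Tuple

# Assumed input format: (source, prediction, gold), all three the same length.
Triple = Tuple[str, str, str]


def prf(tp: int, n_pred: int, n_gold: int) -> Tuple[float, float, float]:
    """Precision, recall, and F1 from raw counts (0.0 when a denominator is empty)."""
    p = tp / n_pred if n_pred else 0.0
    r = tp / n_gold if n_gold else 0.0
    f1 = 2 * p * r / (p + r) if p + r else 0.0
    return p, r, f1


def char_level(triples: List[Triple]) -> Tuple[float, float, float]:
    """Character granularity: every character position is scored separately."""
    tp = n_pred = n_gold = 0
    for src, pred, gold in triples:
        for s, p, g in zip(src, pred, gold):
            n_gold += s != g             # position really contains an error
            n_pred += p != s             # model edited this position
            tp += p != s and p == g      # the edit matches the gold character
    return prf(tp, n_pred, n_gold)


def sent_level(triples: List[Triple]) -> Tuple[float, float, float]:
    """Sentence granularity: a sentence counts only if it is fully corrected."""
    tp = n_pred = n_gold = 0
    for src, pred, gold in triples:
        n_gold += src != gold                # sentence really contains an error
        n_pred += pred != src                # model changed the sentence
        tp += pred != src and pred == gold   # output equals the gold sentence
    return prf(tp, n_pred, n_gold)


# Hypothetical examples: one correct fix, one missed error, one false edit.
EXAMPLES: List[Triple] = [
    ("我今天很高心", "我今天很高兴", "我今天很高兴"),
    ("他在公圆散步", "他在公圆散步", "他在公园散步"),
    ("天气很好", "天汽很好", "天气很好"),
]

print(char_level(EXAMPLES))  # (0.5, 0.5, 0.5)
print(sent_level(EXAMPLES))  # (0.5, 0.5, 0.5)
```

Sentence-level scoring is the stricter of the two on real data, because a sentence with several errors counts as correct only when every one of them is fixed and no correct character is altered.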
Keywords/Search Tags: natural language processing, BERT model, LSTM model, Chinese spelling error correction, data enhancement, deep learning