| With the rapid development of deep learning, text classification has seen new breakthroughs, among which BERT-based text processing has become the mainstream approach. This method first pre-trains on a large-scale corpus and is then fine-tuned for different downstream tasks. Compared with traditional deep learning methods, it offers better performance and portability. Because Chinese expresses semantics mainly through words, the character-granularity masking task of the original BERT model does not relate context well. Through experiments, this thesis compares the text classification performance of the BERT model based on whole word MASK, the original BERT model, and other common deep learning models. The experimental results show that the BERT model based on whole word MASK performs better than the other models. Therefore, this thesis focuses on optimizing the BERT model based on whole word MASK. Most text classification methods study a single model in depth, and each single model has its own advantages and disadvantages: it cannot capture global and local semantic features at the same time, and as the network deepens, semantic information is easily lost. Therefore, this thesis proposes a fusion model. Built on the BERT model with the whole word MASK sample generation strategy, it combines the strengths of CNN and BiGRU in text modeling to obtain more comprehensive semantic features for text classification. First, six of the original Transformer layers of the BERT model are removed, and the feature representation of the text is obtained through the reduced BERT model. Then, local semantic features are extracted by the CNN and global semantic features are extracted by the BiGRU. Finally, the model classifies the text using the fused feature vectors of the two channels. To improve the quality of the news text corpus and prepare for comparison experiments on the improved model, this thesis also builds a small NetEase news data set using web crawler technology. The final experimental results show that the improved fusion model achieves higher accuracy without increasing the number of parameters, which demonstrates the effectiveness of the fusion model for news text classification. |
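
The difference between character-granularity masking and whole word MASK can be illustrated with a minimal sketch. It is not the thesis's implementation: the sentence, its word segmentation, the masking ratios, and the function names `char_level_mask` and `whole_word_mask` are all assumptions chosen for illustration, and the sketch omits the random-replacement and keep-unchanged cases that BERT pre-training also applies to selected tokens.

```python
import random

MASK = "[MASK]"

def char_level_mask(words, ratio, rng):
    """Mask single characters independently, ignoring word boundaries."""
    chars = [ch for word in words for ch in word]
    n_mask = max(1, round(len(chars) * ratio))
    picked = set(rng.sample(range(len(chars)), n_mask))
    return [MASK if i in picked else ch for i, ch in enumerate(chars)]

def whole_word_mask(words, ratio, rng):
    """Mask all characters of each selected word together.

    Simplification: the ratio is applied to the number of words,
    not to the number of characters.
    """
    n_mask = max(1, round(len(words) * ratio))
    picked = set(rng.sample(range(len(words)), n_mask))
    masked = []
    for i, word in enumerate(words):
        if i in picked:
            masked.extend([MASK] * len(word))  # the whole word is hidden
        else:
            masked.extend(word)                # the word is kept intact
    return masked

# Hypothetical sentence, already segmented into words by some tokenizer.
words = ["哈尔滨", "是", "黑龙江", "的", "省会"]
rng = random.Random(0)
print(char_level_mask(words, 0.15, rng))
print(whole_word_mask(words, 0.15, rng))
```

With character-level masking, a word such as 哈尔滨 may be only partly hidden, so the model can often recover the masked character from the rest of the same word. With whole word MASK the entire word is hidden, so the prediction has to rely on the surrounding context.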