
Research And Implementation Of Text Classification Based On ERNIE And TextGCN

Posted on: 2023-04-09; Degree: Master; Type: Thesis
Country: China; Candidate: X Y Gao; Full Text: PDF
GTID: 2568306791952879; Subject: Engineering
Abstract/Summary:
The continuous development of digital media technology has produced text information on a scale at which people's needs are no longer easily met. Text recommendation systems emerged in response, and the accuracy of text classification plays a decisive role in them. A text classification algorithm predicts the category of text data from features extracted from the original text. However, existing text classification algorithms are affected by problems such as long-distance relationships within text, boundary data, and heterogeneous data, which lead to inaccurate classification. This thesis therefore builds on traditional deep learning and neural network text classification methods and improves two network models: an EDA-based Chinese text classification algorithm and an L-Text GCN English text classification algorithm, with the aim of improving the efficiency of Chinese and English text classification. The main contributions of the thesis are as follows:

(1) To improve text classification when textual semantic information is lacking and long-distance association information is hard to obtain, this thesis proposes a Chinese text classification neural network (EDA) based on an optimized ERNIE pre-training model. First, the model enriches the semantic information of Chinese text by using the ERNIE pre-training model to obtain a better text representation. Next, the semantically enriched text is fed into the improved deep long convolutional network proposed in this thesis, so that association information between distant parts of the text can be extracted more effectively. The output is then passed to a document-level self-attention mechanism, which produces a weight for each position in the text. Finally, the text is classified. The experimental results show that the accuracy of the proposed network model is 4.68% higher than that of the BERT model, and the loss is reduced by 0.1.

(2) To handle the heterogeneous information present in English text, this thesis uses a graph neural network to build an association structure between documents and words, and between words and words, in order to capture the hidden information between them. To make English text classification more accurate, the thesis proposes an English text classification network model (L-Text GCN) based on an optimized graph convolutional network. First, the Mish activation function is used to improve the handling of hard-zero boundary data, which addresses the vanishing gradient problem and allows negative information in the text data to pass into the neural network more effectively. Second, in parameter optimization, the moving average of the second-order gradient is modified and the first-order momentum gradient is removed; by increasing the contribution of the parameter gradient from the previous step, the share of the historical accumulated gradient in the network keeps growing, which gives the convolutional network a long-term memory effect. Finally, in experimental comparisons on several datasets, the classification accuracy of the proposed L-Text GCN model improved by 0.16%, 0.78%, 0.79%, and 0.32%.

(3) Based on the above work, a prototype Chinese and English news text classification system is designed and developed. Because Chinese and English text classification is more accurate, news articles are matched to more accurate categories before being recommended to users, improving the experience and convenience of reading news.

In summary, experiments on existing Chinese and English datasets show that the proposed Chinese and English text classification models achieve better classification results and can quickly recommend various types of news to users, which greatly improves the user experience.
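The role of the Mish activation described in contribution (2) can be illustrated with a minimal sketch. It uses the standard definition mish(x) = x * tanh(softplus(x)); the comparison against ReLU is an assumption made here for illustration, since the abstract speaks only of a "hard-zero boundary" and does not name the activation that Mish replaces. The helper names (`softplus`, `mish`, `relu`, `mish_grad`) are hypothetical and are not taken from the thesis.

```python
import math


def softplus(x: float) -> float:
    # Numerically stable form of log(1 + exp(x)).
    return max(x, 0.0) + math.log1p(math.exp(-abs(x)))


def mish(x: float) -> float:
    # Standard Mish definition: x * tanh(softplus(x)).
    return x * math.tanh(softplus(x))


def relu(x: float) -> float:
    # Assumed baseline with a hard zero for negative inputs.
    return max(x, 0.0)


def mish_grad(x: float, eps: float = 1e-6) -> float:
    # Central finite-difference estimate of d(mish)/dx.
    return (mish(x + eps) - mish(x - eps)) / (2.0 * eps)


if __name__ == "__main__":
    # Negative inputs keep a small non-zero output and gradient under Mish,
    # whereas the assumed ReLU baseline maps them to exactly zero.
    for x in (-3.0, -1.0, -0.1, 0.0, 1.0, 3.0):
        print(
            f"x={x:+.1f}  relu={relu(x):+.4f}  "
            f"mish={mish(x):+.4f}  dmish/dx={mish_grad(x):+.4f}"
        )
```

For moderately negative inputs the Mish output and its gradient stay non-zero, which is one way to read the abstract's remark that negative information can still pass into the network.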
Keywords/Search Tags:Text Classification, Deep Learning, Graph Convolution Network, Attention Mechanism, Pre-Training Model