
Research On Classification Algorithm Of Natural Language Text Based On Deep Learning

Posted on: 2024-03-05    Degree: Master    Type: Thesis
Country: China    Candidate: X H Dai    Full Text: PDF
GTID: 2568307160955589    Subject: Computer Science and Technology
Abstract/Summary:
As one of the downstream tasks in natural language processing, text classification is judged by its accuracy and speed, which directly reflect the performance of a classification method. Most traditional text classification methods have shallow structures and rely on large numbers of hand-crafted features, which is not only time-consuming and labor-intensive but also suffers from data sparsity and high dimensionality. With the rapid development of artificial intelligence, neural-network-based deep learning methods have solved more and more problems in natural language processing and achieved good performance, so researchers are paying increasing attention to introducing deep learning into text classification. However, existing deep learning-based methods still have problems: a single neural network cannot accurately capture global text information, the classification model always reads the entire input, and processing long sequences is slow. In response to these problems, the main work and innovations of this thesis on deep-learning-based text classification are as follows:

(1) To address the inability of a single neural network to accurately capture global text information, a text classification model combining a dual-channel Att-BiLSTM with a CNN (Att-BiLSTM-CNN) is proposed. In the Att-BiLSTM layer, the attention mechanism assigns weight coefficients to the text information from the upstream embedding layer, focusing on the information that affects classification, while the BiLSTM layer extracts contextual semantic information to obtain richer text features. After the dual-channel layer, a CNN layer performs convolution and pooling on the deep semantics, extracting abstract features from adjacent text and producing the final text feature representation. To verify the model's performance, experiments were conducted on a public dataset; the model reached a classification accuracy of 89.87%, and the results show that it outperforms the comparison models.

(2) To address the fact that the model always reads the entire input in text classification tasks and that processing long sequences is slow, a text classification model combining an improved Self-Attention mechanism, a Skip-GRU network, and a CNN (SAtt-SGRU-CNN) is proposed. When reading text in classification tasks, a large number of words are irrelevant. To this end, Skip-GRU, an enhanced GRU, skips content that is unimportant to the classification task and captures and learns only the effective parts of the text sequence; the improved Self-Attention mechanism redistributes the weights of the deep text sequence, and an optimized multi-channel CNN extracts local text features. To verify the model's performance, experiments were conducted on public datasets, and the classification accuracy on three different datasets was no less than 90%. The results show that the SAtt-SGRU-CNN model effectively captures the text features that matter most for classification and achieves better classification performance.
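The Att-BiLSTM-CNN of contribution (1) can be summarized in a short PyTorch sketch. This is a minimal illustration under assumptions the abstract does not state: the vocabulary size, embedding and hidden dimensions, kernel size, and the exact way the attention and BiLSTM channels are combined before the CNN are illustrative choices, not the thesis's actual configuration.

```python
import torch
import torch.nn as nn

class AttBiLSTMCNN(nn.Module):
    """Sketch of a dual-channel Att-BiLSTM-CNN classifier (hyperparameters assumed)."""
    def __init__(self, vocab_size=10000, embed_dim=128, hidden_dim=128,
                 num_filters=100, kernel_size=3, num_classes=2):
        super().__init__()
        self.embedding = nn.Embedding(vocab_size, embed_dim)
        # Attention channel: score each embedded token by its importance
        self.att_score = nn.Linear(embed_dim, 1)
        # BiLSTM channel: contextual semantics in both directions
        self.bilstm = nn.LSTM(embed_dim, hidden_dim, batch_first=True,
                              bidirectional=True)
        # CNN over the concatenated channels extracts local n-gram features
        self.conv = nn.Conv1d(2 * hidden_dim + embed_dim, num_filters,
                              kernel_size, padding=kernel_size // 2)
        self.pool = nn.AdaptiveMaxPool1d(1)
        self.fc = nn.Linear(num_filters, num_classes)

    def forward(self, token_ids):                          # (batch, seq_len)
        emb = self.embedding(token_ids)                    # (batch, seq_len, embed_dim)
        # Channel 1: attention re-weights the embeddings
        weights = torch.softmax(self.att_score(emb), dim=1)
        att_out = emb * weights                            # (batch, seq_len, embed_dim)
        # Channel 2: BiLSTM contextual features
        lstm_out, _ = self.bilstm(emb)                     # (batch, seq_len, 2*hidden_dim)
        # Concatenate the two channels, then convolve and max-pool
        features = torch.cat([att_out, lstm_out], dim=-1).transpose(1, 2)
        conv_out = torch.relu(self.conv(features))         # (batch, num_filters, seq_len)
        pooled = self.pool(conv_out).squeeze(-1)           # (batch, num_filters)
        return self.fc(pooled)                             # (batch, num_classes)

model = AttBiLSTMCNN()
logits = model(torch.randint(0, 10000, (4, 50)))           # 4 texts of 50 token ids
print(logits.shape)                                        # torch.Size([4, 2])
```

In this sketch the attention output and the BiLSTM output are simply concatenated along the feature dimension before the convolution; the thesis may fuse the two channels differently.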
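The Skip-GRU and SAtt-SGRU-CNN of contribution (2) can likewise be sketched. The abstract does not specify how the skip decision is made or what the improved Self-Attention changes, so the sketch below uses a learned soft skip gate and standard multi-head self-attention as placeholders; all names and hyperparameters are illustrative assumptions.

```python
import torch
import torch.nn as nn

class SkipGRU(nn.Module):
    """Sketch of a skip-enabled GRU: a learned gate decides, token by token,
    how much the hidden state is updated (a placeholder for the thesis's Skip-GRU)."""
    def __init__(self, input_dim, hidden_dim):
        super().__init__()
        self.cell = nn.GRUCell(input_dim, hidden_dim)
        self.skip_gate = nn.Linear(input_dim + hidden_dim, 1)

    def forward(self, x):                                  # (batch, seq_len, input_dim)
        batch, seq_len, _ = x.shape
        h = x.new_zeros(batch, self.cell.hidden_size)
        outputs = []
        for t in range(seq_len):
            x_t = x[:, t, :]
            # Probability that this token matters for classification
            keep = torch.sigmoid(self.skip_gate(torch.cat([x_t, h], dim=-1)))
            h_new = self.cell(x_t, h)
            # Soft skip: unimportant tokens barely change the hidden state
            h = keep * h_new + (1 - keep) * h
            outputs.append(h)
        return torch.stack(outputs, dim=1)                 # (batch, seq_len, hidden_dim)

class SAttSGRUCNN(nn.Module):
    """Sketch of SAtt-SGRU-CNN: Skip-GRU, self-attention re-weighting,
    then a multi-kernel CNN over the sequence (standard self-attention used here)."""
    def __init__(self, vocab_size=10000, embed_dim=128, hidden_dim=128,
                 num_filters=100, kernel_sizes=(2, 3, 4), num_classes=2):
        super().__init__()
        self.embedding = nn.Embedding(vocab_size, embed_dim)
        self.skip_gru = SkipGRU(embed_dim, hidden_dim)
        self.self_att = nn.MultiheadAttention(hidden_dim, num_heads=4,
                                              batch_first=True)
        self.convs = nn.ModuleList(
            [nn.Conv1d(hidden_dim, num_filters, k) for k in kernel_sizes])
        self.fc = nn.Linear(num_filters * len(kernel_sizes), num_classes)

    def forward(self, token_ids):                          # (batch, seq_len)
        emb = self.embedding(token_ids)
        seq = self.skip_gru(emb)                           # (batch, seq_len, hidden_dim)
        att, _ = self.self_att(seq, seq, seq)              # re-weight the deep sequence
        feats = att.transpose(1, 2)                        # (batch, hidden_dim, seq_len)
        # Multi-channel CNN: one branch per kernel size, max-pooled over time
        pooled = [torch.relu(conv(feats)).max(dim=-1).values for conv in self.convs]
        return self.fc(torch.cat(pooled, dim=-1))          # (batch, num_classes)

model = SAttSGRUCNN()
print(model(torch.randint(0, 10000, (4, 50))).shape)       # torch.Size([4, 2])
```

A hard skip (bypassing the GRU cell entirely for gated-out tokens) would yield the speed-up on long sequences that the abstract describes; the soft gate above keeps the sketch differentiable and compact.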
Keywords/Search Tags: Text Classification, Deep Learning, Bidirectional Long Short-Term Memory Networks, Gated Recurrent Units, Convolutional Neural Networks