Research On News Text Classification Method Based On Hybrid Model

Posted on: 2024-03-01

Degree: Master

Type: Thesis

Country: China

Candidate: T Q Chu

Full Text: PDF

GTID: 2568307082462054

Subject: Electronic Information (Computer Technology) (Professional Degree)

Abstract/Summary:

Text classification is a key technology in natural language processing. It is an active research area and also a difficult one. On the one hand, relatively few corpus resources are available for Chinese text classification; on the other hand, Chinese is more complex and more difficult to handle than English. Feature extraction with traditional methods is therefore difficult, and machine learning algorithms require manual feature extraction before data can be classified. Compared with traditional machine learning, deep learning simplifies feature extraction and alleviates the problem of high-dimensional, sparse matrices, improving the accuracy of text classification. This thesis integrates several mainstream deep learning models and analyzes and compares them on news text datasets. The specific work is as follows:

1. A network model based on LSTM-CNN-Attention is proposed to address the problem that traditional convolutional neural networks and long short-term memory networks cannot extract text features well. In terms of structure, the Word2vec model is first used to obtain word vector representations of the text. Second, an LSTM network extracts the context information of the full text. The LSTM output is then combined with the original input to obtain new features. Finally, a multi-channel CNN-Attention structure extracts local features. Experimental results on the Netease news dataset show that the model reaches an accuracy of 90.3%, a precision of 87.1%, a recall of 87.5%, and an F1 value of 86.7%, improving on the other models compared in all four metrics.

2. To address the limitations of long short-term memory networks in extracting local information for text classification, a text classification model integrating LSTM-Attention and CNN is proposed. In terms of structure, LSTM is first used to extract global sequence information, and an attention mechanism then assigns weights to the LSTM output. The local information of the original text is extracted by a three-layer convolutional neural network, which adopts a serial structure and selectively fuses the original input information with the CNN output. Finally, the outputs of the two parts are combined to obtain new features, and softmax is used to compute the probability of each category. On the THUCNews dataset, the model reaches an accuracy of 96.8%, a precision of 96.8%, a recall of 96.7%, and an F1 value of 96.9%.
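The attention weighting of the LSTM output and the final softmax step can be illustrated with a minimal pure-Python sketch. It is not the thesis implementation: the dot-product scoring, the `query` vector, the toy hidden states, and the class scores are all assumptions made for illustration, since the abstract does not specify how the attention scores are computed.

```python
import math


def softmax(scores):
    """Turn a list of real-valued scores into probabilities that sum to 1."""
    peak = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention_weights(hidden_states, query):
    """Score each time step by its dot product with `query`, then normalize.

    `query` is a hypothetical learned vector (an assumption); the abstract
    does not say how the LSTM outputs are scored.
    """
    scores = [sum(x * q for x, q in zip(h, query)) for h in hidden_states]
    return softmax(scores)


def reweight(hidden_states, weights):
    """Scale every hidden state by its attention weight."""
    return [[w * x for x in h] for h, w in zip(hidden_states, weights)]


# Toy stand-in for LSTM outputs: 3 time steps, hidden size 2 (made-up values).
lstm_outputs = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
query = [1.0, 1.0]

weights = attention_weights(lstm_outputs, query)
weighted_outputs = reweight(lstm_outputs, weights)

# Final step: softmax over per-class scores (made-up scores for 3 classes).
class_probs = softmax([2.0, 1.0, 0.1])
```

In this toy input the third time step has the largest dot product with `query`, so it receives the largest attention weight. In a full model the reweighted states would be combined with the CNN features before the softmax layer.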

Keywords/Search Tags:

Text classification, Convolutional neural network, Long short-term memory network, Attention mechanism

Related items

1	Application Of Short Term And Long Term Memory Neural Network In Stock Trend Prediction
2	Text Classification Research Based On Deep Neural Network And Attention Mechanism
3	Research On Text Classification Method Combining Attention Mechanism And Bi-GRU
4	Research Of Online Comment Text Sentiment Classification Based On Long-short Term Memory Network
5	Research On Text Classification Of Chinese News Based On Deep Learning
6	Short Text Sentiment Classification Based On Deep Learning
7	Text Sentiment Classification Based On Attention Mechanism
8	Research On Chinese Text Classification Method Based On Long And Short Term Memory Network
9	Research On Relation Classification Via Bidirectional Long Short-Term Memory Networks With Attention Mechanism
10	Chinese Sign Language Recognition Based On Convolutional Network And Long Short Term Memory Network