
A Study On Short Text Classification Based On Deep Learning

Posted on: 2022-05-18
Degree: Master
Type: Thesis
Country: China
Candidate: J Q Qi
GTID: 2518306350495504
Subject: Software engineering
Abstract/Summary:
With the advent of the Internet era, the volume of short text is growing exponentially, and short news text is among the most affected. Faced with a large number of documents, reading each one manually to understand, identify, and classify its content is no longer practical. To address this, the thesis aims to improve the accuracy of short text classification by improving and combining existing deep learning methods and their extended models for this task. It proposes short text classification models and demonstrates their effectiveness experimentally. The specific research content is as follows:

(1) Short text representation is an important step in short text classification: it supplies the prior knowledge that directly affects the performance of the classification model. Common representation methods based on word2vec and on Bidirectional Encoder Representations from Transformers (BERT) word vectors suffer from error propagation during text representation. In response, this thesis adopts the Enhanced Representation through Knowledge Integration (ERNIE) model, proposed by Baidu at the end of 2019, and combines it with a Convolutional Neural Network (CNN) to construct a short text classification model named ERNIE-CNN. The ERNIE model is used to pre-train on short text and obtain word vectors, and these low-dimensional word vectors are fed to the input layer of the neural network, which improves the network's predictive ability and the accuracy of short text classification. The ERNIE-based convolutional neural network is implemented in the PyTorch deep learning framework for the experiments. Its results are compared with those of a conventional CNN using word2vec vectors and with a BERT-based CNN. The results show that the ERNIE-CNN short text classification model proposed in this thesis performs best.

(2) Building on the finding that ERNIE output can serve as the input of a convolutional neural network to improve classification accuracy, the thesis combines ERNIE with deep learning models commonly used for short text classification, in order to identify the ERNIE-based model best suited to the task. Under the same experimental environment and data sets, four short text classification models are compared: ERNIE alone, ERNIE-CNN, ERNIE with a recurrent convolutional neural network, and ERNIE with a deep pyramid convolutional neural network. The model that fuses ERNIE with the deep pyramid convolutional neural network performs best.
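The ERNIE-CNN idea described in (1) can be sketched in PyTorch as a convolutional classification head applied to the token vectors produced by a pre-trained encoder. This is only a minimal illustration under stated assumptions, not the thesis implementation: the class name `TextCNNHead`, the hidden size of 768, the kernel sizes, the filter count, and the number of classes are all hypothetical, and a random tensor stands in for the ERNIE output so that the example runs without downloading a model.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class TextCNNHead(nn.Module):
    """Convolutional classifier over encoder token vectors (illustrative only)."""

    def __init__(self, hidden_size=768, num_filters=64,
                 kernel_sizes=(2, 3, 4), num_classes=10, dropout=0.1):
        super().__init__()
        # One 1-D convolution per kernel size, sliding along the token axis.
        self.convs = nn.ModuleList(
            [nn.Conv1d(hidden_size, num_filters, k) for k in kernel_sizes]
        )
        self.dropout = nn.Dropout(dropout)
        self.fc = nn.Linear(num_filters * len(kernel_sizes), num_classes)

    def forward(self, token_vectors):
        # token_vectors: (batch, seq_len, hidden_size), e.g. encoder hidden states.
        x = token_vectors.transpose(1, 2)  # -> (batch, hidden_size, seq_len)
        # Convolve, apply ReLU, then max-pool over the time axis for each kernel size.
        pooled = [F.relu(conv(x)).max(dim=2).values for conv in self.convs]
        features = torch.cat(pooled, dim=1)  # (batch, num_filters * len(kernel_sizes))
        return self.fc(self.dropout(features))


torch.manual_seed(0)
# Assumption: random values stand in for ERNIE output (4 texts, 32 tokens each).
encoder_output = torch.randn(4, 32, 768)

head = TextCNNHead()
head.eval()  # disable dropout for a deterministic forward pass
with torch.no_grad():
    logits = head(encoder_output)
predicted = logits.argmax(dim=1)  # one predicted class index per text
```

In a full pipeline the stand-in tensor would be replaced by the hidden states of the pre-trained encoder, and the head and encoder would be trained together on labelled short texts. Swapping this head for a recurrent or deep pyramid convolutional variant gives the kind of comparison described in (2).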
Keywords/Search Tags: Short Text Classification, ERNIE Model, Convolutional Neural Network, Deep Learning, Deep Pyramid Convolutional Neural Network