A Study Of Short Text Classification Based On Feature Enhancement

Posted on: 2024-06-12

Degree: Master

Type: Thesis

Country: China

Candidate: Y G Fang

Full Text: PDF

GTID: 2568307097450294

Subject: Computer Science and Technology

Abstract/Summary:

With the rapid development of the digital economy and computer technology, many social media and e-commerce platforms have emerged where people exchange and share messages. These messages mostly take the form of short texts, which has led to explosive growth in short text data. Analyzing and processing this data to uncover the meaning behind it is therefore important, and short text classification has become a meaningful research direction. Short texts are brief and noisy, which leads to sparse feature representations and poor feature expressiveness in traditional short text classification. To address this, this thesis takes short text data as its research object, analyzes in depth both the characteristics of short text data and the shortcomings of existing short text classification models and methods, and proposes two feature enhancement models for short text classification. The specific work is as follows.

The first is CNN-UN (CNN-unsample), a feature enhancement model based on convolutional neural networks. In CNN-UN, multi-scale convolutional neural networks first extract deep features of different dimensions. We then propose a feature enhancement approach that combines upsampling and downsampling: upsampling expands the feature vectors to enrich the semantic feature representation of short texts, and a downsampling convolution over the expanded features then selects the key features. Finally, these deep key features are used for classification.

The second is Bert-VAE, a feature enhancement model that combines the Bert model with a variational auto-encoder. In Bert-VAE, the pretrained Bert model is first used to obtain a rich and comprehensive text feature representation. Then, to address feature sparsity in short texts, a variational auto-encoder generates enhanced sample features from the Bert-encoded features, further improving the text feature representation. Finally, the Bert text features and the enhanced features are fused to predict the text category.

Experiments on a news headline classification dataset show that both proposed feature enhancement approaches improve model performance on the short text classification task.
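
The two enhancement steps described in the abstract can be illustrated with a small, standard-library-only Python sketch. It is not the thesis implementation: the function names, the fixed kernel, nearest-neighbour upsampling, and fusion by concatenation are all assumptions made for illustration, and the mean and log-variance that a trained VAE encoder would produce from Bert features are simply passed in as plain lists.

```python
import math
import random


def upsample(features, factor=2):
    """Nearest-neighbour upsampling: repeat each value `factor` times."""
    return [value for value in features for _ in range(factor)]


def strided_conv(features, kernel, stride=2):
    """1-D valid convolution with a stride, used here as the downsampling step."""
    width = len(kernel)
    return [
        sum(w * x for w, x in zip(kernel, features[start:start + width]))
        for start in range(0, len(features) - width + 1, stride)
    ]


def enhance_by_resampling(features, kernel, factor=2, stride=2):
    """CNN-UN-style step (assumed form): expand the features, then
    select from the expanded vector with a strided convolution."""
    return strided_conv(upsample(features, factor), kernel, stride)


def reparameterize(mean, log_var, rng):
    """VAE sampling step: z = mean + sigma * eps, with eps ~ N(0, 1)."""
    return [
        m + math.exp(0.5 * lv) * rng.gauss(0.0, 1.0)
        for m, lv in zip(mean, log_var)
    ]


def fuse(original, enhanced):
    """Fuse two feature vectors by concatenation (one possible choice)."""
    return list(original) + list(enhanced)


# Hypothetical inputs standing in for convolutional and Bert features.
conv_features = [0.2, 0.9, 0.1, 0.4]
resampled = enhance_by_resampling(conv_features, kernel=[0.25, 0.5, 0.25])

bert_features = [0.3, -0.1, 0.7, 0.0]
rng = random.Random(0)  # seeded so the sketch is reproducible
enhanced = reparameterize(bert_features, [-2.0] * len(bert_features), rng)
fused = fuse(bert_features, enhanced)

print(len(resampled), len(fused))
```

In a trained model the convolution kernels, the VAE encoder, and the classifier on top of the fused vector would all be learned; the sketch only shows how the shapes change at each step.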

Keywords/Search Tags:

Short text classification, Feature enhancement, Un-sample, Bert model, Variational auto-encoder

Related items

1	Research On Neural Topic Modeling Method Based On Variational Auto-Encoder
2	Research On Short Text Classification Method Based On Improved BERT Model
3	Research On Short Text Classification Method Based On Contextual Feature Expression
4	Image And Text Joint Modeling Method Based On Multimodal Weibull Variational Auto-Encoder
5	Research On Short Text Classification Technology Based On Deep Learning
6	Classification Of News Short Text Based On Deep Learning
7	Deep Auto-encoder Framework For SAR Images Change Detection
8	Research And Application Of Representation Learning Based On Variational Auto-encoder
9	Research On Short Text Classification Of Semi-supervised Pre-training Based On Autoencoders And Word Order Dependencies
10	Clustering Analysis Based On Variational Auto-encoder And Mixture Model