| With the rapid development of the digital economy and computer technology, many social media and e-commerce platforms have emerged on which people exchange and share messages. These messages take the form of short texts, which has led to explosive growth in short text data. Analyzing and processing this data to uncover the meaning behind it is therefore important, and short text classification has become a meaningful research direction. Short texts are brief and noisy, which causes sparse feature representations and weak feature expression in traditional short text classification. To address this, this paper takes short text data as its research object, analyzes in depth both the characteristics of such data and the shortcomings of existing short text classification models and methods, and proposes two feature enhancement models for short text classification. The work can be summarized as follows. The first model is CNN-UN (CNN-unsample), a feature enhancement model based on convolutional neural networks. In CNN-UN, multi-scale convolutional neural networks first extract deep features of different dimensions. A feature enhancement approach combining upsampling and downsampling is then applied: upsampling expands the feature vectors to enrich the semantic feature representation of the short text, and a downsampling convolution over the expanded features selects the key features. Finally, these deep key features are used for classification. The second model is Bert-VAE, a feature enhancement model that combines the BERT model with a variational auto-encoder. In Bert-VAE, the pretrained BERT model first produces a rich and comprehensive text feature representation. Then, to address feature sparsity in short texts, a variational auto-encoder generates enhanced sample features from the BERT encoding features, further improving the text feature representation. Finally, the BERT text features and the enhanced features are fused to predict the text category. Experiments on a news headline classification dataset show that both proposed feature enhancement approaches improve model performance on the short text classification task. |
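
The upsample-then-downsample enhancement step described for CNN-UN can be illustrated with a minimal sketch. It is not the paper's implementation: it assumes nearest-neighbor upsampling, a single feature channel, and a fixed hand-picked kernel, whereas a trained model would learn the kernel weights and operate on many channels. The function names `upsample_nearest`, `conv1d`, and `enhance` are hypothetical and chosen only for this illustration.

```python
from typing import List


def upsample_nearest(features: List[float], factor: int = 2) -> List[float]:
    """Expand a feature vector by repeating each value `factor` times."""
    return [value for value in features for _ in range(factor)]


def conv1d(features: List[float], kernel: List[float], stride: int = 1) -> List[float]:
    """Valid (no padding) 1-D cross-correlation with the given stride."""
    width = len(kernel)
    return [
        sum(w * x for w, x in zip(kernel, features[start:start + width]))
        for start in range(0, len(features) - width + 1, stride)
    ]


def enhance(features: List[float], kernel: List[float],
            factor: int = 2, stride: int = 2) -> List[float]:
    """Upsample to expand the features, then apply a strided (downsampling) convolution."""
    expanded = upsample_nearest(features, factor)
    return conv1d(expanded, kernel, stride)


if __name__ == "__main__":
    # Assumed toy input: one channel of features from a multi-scale CNN branch.
    cnn_features = [0.2, 0.9, 0.1, 0.5]
    # Assumed fixed smoothing kernel; a real model would learn these weights.
    kernel = [0.25, 0.5, 0.25]
    print(enhance(cnn_features, kernel))
```

With a stride of 2, the convolution roughly halves the length of the expanded vector, so the output stays close to the original feature size while each value is computed from a wider, upsampled context.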