| As the core of Internet text processing and text mining, text classification has become a key research topic in natural language processing. Given the explosive growth of text data on the Internet, using this data effectively and uncovering the value behind it is of great significance. Traditional text classification relies mainly on shallow machine learning. Deep learning has developed rapidly and achieved major breakthroughs in image recognition and speech recognition, which further demonstrates the feature learning ability of deep models. This paper studies news text classification using deep learning based on a convolutional neural network (CNN). The specific research content and results are as follows: 1. For Chinese word segmentation, given the particular characteristics of Chinese text and the domain-specific focus of this work, this paper adopts the Python-based Jieba word segmentation tool. To achieve better segmentation, the Jieba segmentation dictionary is extended with professional vocabulary from the news domain. 2. To avoid the drawbacks of traditional feature extraction based on manual experience, this paper uses the Skip-Gram model to represent the segmented words as word vectors, yielding a word embedding for each word. The trained word embeddings are then stacked vertically to form a distributed feature representation of each news text, which is fed into the convolutional neural network model as a two-dimensional matrix. 3. This paper introduces deep learning theory and designs a convolutional neural network model for the task of classifying news texts. The model overcomes two weaknesses of shallow machine learning: it ignores the semantic relationships between words, and it easily falls into local optima. 4. In the comparative experiments, two sets of experiments with different word vector dimensions and convolution kernel sizes were run to find suitable values. The results show that a word vector dimension of 128 with convolution kernel sizes of 3, 4, and 5 gives the best results. To verify the effectiveness of CNN-based news text classification, the method is compared with shallow machine learning algorithms and with a Gaussian-initialized convolutional neural network model. The experimental results show that the convolutional neural network model overcomes the shortcomings of shallow machine learning in text classification and improves the accuracy of news text classification. |
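
The forward pass of such a model can be sketched in a few lines of plain Python. This is only an illustrative sketch, not the model from the paper: the word vector dimension (128) and kernel sizes (3, 4, 5) come from the text above, while the number of filters per kernel size, the number of news categories, the ReLU and max-over-time pooling steps, and the random vectors standing in for trained Skip-Gram embeddings are all assumptions. No training is performed.

```python
import math
import random

EMBED_DIM = 128            # word vector dimension reported as best in the text
KERNEL_SIZES = (3, 4, 5)   # convolution kernel sizes reported as best in the text
NUM_FILTERS = 2            # assumption: filters per kernel size (kept tiny here)
NUM_CLASSES = 4            # assumption: number of news categories

rng = random.Random(0)


def rand_matrix(rows, cols, scale=0.1):
    """Return a rows x cols matrix of small random values (stand-in for learned weights)."""
    return [[rng.uniform(-scale, scale) for _ in range(cols)] for _ in range(rows)]


def conv_max_pool(sentence, kernel):
    """Slide a (k x EMBED_DIM) kernel down the sentence matrix, apply ReLU,
    and keep the maximum response (max-over-time pooling)."""
    k = len(kernel)
    best = 0.0  # ReLU outputs are never negative, so 0.0 is a safe floor
    for start in range(len(sentence) - k + 1):
        total = 0.0
        for i in range(k):
            total += sum(x * w for x, w in zip(sentence[start + i], kernel[i]))
        best = max(best, total)
    return best


def softmax(scores):
    """Numerically stable softmax over a list of scores."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    z = sum(exps)
    return [e / z for e in exps]


def forward(sentence, kernels, out_weights):
    """Convolution + pooling for every kernel, then a linear layer and softmax."""
    features = [conv_max_pool(sentence, kern) for kern in kernels]
    scores = [sum(w * f for w, f in zip(row, features)) for row in out_weights]
    return features, softmax(scores)


# One kernel matrix per (kernel size, filter) pair.
kernels = [rand_matrix(k, EMBED_DIM) for k in KERNEL_SIZES for _ in range(NUM_FILTERS)]
out_weights = rand_matrix(NUM_CLASSES, len(kernels))

# Stand-in for a segmented news text: 10 word vectors stacked vertically.
sentence = rand_matrix(10, EMBED_DIM, scale=1.0)

features, probs = forward(sentence, kernels, out_weights)
print(len(features), len(probs))
```

Each kernel spans the full embedding width, so a kernel of size 3 looks at three consecutive words at a time. Using several kernel sizes side by side lets the model pick up phrases of different lengths.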