| With the development of the times, the Internet has become a new kind of information exchange tool that is closely tied to daily life. Meanwhile, a growing number of interactive platforms have appeared online, and the volume of comment text has grown explosively. Classifying these texts and mining the potential value hidden in them is therefore of great significance. Comment texts carry many kinds of information, such as positive and negative opinions about items and emotional information, so their classification is mainly concerned with sentiment and public opinion, a task that is attracting increasing attention from scholars. At present there are two main kinds of methods for text orientation classification. The first is based on dictionaries and corpora. It achieves good classification results, but those results depend heavily on external resources such as dictionaries, and the computational burden is high. The second uses machine learning: feature extraction is combined with a suitable text representation, and the represented text is then classified. In general, different machine learning methods yield different classification performance, and the results are easily affected by text quality. In this paper, forum comment text and e-commerce website comment text serve as the carriers for orientation classification. To cover the characteristics of different texts, we collected one noisier data set from a forum and one more normative data set of e-shopping comments. Both kinds of comment text have complex structure and diverse language styles. For this reason, the text is first preprocessed, then a dedicated word segmentation dictionary is constructed, and finally a text vector representation is combined with χ² statistic feature extraction to build the vector matrix of the text. Because forum and e-commerce comment text is complex and noisy, text preprocessing is combined with machine learning to classify text orientation. After considering the characteristics of various machine learning methods, we chose a BP neural network as the text orientation classification model. During training, comparison experiments were conducted on network structures with one or two layers using the different data sets. Finally, the model that produced the best results was applied to orientation classification. Analysis of the experimental results shows that the BP neural network classification model has strong fault tolerance for noisy data and achieves the best classification performance. |
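The χ² feature extraction step mentioned in the abstract can be illustrated with a minimal sketch. It assumes the standard 2×2 contingency form of the statistic, χ²(t, c) = N(AD − CB)² / ((A + C)(B + D)(A + B)(C + D)), where A and B count documents inside and outside class c that contain term t, and C and D count those that do not. The function names, the toy documents, and the binary presence vector are all hypothetical; the abstract does not specify the exact vector representation that was used.

```python
from collections import Counter


def chi_square_scores(docs, labels, target):
    """Score every term by its chi-square statistic against class `target`.

    docs   : list of token lists (already segmented)
    labels : list of class labels, one per document
    """
    n = len(docs)
    n_pos = sum(1 for y in labels if y == target)
    n_neg = n - n_pos

    # Document frequencies of each term inside and outside the target class.
    in_pos, in_neg = Counter(), Counter()
    for tokens, y in zip(docs, labels):
        counter = in_pos if y == target else in_neg
        counter.update(set(tokens))  # count each term once per document

    scores = {}
    for term in set(in_pos) | set(in_neg):
        a = in_pos[term]   # target-class documents containing the term
        b = in_neg[term]   # other documents containing the term
        c = n_pos - a      # target-class documents without the term
        d = n_neg - b      # other documents without the term
        denom = (a + c) * (b + d) * (a + b) * (c + d)
        scores[term] = n * (a * d - c * b) ** 2 / denom if denom else 0.0
    return scores


def select_features(docs, labels, target, k):
    """Return the k highest-scoring terms (ties broken alphabetically)."""
    scores = chi_square_scores(docs, labels, target)
    return sorted(scores, key=lambda t: (-scores[t], t))[:k]


def to_vector(tokens, vocabulary):
    """Binary presence vector over the selected vocabulary (an assumption)."""
    present = set(tokens)
    return [1.0 if term in present else 0.0 for term in vocabulary]


# Toy, hand-made comments (hypothetical data, for illustration only).
docs = [
    ["good", "phone"],
    ["good", "screen"],
    ["bad", "phone"],
    ["bad", "battery"],
]
labels = ["pos", "pos", "neg", "neg"]

vocabulary = select_features(docs, labels, target="pos", k=2)
matrix = [to_vector(tokens, vocabulary) for tokens in docs]
```

In this toy set, "phone" appears equally often in both classes and scores zero, so it is dropped, while "good" and "bad" separate the classes perfectly and are kept. The rows of `matrix` are the kind of input a classifier such as a BP neural network would then be trained on.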