Research On Fusion Multi-feature Hotel Reviews Classification Algorithm Based On Neural Networks

Posted on:2022-12-11

Degree:Master

Type:Thesis

Country:China

Candidate:S J Zhou

Full Text:PDF

GTID:2518306779996079

Subject:Automation Technology

Abstract/Summary:

The popularity of the Internet has made online reviews a valuable information resource. With the development of e-commerce, the volume of product reviews has surged, and some of these reviews are deliberately fabricated or have no reference value. Traditional research on spam review detection is mostly based on the review text alone and does not take reviewer features into account, which results in low recognition accuracy. This thesis therefore proposes a neural-network-based spam review recognition method that integrates a global-local attention mechanism and combines multiple features. Experiments use the Yelp hotel review data set, and the proposed model fuses review text features with reviewer features to distinguish spam reviews from genuine reviews.

First, for the representation of the review text: traditional word embeddings cannot handle polysemy and therefore fail to capture the semantic information of the text accurately. This thesis uses the BERT pre-trained language model, whose training incorporates the position and sequence information of the text and which uses a bidirectional Transformer encoder to obtain the semantic features of the text. The representation matrix of the review text is obtained from this model.

Next, to suppress noise and irrelevant words, to identify which words are more informative over the global scope, and to capture the global features of the text, a global attention mechanism assigns a weight to each word. When computing the context vector at each step, global attention considers every hidden state of the encoder, which yields the global-attention feature representation matrix of the text. To identify which words are more informative over a local scope, a local attention mechanism assigns weights to words. Unlike global attention, it requires a context window: with the current word at the center of the window, only the hidden states within a certain range before and after it are considered, and a higher attention value indicates that the word carries more information. This yields the local-attention feature representation matrix of the text.

Features are then extracted from each of the two matrices with convolution kernels of three different sizes, and max pooling reduces each matrix to the most significant features in the text representation, producing two new matrices. The reviewer features are assembled into a one-dimensional vector and normalized. After three fully connected layers, this vector is concatenated, in the same dimension, with the two feature representation matrices from the previous step to form a new vector that integrates multiple features. The fused vector passes through three fully connected layers, and the final fully connected layer uses the Sigmoid activation function to perform the classification.

In this thesis, reviews and reviewers are integrated, and the influence of both on spam review recognition is considered together. For text modeling, the BERT pre-trained language model provides a more accurate text representation, and the global-local attention mechanism distinguishes the importance of words. Compared with a traditional convolutional neural network model and several more recent models on the Yelp hotel review data set, the proposed model improves spam identification performance, with accuracy, precision, recall and F1 reaching 90.24%, 90.54%, 89.16% and 89.84% respectively. Ablation experiments were also conducted; their results are in line with expectations and support the validity of the model.
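The difference between the global and local attention mechanisms described in the abstract can be illustrated with a minimal sketch. It is not the thesis implementation: the dot-product scoring, the `query` vector, the toy hidden states, and all function names are assumptions chosen for illustration. Global attention normalizes scores over every hidden state, while local attention normalizes only over a window around a center position and gives zero weight elsewhere.

```python
import math

def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def dot(a, b):
    """Dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))

def global_attention(hidden, query):
    """Weights over ALL positions (assumed dot-product scoring)."""
    return softmax([dot(h, query) for h in hidden])

def local_attention(hidden, query, center, half_width):
    """Weights only inside the window [center - half_width, center + half_width]."""
    lo = max(0, center - half_width)
    hi = min(len(hidden), center + half_width + 1)
    window = softmax([dot(h, query) for h in hidden[lo:hi]])
    # Positions outside the context window receive zero weight.
    return [0.0] * lo + window + [0.0] * (len(hidden) - hi)

def context_vector(hidden, weights):
    """Attention-weighted sum of the hidden states."""
    dim = len(hidden[0])
    return [sum(w * h[d] for w, h in zip(weights, hidden)) for d in range(dim)]

# Toy hidden states for a five-token review (hypothetical values).
hidden = [[0.1, 0.0], [0.9, 0.2], [0.0, 0.8], [0.4, 0.4], [0.2, 0.1]]
query = [1.0, 0.5]  # hypothetical query vector

global_w = global_attention(hidden, query)
local_w = local_attention(hidden, query, center=2, half_width=1)
global_ctx = context_vector(hidden, global_w)
local_ctx = context_vector(hidden, local_w)
```

In this sketch both weight vectors sum to 1, but `local_w` is nonzero only for positions 1 to 3, so the local context vector ignores tokens outside the window.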
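As a quick consistency check on the reported metrics, F1 is the harmonic mean of precision and recall. The short sketch below recomputes it from the precision and recall figures given in the abstract; the function name is illustrative, and only the two input values come from the source.

```python
def f1_score(precision, recall):
    """Harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)

# Precision and recall as reported in the abstract.
precision = 0.9054
recall = 0.8916

f1 = f1_score(precision, recall)
print(f"F1 = {f1:.4f}")
```

The recomputed value agrees with the reported F1 of 89.84% to two decimal places.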

Keywords/Search Tags:

Spam identification, Neural network, Attention mechanism, Classification

Related items

1	Research And Analysis Of Spam Classification Based On CNN Two-Way LSTM Attention Mechanism
2	Research And Implementation Of Spam Review Detection System Based On Neural Network
3	Research On Spam Review Detection Based On Integrated Multi-feature
4	Research And Implementation Of Spam Review Detection System Integrating Text Semantic And Sentiment
5	Research On Text Classification Method And Its Interpretability Based On Attention Mechanism
6	A Research Of Sentiment Classification Algorithm And Application Based On Attention Mechanism
7	Research On Text Classification Model Based On BGRU And Self-Attention Mechanism
8	Research On Review Spam Detection Based On Hierarchical Neural Network And Multivariate Features
9	Text Representation And Classification Based On Deep Learning With Improved Attention Mechanism
10	Text Classification Research Based On Deep Neural Network And Attention Mechanism