Construction Of The Review Spam Classification Model Of The Third-party Review Websites

Posted on: 2019-06-18  Degree: Master  Type: Thesis
Country: China  Candidate: Q Wu
GTID: 2429330548983371  Subject: Information Science
Abstract/Summary:
The rapid development of third-party review websites, represented by dianping.com, has set off a surge of user reviews on the Internet and has led more and more consumers to read reviews before making decisions. As feedback on consumers' product experience, these online reviews contain a great deal of important information. The information expressed in reviews is likely to affect an individual's perceptions and decision-making behavior, because people tend toward a herd mentality under the influence of group wisdom. However, as a website where anyone is free to post reviews, dianping.com has a large number of active users and online reviews, so spam reviews such as malicious or irrelevant reviews inevitably appear. These spam reviews are misleading and, to a certain extent, reduce the reference value of review information, leading potential consumers to make wrong judgments. In the U.S. presidential campaign, for example, the "Online Water Army" played a pivotal role. Whether a review is genuine therefore has a significant impact on users' decision-making behavior, and it is particularly important to discover and identify spam reviews in a timely manner.

The purpose of this paper is therefore to build a model for detecting spam reviews based on machine learning classification algorithms, so as to reduce the cost of identifying spam reviews on dianping.com and improve recognition efficiency. In terms of research methods, this paper combines machine learning with empirical analysis. Building on a review and analysis of the related domestic and foreign literature, the study proceeds through data collection and cleaning, natural language processing, sentiment analysis, and feature mining. Given the characteristics and size of the data, we build a classification model based on the naive Bayes algorithm. During model construction, different classification models are built from different combinations of features, yielding 92 classification models in total. By comparing the test accuracy, precision, recall, and F1 value of each model, Model 10 was selected as the classification model for this study. Its test accuracy and F1 value are 76.13% and 76% respectively, which indicates that the model has good classification performance.

Finally, based on the resulting classification model, this paper discusses the benefits of the model from three perspectives: dianping.com, shops, and consumers. Using this classification model can not only reduce the proportion of spam reviews on dianping.com, but also give shops effective information for addressing their own shortcomings and give consumers a more reliable reference for their decisions.
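The model-selection step described in the abstract (train one naive Bayes classifier per feature combination, then compare accuracy, precision, recall, and F1 on a test set) can be sketched roughly as follows. This is only an illustration under stated assumptions, not the thesis's implementation: the Bernoulli variant with Laplace smoothing, the feature names `has_url`, `very_short`, and `extreme_rating`, and the toy data are all hypothetical.

```python
from itertools import combinations
from math import log

def train_nb(rows, labels, features):
    """Fit a Bernoulli naive Bayes model on binary features (Laplace smoothing)."""
    prior, cond = {}, {}
    for c in sorted(set(labels)):
        members = [r for r, y in zip(rows, labels) if y == c]
        prior[c] = log(len(members) / len(rows))
        # P(feature = 1 | class), smoothed so it is never exactly 0 or 1
        cond[c] = {f: (sum(r[f] for r in members) + 1) / (len(members) + 2)
                   for f in features}
    return prior, cond

def predict(model, row, features):
    """Return the class with the highest log posterior for one review."""
    prior, cond = model
    def score(c):
        total = prior[c]
        for f in features:
            p = cond[c][f]
            total += log(p if row[f] else 1 - p)
        return total
    return max(prior, key=score)

def evaluate(y_true, y_pred, positive=1):
    """Accuracy, precision, recall, and F1 with `positive` as the spam class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    accuracy = sum(t == p for t, p in pairs) / len(pairs)
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}

def compare_feature_sets(train, test, features):
    """Train and evaluate one model per non-empty feature combination."""
    x_train, y_train = [r for r, _ in train], [y for _, y in train]
    x_test, y_test = [r for r, _ in test], [y for _, y in test]
    results = []
    for k in range(1, len(features) + 1):
        for subset in combinations(features, k):
            model = train_nb(x_train, y_train, subset)
            y_pred = [predict(model, r, subset) for r in x_test]
            results.append((subset, evaluate(y_test, y_pred)))
    return results

# Hypothetical binary features and toy data (label 1 = spam), for illustration only.
FEATURES = ("has_url", "very_short", "extreme_rating")
TRAIN = [
    ({"has_url": 1, "very_short": 1, "extreme_rating": 1}, 1),
    ({"has_url": 1, "very_short": 0, "extreme_rating": 1}, 1),
    ({"has_url": 1, "very_short": 1, "extreme_rating": 0}, 1),
    ({"has_url": 0, "very_short": 0, "extreme_rating": 0}, 0),
    ({"has_url": 0, "very_short": 1, "extreme_rating": 0}, 0),
    ({"has_url": 0, "very_short": 0, "extreme_rating": 1}, 0),
]
TEST = [
    ({"has_url": 1, "very_short": 0, "extreme_rating": 0}, 1),
    ({"has_url": 0, "very_short": 1, "extreme_rating": 1}, 1),
    ({"has_url": 0, "very_short": 0, "extreme_rating": 0}, 0),
    ({"has_url": 1, "very_short": 1, "extreme_rating": 0}, 0),
]

results = compare_feature_sets(TRAIN, TEST, FEATURES)
best_features, best_metrics = max(results, key=lambda r: r[1]["f1"])
print(best_features, best_metrics)
```

Ranking by F1 here is one possible selection rule; the abstract says only that the four metrics were compared, so the actual criterion used to choose Model 10 may differ.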
Keywords/Search Tags:dianping.com, review spam, machine learning