| With rising living standards and the growth of the Internet, online shopping has become an indispensable part of daily life. Before buying online, most people read product reviews to get a preliminary sense of a product's price, performance, and other attributes. Users tend to express themselves freely in such reviews, which therefore carry rich and useful information. However, product reviews also contain a large amount of spam, and spam reviews can distort users' purchase expectations. Analyzing the features of spam reviews and detecting them effectively is therefore a necessary line of research. This paper proposes a method for identifying spam in Chinese product reviews. The main contributions are as follows: (1) We first review the background on spam reviews and analyze the deficiencies of existing detection methods. Based on an analysis of review features and on related research, we divide spam reviews into two types, content spam reviews and fake spam reviews, and design a different detection method for each. (2) From the perspective of review content, we propose a content-based spam review detection method built on the fuzzy support vector machine (FSVM). To address the curse of dimensionality on large-scale review data sets, we further propose combining latent semantic analysis (LSA) with the FSVM algorithm. Without affecting FSVM's construction of the optimal separating hyperplane, LSA removes noise words and words with identical or nearly identical latent semantics from the review data set, which reduces the dimensionality of the training data. The proposed method is verified experimentally. (3) Based on an analysis of the fake spam detection problem, we model both the review content and the reviewer's behavior, and construct six kinds of features that can be updated at any time during detection. On this basis, we design two online detection methods for spam reviews, one supervised and one unsupervised. Finally, experiments verify the effectiveness of the proposed methods in detecting both content spam reviews and fake spam reviews. The results are satisfactory, and the methods show good prospects for application... |
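
The abstract does not state how the fuzzy memberships used by FSVM are assigned. As a rough illustration only, the sketch below uses one common scheme from the FSVM literature, in which a sample's membership shrinks with its distance from its own class centroid, so that outlying reviews influence the hyperplane less. The function names, the `delta` parameter, and the toy vectors (standing in for LSA-reduced review vectors) are assumptions made for this example, not details taken from the paper.

```python
import math


def centroid(vectors):
    """Component-wise mean of a non-empty list of equal-length vectors."""
    n = len(vectors)
    return [sum(col) / n for col in zip(*vectors)]


def fuzzy_memberships(vectors, labels, delta=1e-6):
    """Return one membership in (0, 1] per sample.

    Assumed scheme (not confirmed by the abstract):
        s_i = 1 - d_i / (r + delta)
    where d_i is the distance from sample i to its own class centroid
    and r is the largest such distance within that class.
    """
    # Group samples by class label and compute each class centroid.
    by_class = {}
    for vec, label in zip(vectors, labels):
        by_class.setdefault(label, []).append(vec)
    centers = {label: centroid(vs) for label, vs in by_class.items()}

    # Distance of every sample to the centroid of its own class.
    dists = [math.dist(vec, centers[label])
             for vec, label in zip(vectors, labels)]

    # Class radius: the largest distance observed within each class.
    radius = {}
    for d, label in zip(dists, labels):
        radius[label] = max(radius.get(label, 0.0), d)

    return [1.0 - d / (radius[label] + delta)
            for d, label in zip(dists, labels)]


# Hypothetical low-dimensional vectors, e.g. the output of an LSA step.
reduced = [[0.0, 0.0], [0.2, 0.0], [3.0, 0.0],
           [5.0, 5.0], [5.2, 5.0], [5.1, 5.6]]
labels = [1, 1, 1, 0, 0, 0]  # 1 = spam, 0 = normal (assumed encoding)
weights = fuzzy_memberships(reduced, labels)
```

Memberships of this kind are typically used as per-sample weights on the slack penalty when training the SVM. In scikit-learn, for instance, they could be passed as `sample_weight` to `SVC.fit`, although whether the paper's FSVM is implemented that way is not stated.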