
Research On Fake Reviews Based On Semi-supervised Learning

Posted on: 2019-03-13
Degree: Master
Type: Thesis
Country: China
Candidate: M H Wang
Full Text: PDF
GTID: 2429330572455375
Subject: Management Science and Engineering
Abstract/Summary:
With the growth of online shopping, consumers have gradually shifted from traditional offline shopping to more convenient online shopping and have become accustomed to commenting on the products they buy. E-commerce platforms have thus accumulated large volumes of online product reviews, which are a valuable data resource for businesses, potential consumers, and researchers. Because online reviews influence consumers' purchasing decisions and therefore product sales, fake content has appeared among product reviews. These fake reviews mislead consumers and reduce the reference value of online reviews, so identifying them is important. Online product reviews are the most representative form of review information and an ideal data source for fake review detection.

Online product reviews contain many unlabeled examples and few labeled ones, and semi-supervised learning is a mainstream machine learning approach that can make effective use of unlabeled data. This paper therefore applies semi-supervised learning to the fake review detection task. It first reviews the research status and development trends of fake review detection, then introduces the principles of semi-supervised learning and its categories of methods, and finally applies three mainstream semi-supervised algorithms, Co-Training, Tri-Training, and Co-Forest, to the task of recognizing fake reviews.

The paper focuses on online product reviews, semi-supervised learning, and fake review detection. The main research work is as follows:

(1) A semi-supervised learning method is proposed for fake review detection. Because online reviews contain many unlabeled examples and few labeled ones, the three mainstream algorithms in the field are used to iteratively train multiple classifiers and make full use of the unlabeled data: the labeled training set is expanded, and the expanded sets are used to update the classification model and improve its performance. Experiments on Amazon review data show that the semi-supervised learning algorithms achieve better recognition of fake reviews.

(2) In the feature extraction phase, combining information about the subject of the review with textual information, and based on a statistical analysis of the review data set, the paper extracts 22-dimensional mixed features of 3 types from three perspectives: the review text, the reviewer, and the product. Three supervised learning algorithms, Naive Bayes, Maximum Entropy, and Support Vector Machine, are then used to test the recognition performance of different feature combinations under different classifiers. The results show that the hybrid features predict better and that the Naive Bayes classifier achieves the best recognition performance, so it is used in the subsequent fake review detection model.
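The iterative procedure described in (1), training several classifiers and moving confidently predicted unlabeled reviews into the labeled set, can be illustrated with a minimal Co-Training-style sketch. This is not the thesis's implementation. The data is synthetic, the two feature views (standing in for review-text features and reviewer features) are assumptions, a nearest-centroid classifier stands in for the classifiers used in the thesis, and every function name (`make_toy_data`, `train_centroids`, `predict`, `co_train`) is hypothetical.

```python
import random


def make_toy_data(n=40, seed=0):
    """Synthetic stand-in for review data: two feature views per review.

    Assumption: view_a mimics review-text features, view_b reviewer features.
    Only the first four reviews carry a label (1 = fake, 0 = genuine).
    """
    rng = random.Random(seed)
    truth = [i % 2 for i in range(n)]
    view_a = [[rng.gauss(5.0 * y, 0.5), rng.gauss(5.0 * y, 0.5)] for y in truth]
    view_b = [[rng.gauss(-5.0 * y, 0.5), rng.gauss(5.0 * y, 0.5)] for y in truth]
    labels = [y if i < 4 else None for i, y in enumerate(truth)]
    return view_a, view_b, labels, truth


def train_centroids(rows, labels):
    """Fit a nearest-centroid classifier: one mean vector per class."""
    sums, counts = {}, {}
    for x, y in zip(rows, labels):
        acc = sums.setdefault(y, [0.0] * len(x))
        for i, v in enumerate(x):
            acc[i] += v
        counts[y] = counts.get(y, 0) + 1
    return {y: [s / counts[y] for s in acc] for y, acc in sums.items()}


def predict(centroids, x):
    """Return (label, confidence); confidence is the squared-distance margin
    between the nearest and second-nearest class centroid."""
    dists = sorted(
        (sum((a - b) ** 2 for a, b in zip(c, x)), y) for y, c in centroids.items()
    )
    margin = dists[1][0] - dists[0][0] if len(dists) > 1 else 0.0
    return dists[0][1], margin


def co_train(view_a, view_b, labels, rounds=20, per_round=3):
    """Co-Training-style loop: each view's classifier pseudo-labels its most
    confident unlabeled rows, and both classifiers retrain on the grown set."""
    labels = list(labels)
    for _ in range(rounds):
        labeled = [i for i, y in enumerate(labels) if y is not None]
        unlabeled = [i for i, y in enumerate(labels) if y is None]
        if not unlabeled:
            break
        new_labels = {}
        for view in (view_a, view_b):
            model = train_centroids(
                [view[i] for i in labeled], [labels[i] for i in labeled]
            )
            scored = sorted(
                ((predict(model, view[i]), i) for i in unlabeled),
                key=lambda t: -t[0][1],
            )
            for (y_hat, _), i in scored[:per_round]:
                new_labels.setdefault(i, y_hat)  # first view to claim a row wins
        for i, y in new_labels.items():
            labels[i] = y
    return labels


if __name__ == "__main__":
    view_a, view_b, labels, truth = make_toy_data()
    final = co_train(view_a, view_b, labels)
    correct = sum(1 for p, t in zip(final, truth) if p == t)
    print(f"pseudo-label agreement with ground truth: {correct}/{len(truth)}")
```

The sketch keeps a single shared labeled pool that both classifiers grow. Tri-Training differs in using three classifiers and labeling an example for one of them when the other two agree, and Co-Forest extends that idea to an ensemble of tree classifiers.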
Keywords/Search Tags:Semi-Supervised Learning, Fake Review Detection, Co-Training, Tri-Training, Co-Forest