Mining E-Commerce Fake Comment Based On Multi-Modal Features Fusion

Posted on:2023-06-14

Degree:Master

Type:Thesis

Country:China

Candidate:Y J Zeng

Full Text:PDF

GTID:2568307151483734

Subject:Applied statistics

Abstract/Summary:

With the rapid growth of e-commerce, online shopping has become a convenient and popular way to shop, and shopping platforms have emerged one after another. Product reviews have become an important reference for consumers making purchase decisions. However, the more heavily consumers rely on product reviews, the more fake reviews flood the online shopping environment. Consumers therefore need a tool that helps them identify fake reviews quickly and efficiently. This is the background of the research, whose main contents and contributions are as follows.

First, a Chinese corpus of e-commerce mobile phone reviews was constructed. Few Chinese corpora are available for fake review mining, and no relatively complete Chinese e-commerce corpus exists to support experiments. This paper therefore collected 10,000 mobile phone reviews with a web crawler, preprocessed them, and labeled them manually. This corpus is the basis of the experimental work in this paper.

Second, exploratory analysis of the review data was conducted to extract multi-modal features. Shopping festivals have become a craze on shopping platforms in recent years, so this study examined whether the authenticity of a review is correlated with its posting time during shopping festivals and proposed a new basic feature, the festival time window. The text length distributions of the two types of reviews also differ significantly, and word frequency analysis shows that the two types have their own characteristics in emotional tendency and text expression. Based on this analysis, features from several modalities were extracted, including the festival time window, text length, emotional tendency, and degree of brand mention. The chi-square test and Spearman's rank correlation coefficient were then used to test the independence between the dependent variable and each candidate feature, and the features that passed the test were selected for training the fake review mining model.

Third, a fake review mining model was built with machine learning algorithms. Existing work on e-commerce fake review mining rarely treats the text content itself as a semantic feature. In this paper, the word vectors of a review are used as a semantic feature and fed into the model. The advantages and disadvantages of TF-IDF, Word2Vec, and BERT were compared first, and Word2Vec proved the most suitable for training word vectors on the experimental corpus of this paper. The trained Word2Vec vectors were then combined with text sentiment to form the semantic feature, which was fused with the basic features and keyword features. Naive Bayes, Logistic Regression, Support Vector Machine, Random Forest, and AdaBoost were each trained on the fused features, and the best-performing fake review mining model was selected.
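The festival time window feature and the chi-square independence test mentioned in the abstract can be illustrated with a minimal sketch. It is not the thesis's implementation: the window dates, the function names `in_festival_window` and `chi_square_2x2`, and the toy data are all assumptions made for illustration, and the abstract does not say which festivals or date ranges were used.

```python
from datetime import date
from math import erfc, sqrt

# Hypothetical festival windows (assumed dates, for illustration only).
FESTIVAL_WINDOWS = [
    (date(2022, 6, 1), date(2022, 6, 20)),
    (date(2022, 11, 1), date(2022, 11, 12)),
]


def in_festival_window(posted, windows=FESTIVAL_WINDOWS):
    """Return 1 if the review date falls inside any festival window, else 0."""
    return int(any(start <= posted <= end for start, end in windows))


def chi_square_2x2(feature, labels):
    """Pearson chi-square test of independence for two binary sequences.

    Returns (statistic, p_value). No continuity correction is applied, and
    every row and column of the contingency table is assumed to be non-empty.
    """
    # Observed counts: observed[feature value][label value].
    observed = [[0, 0], [0, 0]]
    for f, y in zip(feature, labels):
        observed[f][y] += 1

    n = sum(sum(r) for r in observed)
    row_totals = [sum(r) for r in observed]
    col_totals = [observed[0][j] + observed[1][j] for j in range(2)]

    statistic = 0.0
    for i in range(2):
        for j in range(2):
            expected = row_totals[i] * col_totals[j] / n
            statistic += (observed[i][j] - expected) ** 2 / expected

    # For 1 degree of freedom, P(X > x) equals erfc(sqrt(x / 2)).
    return statistic, erfc(sqrt(statistic / 2))


# Toy data (fabricated for the example): 1 = in window / fake, 0 = otherwise.
feature = [0] * 30 + [1] * 30
labels = [0] * 20 + [1] * 10 + [0] * 10 + [1] * 20
stat, p_value = chi_square_2x2(feature, labels)
print(f"chi-square = {stat:.2f}, p = {p_value:.4f}")
```

On this toy table the statistic is about 6.67, above the 3.84 critical value for one degree of freedom at the 0.05 level, so the feature would be kept as associated with the label. A feature with more than two levels would need a general contingency-table test, which this sketch does not cover.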

Keywords/Search Tags:

Fake reviews mining, Multi-modal fusion, Text classification, Machine learning

Related items

1	Fake Product Reviews Identification Based On Deep Learning
2	Research And Implementation Of Fake News Intelligent Identification Technology Based On Heterogeneous Multi-modal Data
3	Research On Fake Review Identification Based On Text And User Behaviour Mining
4	Multi-Label Text Classification And Topic Mining Based On Live E-Commerce Return Reviews
5	Multi-modal Learning Based On Single-modal And Multi-modal Data
6	Research On Techniques For Free Text Classification
7	A Fake Reviews Detection Model Based On Convolutional Neural Network And GRU
8	Fake Comments Based On Fusion Feature Detection Algorithm
9	Research And Implementation Of Multi Model Fake News Classification System Based On Bert
10	Research On Mining Techniques Of Product Reviews Based On Multi-Document Summarization