Research On Review Spammer Groups Detection Based On Deep Learning

Posted on: 2023-07-15
Degree: Master
Type: Thesis
Country: China
Candidate: J Y Wu
GTID: 2568306848962119
Subject: Computer Science and Technology
Abstract/Summary:
With the rapid development of the network information age, online shopping has grown explosively, and online reviews have become an important source of information for consumers. Driven by enormous profits, many businesses hire spammers to post positive or negative reviews of target products in an organized and premeditated way, which distorts consumers' purchase decisions. Compared with individual spammers, spammer groups have a greater impact on those decisions.

To detect spammer groups quickly and accurately, researchers in China and abroad have proposed solutions from different angles. These methods generally have two stages: generating candidate groups and detecting spammer groups. The candidate-generation stage has the following problems: (1) the information in the dataset is not fully utilized; (2) the quality of the generated candidate groups is poor; (3) it is difficult to determine the thresholds needed when candidate groups are generated from hand-crafted features. The detection stage has the following problems: (1) the differences between indicators are not reflected; (2) hand-designed group detection indicators are not universal. To address these problems, this thesis studies the following three aspects.

Firstly, to address the failure of existing methods to fully utilize the dataset when generating candidate groups and their neglect of differences between indicators when detecting spammer groups, this thesis proposes a spammer group detection method based on a heterogeneous graph attention network. The method analyzes the original dataset from different perspectives and constructs a multi-relational user-product heterogeneous graph. A heterogeneous graph attention network model is then built to obtain low-dimensional vector representations of the nodes. Candidate groups are obtained with the SCAN community discovery algorithm based on vector measurement, and spammer groups are detected by combining a sorting algorithm with the CRITIC indicator weight calculation method.

Secondly, to address the low quality of candidate groups generated by existing methods and the inapplicability of group indicators in the detection stage, this thesis proposes a spammer group detection method based on periodic subgraph mining and an auto-encoder classifier. The method constructs user relationship subgraphs for different stages based on time periodicity, and uses the k-core algorithm to mine the subgraphs and generate candidate groups. Some normal groups are sampled as the training set; a Doc2Vec model is trained on them to obtain group behavior features, which are input into the auto-encoder classifier for training. The remaining groups are input into the trained Doc2Vec model as the test set to obtain their group features, and the trained auto-encoder classifier is used to detect spammer groups.

Thirdly, existing methods that generate candidate groups through hand-crafted feature engineering require thresholds, such as time thresholds, to be set, and hand-designed group indicators are not universal. To address this, the thesis proposes a spammer group detection method based on Doc2Vec and a teacher-student distillation network. In this method, each user's review behavior is input into a Doc2Vec model as a word sequence to obtain a user vector representation. A user relationship graph is constructed based on cosine similarity and a greedy strategy, and the connected component algorithm is used to generate candidate groups. Some normal groups are sampled as the training set; the Doc2Vec model is trained on them to obtain the behavior features of the training set, which are input into the teacher-student distillation network for training. The remaining groups are input into the trained Doc2Vec model as the test set to obtain their features, and the teacher-student distillation network is used to detect spammer groups in the test set.

Finally, experiments are carried out on Amazon and Yelp datasets. The three detection methods proposed in this thesis are compared with existing methods to demonstrate their effectiveness and rationality.
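
The candidate-generation step of the third method (a user relationship graph built from cosine similarity, followed by connected components) can be illustrated with a minimal sketch. It is not the thesis's implementation: the abstract does not define the greedy strategy, so the sketch assumes each user is linked to its single most similar peer, and only when that similarity is positive. The function names `cosine` and `candidate_groups`, the `min_size` parameter, and the plain-list embeddings standing in for Doc2Vec user vectors are all hypothetical.

```python
import math
from collections import defaultdict


def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def candidate_groups(user_vectors, min_size=2):
    """Link each user to its most similar peer, then return connected components.

    user_vectors maps a user id to an embedding (list of floats).
    ASSUMPTION: "one best neighbour per user, positive similarity only"
    is an illustrative reading of the greedy strategy, not the thesis's rule.
    """
    users = sorted(user_vectors)
    parent = {u: u for u in users}  # union-find forest over users

    def find(u):
        # Follow parent links to the root, halving the path on the way.
        while parent[u] != u:
            parent[u] = parent[parent[u]]
            u = parent[u]
        return u

    for u in users:
        best, best_sim = None, 0.0
        for v in users:
            if v == u:
                continue
            sim = cosine(user_vectors[u], user_vectors[v])
            if sim > best_sim:
                best, best_sim = v, sim
        if best is not None:
            # Adding the edge (u, best) merges their components.
            parent[find(u)] = find(best)

    groups = defaultdict(list)
    for u in users:
        groups[find(u)].append(u)
    # Components below min_size are dropped as non-groups.
    return sorted(g for g in groups.values() if len(g) >= min_size)
```

Calling `candidate_groups` with a dictionary of user embeddings returns a sorted list of user-id lists, one per connected component. Union-find is used here only because it computes connected components without building an explicit adjacency structure; a graph library would serve equally well.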
Keywords/Search Tags: Spammer group detection, Heterogeneous graph attention network, Community discovery, Periodic subgraph mining, Knowledge distillation