Research On Spam Review Detection Based On Heterogeneous Graph Representation Of Multiple Relationships

Posted on: 2024-08-28  Degree: Master  Type: Thesis
Country: China  Candidate: J N Tang  Full Text: PDF
GTID: 2568307061485824  Subject: Computer Science and Technology
Abstract/Summary:
Spam review detection refers to technology that automatically distinguishes genuine reviews from spam reviews in review data. With the rapid development of e-commerce, people have gradually shifted from offline to online shopping and share their opinions on products and services on social platforms. However, some unscrupulous merchants hire spammers to post fraudulent reviews on these platforms to mislead consumers for profit. Large numbers of spammers hide among normal reviewers, which not only causes losses to consumers but also damages the reputation and business model of e-commerce platforms. Although researchers have proposed many spam detection methods, several problems remain unsolved. (1) Real-world review datasets are commonly imbalanced: normal reviews far outnumber fake reviews, so a model learns mostly the features of normal reviews and performs poorly at identifying fake ones. (2) Faced with collective, interrelated spam groups, traditional machine-learning detection methods train classifiers only on features of the reviews themselves, which spammers can easily disguise, and often ignore the network structure and relationship information among spam groups, which is difficult to disguise. (3) In the large review graphs that graph neural networks operate on, each review node has an enormous number of neighbors, many of them noisy; aggregating all neighbors also aggregates the information of these noisy nodes, resulting in poor detection performance. To address these problems, this paper proposes a spam detection method based on contrastive supervised learning and GraphSAGE, and a spam detection method based on a multi-relational heterogeneous graph and an attention mechanism. The main work and contributions of this paper are summarized as follows.

First, this paper proposes a spam detection model based on GraphSAGE and contrastive supervised learning, which addresses two problems: real review datasets are too large to process directly, and their samples are imbalanced. The model handles large-scale graph data efficiently by constructing a single-relationship graph and applying the sampling and aggregation methods of the graph neural network GraphSAGE, and it can speed up training through parallel operations. GraphSAGE aggregates the information of sampled neighboring nodes and generates a fixed-length vector representation for each node. We then design a contrastive supervised learning module that uses node label data to constrain, in the embedding space, the target node vector aggregated from its neighbors' information, pulling nodes of the same class closer together and pushing nodes of different classes farther apart. This makes the node vectors of each class more discriminative, which helps mitigate the imbalance in the dataset and yields better detection and classification performance.

Second, this paper proposes another spam detection model based on a multi-relational heterogeneous graph and an attention mechanism, which addresses feature camouflage, relation camouflage, and noisy neighbor nodes among fake reviewers. The model replaces the single-relationship graph of the first model with a multi-relationship graph, which mines the graph structure information among spam groups more comprehensively and reduces the false positives and false negatives caused by the inherent limitations of any single relationship graph. The model also uses a method based on label-aware cosine similarity to filter out some of the noisy neighbors of the target node, so that the neighbors most likely to belong to the same class as the target node are retained for aggregation. For the aggregation step, the model designs new message aggregation functions within and between the multiple relationship graphs. These functions introduce attention mechanisms that aggregate more useful neighbor information, optimize the message aggregation of the graph neural network, and improve detection performance.

We conducted extensive experiments on the public review datasets Yelp and Amazon, comparing the results on multiple metrics. The experimental results show that the two proposed models match or exceed most of the baseline models for fake review detection and are more practical.
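The neighbor filtering and aggregation steps described in the abstract can be illustrated with a minimal sketch. It is not the thesis implementation: the function names, the `keep_ratio` parameter, and the toy vectors are all hypothetical, the similarity is a plain cosine over whatever embeddings are supplied (the abstract does not say how the label-aware embeddings are produced), and the learned weights and attention of the actual models are omitted.

```python
import math


def cosine_similarity(u, v):
    """Cosine similarity of two equal-length vectors; 0.0 if either is all zeros."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def filter_neighbors(target, neighbors, keep_ratio=0.5):
    """Keep the fraction of neighbors most similar to the target.

    keep_ratio is an assumed hyperparameter; at least one neighbor
    is kept whenever the neighbor list is non-empty.
    """
    if not neighbors:
        return []
    k = max(1, math.ceil(keep_ratio * len(neighbors)))
    ranked = sorted(
        neighbors,
        key=lambda n: cosine_similarity(target, n),
        reverse=True,
    )
    return ranked[:k]


def mean_aggregate(target, kept):
    """GraphSAGE-style mean aggregation without learned weights:
    concatenate the target vector with the mean of the kept neighbors."""
    dim = len(target)
    if not kept:
        return list(target) + [0.0] * dim
    mean = [sum(n[i] for n in kept) / len(kept) for i in range(dim)]
    return list(target) + mean


# Toy example: two neighbors point the same way as the target, two do not.
target = [1.0, 0.0]
neighbors = [[0.9, 0.1], [0.8, 0.2], [-1.0, 0.0], [0.0, 1.0]]
kept = filter_neighbors(target, neighbors, keep_ratio=0.5)
aggregated = mean_aggregate(target, kept)
```

In this toy run the two dissimilar neighbors are dropped before aggregation, so they do not contribute to the target node's representation. A full model would apply a learned transformation after the concatenation and, in the second model, weight the neighbors by attention.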
Keywords/Search Tags: Multi-relational Heterogeneous Graph, Graph Neural Network, Contrastive Supervised Learning, Attention Mechanism