| Online product reviews are an essential reference for consumers and a key factor in their purchase decisions. However, unethical merchants often post fraudulent reviews to sway those decisions for monetary gain. Countless fake reviews seriously distort consumers’ assessments of product quality and disrupt the order of the online consumption market. Fake reviews are ubiquitous but not easily discernible. Therefore, detecting fake reviews and establishing a clean online review environment are highly significant for protecting consumers’ legal interests, reducing unfair competition in the consumption market, and maintaining normal market order. In practice, there are many newly registered users. Because these users have little review behavior and few review samples, detection systems have difficulty constructing effective behavioral features, which makes reviews posted by new users hard to detect; this is known as the cold-start problem in fake review detection, and the lack of behavioral features makes it hard to overcome. Fake reviews that are not detected in time under cold-start conditions can cause substantial losses. Some existing methods aim to construct behavioral features, but they suffer from various shortcomings. Attribute feature-based approaches require high-quality labeled data that are difficult to obtain in practice, and the behavioral information modeled by potential interaction-based approaches is not sufficiently valid. The core difficulty of the cold-start problem in fake review detection is how to explore other, more powerful features, or to identify useful features via interactive behavior, when behavioral data are insufficient. This thesis focuses on the cold-start problem in deceptive review detection, and its main contributions and innovations are as
follows: (1) The review text contains rich semantic information, and exploring deep semantic features can alleviate feature scarcity in the cold-start scenario. To address the shortcomings of existing text representation methods for deceptive reviews in encoding deep semantics and processing long texts, an improved review text representation method based on fine-tuning XLNet is proposed. The method uses pre-trained XLNet to encode deep semantic features of words; XLNet’s stronger modeling ability and its advantage in handling long sequences support the effectiveness and robustness of the representation in mining deceptive information. An attention pooling layer is introduced to learn the importance of each word, and word information is aggregated according to this importance to generate the review text representation. The word importance scores also help to analyze the characteristics of real and fake review data. In the experiments, a comparative analysis of high-weight word frequencies shows that the proposed method can effectively characterize the dataset, and a comparison with baseline methods shows that it performs well in representing review text. (2) The root cause of the cold-start problem in detecting fake reviews is the lack of behavioral features. In the graph network composed of reviewers, reviews, and products, there are potential connections between newly registered reviewers and fake reviewers. By mining the information of neighboring nodes and of fake reviewer groups, the behavioral features of newly registered reviewers can be constructed. A cold-start deceptive review detection method based on information aggregation is proposed, which represents reviewer nodes through node-level and meta-path-level information aggregation. Node-level information aggregation determines the importance of each neighbor based on a node attention
mechanism and path similarity information, and aggregates the behavioral information and product deception information of neighboring users. Meta-path-level information aggregation uses a self-optimizing clustering module to mine the fake reviewer group information of each meta-path, and determines the importance of each meta-path based on this group information. The meta-path representations of nodes are then aggregated to construct the behavioral features of newly registered reviewers. In the experiments, the structural and semantic information of the meta-paths is analyzed by visualizing the meta-path representations of the nodes. Compared with the baseline method, the proposed method performs well on several evaluation metrics, which indicates that the behavioral features it mines effectively address the cold-start problem. |
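
The attention pooling step in contribution (1) can be illustrated with a minimal sketch. It is not the thesis implementation: the function name `attention_pool`, the scoring vector `query`, and the toy two-dimensional word vectors are all hypothetical, and plain Python lists stand in for XLNet token embeddings. The sketch only shows the general idea of scoring each word, normalizing the scores with a softmax, and aggregating the word vectors by the resulting weights.

```python
import math


def attention_pool(token_vectors, query):
    """Aggregate word vectors into one vector with softmax attention weights.

    token_vectors: list of equal-length lists of floats, one per word
                   (stand-ins for encoder outputs; an assumption of this sketch).
    query:         hypothetical learned scoring vector of the same length.
    Returns (pooled_vector, weights).
    """
    # Score each word by its dot product with the scoring vector.
    scores = [sum(t * q for t, q in zip(vec, query)) for vec in token_vectors]

    # Numerically stable softmax turns scores into importance weights.
    max_score = max(scores)
    exps = [math.exp(s - max_score) for s in scores]
    total = sum(exps)
    weights = [e / total for e in exps]

    # Weighted sum of the word vectors gives the text representation.
    dim = len(query)
    pooled = [
        sum(w * vec[i] for w, vec in zip(weights, token_vectors))
        for i in range(dim)
    ]
    return pooled, weights


# Toy example: three "words" with 2-dimensional vectors.
tokens = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
query = [2.0, 0.0]
pooled, weights = attention_pool(tokens, query)
```

In this toy input, the first and third words align with `query` and therefore receive equal, larger weights than the second word. Inspecting `weights` in this way is the kind of word-importance analysis the abstract refers to.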