| In recent years, with the continued growth of the online population and the rapid development of e-commerce, the number of product reviews has grown rapidly, and reviews have become an important reference for consumers. Because the online review environment lacks supervision and constraints, some unscrupulous businesses try to interfere with consumers’ decision-making by publishing large numbers of spam reviews. This practice seriously damages the online review ecosystem and harms consumers’ interests. However, with tens of thousands of reviews posted to such websites every day, filtering fake reviews manually is not realistic. How to use computers to detect fake reviews accurately and efficiently on e-commerce websites has therefore become an important research direction in natural language processing. In this study, we design a spam review detection system for online travel platforms that combines data mining, neural networks, and web development, and we demonstrate its practical use in identifying spam reviews in online Chinese review data. The main work of this paper is as follows: (1) This study designs and implements a spam review detection system based on a browser/server (B/S) architecture. The system consists of six modules: data access, data acquisition, feature extraction, detection algorithm, web application, and system management. The data access module provides a unified interface through which the other modules interact with the database; the data acquisition module periodically and incrementally crawls data from the target website, standardizes it, and saves it to the database; the feature extraction module converts the raw data into features; the detection algorithm module is responsible for training and deploying the model; the web application module builds web services on the Flask framework that provide
users with three functions: review detection, visualization of detection results, and feedback on detection results; finally, the system management module gives the administrator functions for managing system data and operating status. (2) This study designs a spam review detection model that combines text semantics and sentiment. Based on the content and structural characteristics of spam reviews, the SSFM model uses recurrent neural networks to extract deep semantic information and deep sentiment information from reviews. From these two kinds of information, three fusion methods, based on pooling, convolutional neural networks, and an interactive attention mechanism, are proposed to construct a text representation vector rich in semantic and sentiment information. A feature fusion layer is also constructed to fuse the text representation vector with user and merchant features and so improve the accuracy of the model. (3) To address the lack of Chinese datasets for spam review detection, this study collects and annotates the reviews of all hotels in one area on a Chinese online travel platform to construct a Chinese spam review detection dataset for the hotel domain. Furthermore, targeting the three entities involved in publishing fake reviews (users, merchants, and review content), this study analyzes and extracts features from user information, merchant information, text semantic information, and text sentiment information using data mining and text representation learning. |
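
As a rough illustration of the pooling-based fusion and the feature fusion layer described in (2), the sketch below pools two sequences of hidden states and concatenates the result with user and merchant features. It is a sketch under stated assumptions, not the SSFM model itself: the function names (`pool_fuse`, `fuse_features`), the choice of max plus mean pooling, the use of plain concatenation, and the toy vectors are all hypothetical. In the described system, the hidden states would come from recurrent neural networks.

```python
from typing import List

Vector = List[float]


def max_pool(states: List[Vector]) -> Vector:
    """Element-wise maximum over the time steps."""
    return [max(col) for col in zip(*states)]


def mean_pool(states: List[Vector]) -> Vector:
    """Element-wise mean over the time steps."""
    return [sum(col) / len(col) for col in zip(*states)]


def pool_fuse(semantic: List[Vector], sentiment: List[Vector]) -> Vector:
    """Assumed pooling-based fusion: pool each sequence, then concatenate."""
    return (max_pool(semantic) + mean_pool(semantic)
            + max_pool(sentiment) + mean_pool(sentiment))


def fuse_features(text_vec: Vector, user: Vector, merchant: Vector) -> Vector:
    """Assumed feature fusion: plain concatenation of the three feature groups."""
    return text_vec + user + merchant


# Toy hidden states: 3 time steps, 2 dimensions each (hypothetical values).
semantic_states = [[0.0, 1.0], [2.0, 3.0], [4.0, 2.0]]
sentiment_states = [[1.0, -1.0], [3.0, 1.0], [2.0, 0.0]]

text_vec = pool_fuse(semantic_states, sentiment_states)
fused = fuse_features(text_vec, user=[0.5], merchant=[0.25, 0.75])
```

In a trained model the fused vector would feed a classifier. Concatenation is used here only because it is the simplest option; a learned fusion layer could take its place.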