| With the advancement of technology, news can be disseminated in many ways, and everyone can view the latest information through social tools and news websites. On these social networking sites, users can also comment on the news that is pushed to them. Because publishers sometimes post anonymously, various kinds of fake news have appeared on the Internet. Such fake news is large in scale, spreads quickly, and is fabricated in many different ways. Serious malicious fake news can cause public panic, harm enterprises, and even erode the government’s credibility. False news therefore needs to be stopped as far as possible before it spreads. At present, suspected false news is mainly reported by the public, after which relevant experts are organized to review whether it is false. This approach requires substantial human and material resources and a great deal of time, so it is extremely inefficient given the volume of false information online. Algorithms that detect fake news automatically are therefore urgently needed. Current academic research on fake news falls into two main directions: modeling based on social networks and modeling based on content. Content-based research mainly targets textual information such as news headlines, body text, and comments, yet a news item often also contains many pictures. If text and pictures are considered together, detection accuracy may improve to some extent. The research in this paper is therefore based on multi-modal data such as text and pictures. The specific work is as follows: 1. Preprocess the text data of the news, then apply the TextCNN model commonly used for text classification, and use its result as the baseline
for this experiment. In addition, this paper designs an improved CNN structure in which the Softmax layer used for classification is replaced by a traditional machine learning model acting as the classifier. Experiments show that combining a CNN with a machine learning classifier achieves better detection results, and the CNN and XGBoost hybrid model performs best. The hybrid model not only improves detection accuracy but also increases the interpretability of the model; 2. Features from several modalities are used to detect false news, including structured user features and unstructured text and picture features. In the experiments, the different features were modeled separately. With text features alone, F1 was 0.904; after adding picture features, F1 was 0.912; and after adding user features, F1 was 0.918. This shows that the best results are obtained when all the data in the news is used for modeling; 3. Text and picture features that help discriminate between real and fake news are added, such as text length, keywords, and symbol ratio for the text, and image size and DCT features, which play an important role in detecting image compression and tampering, for the pictures. Detection results improved after these features were added. In summary, this paper designs an algorithm for false news detection based on a hybrid model of a convolutional neural network and XGBoost. The algorithm uses the text, image, and user data in the news during detection, and it offers a new direction for subsequent work on false news detection. |
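
The handcrafted text features named in item 3 (text length, keywords, symbol ratio) could be computed along the following lines. This is a minimal sketch, not the paper's implementation: the function name `text_features`, the keyword list `SUSPICIOUS_KEYWORDS`, and the choice to count ASCII punctuation as "symbols" are all assumptions made for illustration.

```python
import string

# Hypothetical keyword list; the source does not say which keywords were used.
SUSPICIOUS_KEYWORDS = ("shocking", "urgent", "breaking")


def text_features(text, keywords=SUSPICIOUS_KEYWORDS):
    """Return simple handcrafted features for one news text."""
    length = len(text)
    # Assumption: a "symbol" is any ASCII punctuation character.
    n_symbols = sum(1 for ch in text if ch in string.punctuation)
    lowered = text.lower()
    # Case-insensitive count of keyword occurrences.
    n_keywords = sum(lowered.count(k) for k in keywords)
    return {
        "length": length,
        # Guard against division by zero for empty texts.
        "symbol_ratio": n_symbols / length if length else 0.0,
        "keyword_count": n_keywords,
    }
```

Features like these would typically be combined with the learned text representation before classification; how the paper combines them is not stated in this abstract.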