| Advances in deep learning have driven a new wave of artificial intelligence while also introducing new threats to national security. In recent years, face-swapping videos generated by deepfake technology from massive training data have become increasingly realistic, to the point where the human eye can no longer judge their authenticity. As a new tool for illegal acts such as the manipulation of public opinion, they have had a serious negative impact on the international community. For this reason, scholars at home and abroad have carried out research on deepfake detection. Current detection methods fall mainly into two groups: methods based on differences between real and fake image features, and methods based on deep learning. Overall, this research is still in its infancy, and existing methods struggle to achieve both high classification accuracy and good generalization. This paper studies a deepfake detection algorithm based on convolutional neural networks for classifying real and deepfake images. The main work is as follows: (1) To address the difficulty current deepfake detection methods have in balancing classification accuracy and generalization, and the large parameter count of the traditional VGG19 network, a hierarchical feature global fusion network, HVGG19, is proposed. The network uses VGG19 as its backbone. Convolution modules at several scales extract high-dimensional features; the feature maps of different dimensions and scales produced during convolution are then reduced by global average pooling and combined, so that features of different scales and sizes are fused. The fully connected layers are discarded. HVGG19 is applied to two public deepfake video datasets, with the facial features of the datasets annotated, and is compared with state-of-the-art deepfake
detectors. The experimental results show that the proposed model performs well on both datasets, with accuracy above 97%, and that training and testing time is about 43% lower than with the traditional VGG19. (2) HVGG19 must fuse features from multiple convolution modules, which makes its operation complex; its parameter count is larger than that of most networks; and its generalization still needs improvement. To address these problems, a global self-attention fusion network, AVGG19-LSTM, is designed. A CBAM attention module is added after each convolution module of HVGG19 to focus on the features that differ between real and fake faces. At the same time, some convolution layers in the convolution modules are replaced with depthwise separable convolutions. After the convolutional part has extracted the features, an LSTM is used for further feature processing. AVGG19-LSTM is applied to four public deepfake video datasets. The results show that AVGG19-LSTM achieves the best classification performance on three types of datasets, and that its average single-image detection time is about 37% lower than that of HVGG19. |
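
The two parameter-saving ideas described above, replacing the fully connected layers with a globally pooled fusion vector and replacing standard convolutions with depthwise separable ones, can be illustrated with a rough parameter count. The sketch below is not the authors' implementation. It assumes that the fused vector concatenates one globally average-pooled vector per VGG19 convolution block and that a single two-class linear layer follows; the abstract specifies neither. The helper names are hypothetical, and the counts say nothing about the reported 43% and 37% time reductions.

```python
def conv_params(c_in, c_out, k=3, bias=True):
    """Parameter count of a standard k x k convolution."""
    return k * k * c_in * c_out + (c_out if bias else 0)


def separable_conv_params(c_in, c_out, k=3, bias=True):
    """Parameter count of a depthwise k x k conv plus a 1 x 1 pointwise conv."""
    depthwise = k * k * c_in + (c_in if bias else 0)
    pointwise = c_in * c_out + (c_out if bias else 0)
    return depthwise + pointwise


def linear_params(n_in, n_out):
    """Parameter count of a fully connected layer with bias."""
    return n_in * n_out + n_out


# Standard VGG19 classifier head for 224 x 224 inputs:
# 512*7*7 -> 4096 -> 4096 -> 1000.
vgg19_fc = (
    linear_params(512 * 7 * 7, 4096)
    + linear_params(4096, 4096)
    + linear_params(4096, 1000)
)

# Assumption: global average pooling reduces each block output to one value
# per channel, and the five resulting vectors are concatenated.
block_channels = [64, 128, 256, 512, 512]  # output widths of the VGG19 blocks
fused_dim = sum(block_channels)

# Assumption: a single linear layer classifies the fused vector as real/fake.
fusion_head = linear_params(fused_dim, 2)

print("VGG19 fully connected head:", vgg19_fc)
print("Fused vector length:", fused_dim)
print("Assumed fusion head:", fusion_head)
print("Standard 3x3 conv, 512->512:", conv_params(512, 512))
print("Separable 3x3 conv, 512->512:", separable_conv_params(512, 512))
```

Under these assumptions the classifier head shrinks from roughly 124 million parameters to a few thousand, and a single 512-to-512 convolution layer shrinks by about a factor of nine when made depthwise separable.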