Research On Deepfake Video Detection Algorithm Based On Spatio-temporal Fusion

Posted on:2023-07-16

Degree:Master

Type:Thesis

Country:China

Candidate:Z B Wang

Full Text:PDF

GTID:2558306845499014

Subject:Signal and Information Processing

Abstract/Summary:

Digital video is widely used in news media, forensic identification, and other fields. However, as information technology has developed, increasingly powerful video editing tools have become available, and more users can freely edit videos. This ease of modification gives malicious users an opportunity to tamper with content, making it difficult to guarantee the authenticity and integrity of a video. In particular, the widely used deepfake technology can create fake videos by swapping the faces of different people; the results are almost indistinguishable to the human eye and pose a serious threat to information security. This thesis therefore studies deepfake video detection algorithms based on deep learning, which determine whether a video has been tampered with using deepfake technology and so verify the authenticity of the video data. The main work includes:

(1) A deepfake video detection algorithm based on spatiotemporal features is proposed. The algorithm comprises a temporal feature extraction module and a spatial feature extraction module. The temporal module captures discontinuities between deepfake video frames, and the spatial module extracts forgery traces in the spatial domain. A fusion module is then designed to exploit the complementary information in the two feature streams. Experimental results show that, compared with mainstream algorithms based on spatial features, the accuracy of the proposed model on the Celeb-DF and FaceForensics++ datasets increases by 1.07% and 3.13%, respectively.

(2) A deepfake video detection algorithm based on spatiotemporal attention is proposed. The algorithm uses a feature extraction module and an attention-guided long short-term memory (LSTM) module to extract more effective spatiotemporal features. First, the feature extraction module extracts high-level semantic features from the fully connected layers of the backbone network and spatial features from its mid-level convolutional layers. The extracted features are then fed into the attention-guided LSTM module to learn spatiotemporal information. The attention-guided LSTM module includes a temporal attention module and a spatiotemporal attention module, both of which aim to focus on key artifact information in videos. Experimental results show that, compared with popular deepfake detection algorithms, the accuracy of the proposed model on the Celeb-DF and FaceForensics++ datasets increases by 1.33% and 1.89%, respectively.

(3) A deepfake video detection algorithm based on cross-modal spatiotemporal fusion is proposed. The algorithm uses a spatial convolutional neural network as the backbone to extract visual features, and an audio network is designed to extract audio features, which serve as an attention stream that guides visual modeling in the spatial dimension. In addition, an audio-video interaction module is designed to fuse the audio and video features. Experimental results show that, compared with current advanced deepfake detection algorithms, the accuracy of the proposed model on the FakeAVCeleb and DFDC datasets increases by 3.87% and 2.96%, respectively, which further verifies the effectiveness of the proposed model for the deepfake video detection task.
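
The abstract does not give implementation details, so the following is only a toy sketch of the general idea behind items (1) and (2), not the thesis's networks. It assumes each frame has already been reduced to a small feature vector (hand-written lists stand in for backbone outputs), measures inter-frame discontinuity as the mean absolute difference between consecutive frames, pools frames with softmax attention weights, and fuses the two results by concatenation. All function names, the per-frame scores, and the choice of concatenation are assumptions made for illustration.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def temporal_discontinuity(frame_feats):
    """Mean absolute difference between consecutive frame features.

    Requires at least two frames. Large values suggest abrupt changes
    between neighbouring frames.
    """
    diffs = [
        [abs(c - p) for p, c in zip(prev, cur)]
        for prev, cur in zip(frame_feats, frame_feats[1:])
    ]
    return [sum(col) / len(diffs) for col in zip(*diffs)]


def attention_pool(frame_feats, frame_scores):
    """Weighted average of frame features using softmax attention weights."""
    weights = softmax(frame_scores)
    dim = len(frame_feats[0])
    return [
        sum(w * feat[i] for w, feat in zip(weights, frame_feats))
        for i in range(dim)
    ]


def fuse(spatial_feat, temporal_feat):
    """Assumed fusion step: simple concatenation of the two streams."""
    return spatial_feat + temporal_feat


# Hypothetical 3-frame clip with 2-dimensional features per frame.
frames = [[0.0, 1.0], [0.0, 1.0], [2.0, 1.0]]
scores = [0.0, 0.0, 0.0]  # equal scores give uniform attention

temporal = temporal_discontinuity(frames)
spatial = attention_pool(frames, scores)
fused = fuse(spatial, temporal)
```

In this toy clip only the first feature dimension jumps between the second and third frames, so only that dimension contributes to the discontinuity vector. A real detector would learn the attention scores and pass the fused vector to a classifier.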

Keywords/Search Tags:

Deepfake video detection, spatiotemporal features, audio features, cross-modal fusion

Related items

1	Audio Deepfake Detection Based On Frequency Domain Features And End-to-end Models
2	Research On Visual Salient Object Detection Via Graph Fusion Of Spatiotemporal Features
3	Research On Visual Salient Object Detection Combined With Spatiotemporal Features
4	Deepfake Video Detection Based On Spatial-temporal Features
5	Multi-modal Deepfake Detection Of Specific Individuals Based On Audio-visual Features
6	Research And Implementation On Video Copy Detection Based On SIFT Features
7	Research On Fire Image Detection Technology Based On Flame Features
8	Research And Implementation Of Deepfake Audio Detection Technology
9	Exposing DeepFake Videos Based On Interframe Features
10	Research Of Vehicle Detection Method Based On Features Fusion