Depression has been designated a serious disease by the World Health Organization; the number of diagnosed patients is rising sharply and the age of onset is trending downward, causing serious harm worldwide and increasing the social and medical burden. Recently, much research on multimodal depression recognition has been conducted with machine learning and deep learning methods, using feature types such as video and audio. Multimodal depression recognition is a multidisciplinary topic that spans psychology, medicine, and computer science. In recent years, many advanced deep learning models have been applied to automatic depression recognition: long short-term memory networks (LSTM) are used to analyze sequence data, and convolutional neural networks (CNN) to analyze image and video data. Owing to the Transformer's strong capability for modeling dynamic contextual information, the self-attention mechanism has also attracted broad interest and application. This paper studies automatic depression recognition technology; the research contents are as follows.

(1) For single-modal audio depression recognition, in order to capture both local and global temporal context in the audio, this paper proposes an audio depression recognition method based on Transformer and LSTM. First, low-level raw features of depressed-speech audio clips are extracted from the dataset videos. Then a Transformer extracts global high-level temporal audio features while an LSTM extracts local high-level temporal audio features, and the two are combined by model-level fusion. Finally, a fully connected layer produces the depression assessment. Experiments show that the method performs well on audio-based automatic depression assessment.

(2) For single-modal video
depression recognition, in order to obtain global differential attention features of the video, this paper proposes a video depression recognition model based on differential convolution and a self-attention mechanism. First, 16 frames are sampled from each dataset video as the frame-level raw input of the model. Then a differential convolutional layer extracts deep differential spatio-temporal features, and a self-attention mechanism assigns greater weight to informative features, yielding the final global differential attention features. Finally, a fully connected layer estimates the depression level. Experiments show that the method performs well on video-based automatic depression assessment.

(3) In order to combine audio and video, obtaining global attention features for the audio and multi-scale differential attention features for the video, this paper proposes a multimodal depression recognition model using differential convolution and a Transformer. The method first extracts raw audio and video data from the dataset and is trained end to end. Differential convolution extracts the video features: the differencing operation reduces the uncertainty caused by an inconsistent frame step, and 3D convolution extracts deep spatio-temporal features from the video data. A Transformer extracts the audio features, exploiting its capacity for global context modeling. Finally, an attention mechanism fuses the audio features and video features. Experiments show that the method performs well on audio-video multimodal automatic depression assessment.
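The two-branch audio model of contribution (1) can be sketched as follows. This is a minimal illustrative PyTorch implementation, not the paper's actual architecture: the feature dimension, layer counts, and the mean-pooling / last-hidden-state readouts are assumptions, and the fusion is a simple concatenation of the Transformer (global) and LSTM (local) branch outputs before the fully connected head.

```python
# Hypothetical sketch of the Transformer + LSTM audio model: a Transformer
# branch for global temporal context, an LSTM branch for local temporal
# context, fused at the model (feature) level. Dimensions are illustrative.
import torch
import torch.nn as nn

class AudioDepressionNet(nn.Module):
    def __init__(self, feat_dim=40, d_model=64, n_heads=4, lstm_hidden=64):
        super().__init__()
        self.proj = nn.Linear(feat_dim, d_model)              # lift raw features to d_model
        enc_layer = nn.TransformerEncoderLayer(d_model=d_model, nhead=n_heads,
                                               batch_first=True)
        self.transformer = nn.TransformerEncoder(enc_layer, num_layers=2)  # global branch
        self.lstm = nn.LSTM(feat_dim, lstm_hidden, batch_first=True)       # local branch
        self.head = nn.Linear(d_model + lstm_hidden, 1)       # depression score regressor

    def forward(self, x):                                     # x: (batch, time, feat_dim)
        g = self.transformer(self.proj(x)).mean(dim=1)        # global features (batch, d_model)
        l, _ = self.lstm(x)
        l = l[:, -1, :]                                       # local features (batch, lstm_hidden)
        fused = torch.cat([g, l], dim=-1)                     # model-level fusion
        return self.head(fused).squeeze(-1)                   # (batch,) predicted scores

model = AudioDepressionNet()
scores = model(torch.randn(2, 100, 40))   # 2 clips, 100 frames, 40-dim features
print(scores.shape)                       # torch.Size([2])
```

Concatenation is only one choice for the model-layer fusion; the key point is that both branches see the same low-level feature sequence and are trained jointly with the shared fully connected head.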
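The differencing operation behind the differential convolution of contributions (2) and (3) can be illustrated with a few lines of NumPy. This is a conceptual sketch, not the paper's layer: it shows only the temporal-difference step applied to sampled frames before 3D convolution, which suppresses static content and the offset introduced by an inconsistent frame step.

```python
# Minimal NumPy sketch of the temporal-difference step assumed to precede
# the 3D convolution: adjacent sampled frames are subtracted so that only
# motion (dynamic) information remains. Shapes are illustrative.
import numpy as np

def temporal_difference(frames):
    """frames: (T, H, W, C) video clip -> (T-1, H, W, C) frame differences."""
    return frames[1:] - frames[:-1]

clip = np.random.rand(16, 112, 112, 3)        # 16 sampled frames, as in the model
diff = temporal_difference(clip)
print(diff.shape)                             # (15, 112, 112, 3)

static = np.ones((16, 8, 8, 1))               # a perfectly static clip...
assert np.allclose(temporal_difference(static), 0)   # ...differences to zero
```

Because a static clip maps to zeros, the downstream 3D convolution spends its capacity on motion-related spatio-temporal patterns rather than appearance alone.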
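The attention-based fusion of contribution (3) can likewise be sketched in NumPy. This is a hypothetical minimal form, assuming each modality has already been reduced to a single feature vector: a learned vector `w` (illustrative, not from the paper) scores each modality, a softmax turns the scores into modality weights, and the fused representation is the weighted sum.

```python
# Illustrative attention fusion of an audio feature vector and a video
# feature vector: score each modality, softmax the scores into weights,
# and return the attention-weighted sum. `w` stands in for learned weights.
import numpy as np

def softmax(x):
    e = np.exp(x - x.max())       # shift for numerical stability
    return e / e.sum()

def attention_fuse(audio_feat, video_feat, w):
    feats = np.stack([audio_feat, video_feat])   # (2, d) modality features
    weights = softmax(feats @ w)                 # (2,) attention over modalities
    return weights @ feats, weights              # fused (d,) vector and the weights

audio = np.random.rand(64)
video = np.random.rand(64)
fused, weights = attention_fuse(audio, video, np.random.rand(64))
print(fused.shape)                # (64,); weights sum to 1
```

Unlike plain concatenation, the learned weights let the model lean on whichever modality is more informative for a given sample.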