
Research On Depression Detection Method Based On Speech And Text Modalities

Posted on: 2024-01-06  Degree: Master  Type: Thesis
Country: China  Candidate: J H Lu  Full Text: PDF
GTID: 2554307097950319  Subject: Computer Science and Technology
Abstract/Summary:
As a common mental disorder, depression can affect people's thoughts, behaviour, feelings and well-being. It is prevalent worldwide and poses a great threat to human survival and development. As society continues to develop, people face growing pressure in study, work and daily life, and the incidence of depression is rising while patients are trending younger. The proportion of teenagers and university students in China with depression is increasing every year. Given such a serious situation, depression deserves sufficient attention.

However, the traditional approach to clinical diagnosis of depression mostly combines specialist consultation with psychometric testing. This requires a high level of expertise on the part of the clinician and is constrained by subjective factors on the part of both clinician and patient. In addition, the current number of specialist practitioners is not sufficient to meet the increasing number of patients. These problems make rapid diagnosis and large-scale screening of depressive disorders difficult, even though detection and identification at an early stage could directly reduce the social and economic burden associated with depression. An accurate, objective and effective automatic method of detecting depression would therefore be extremely helpful as an aid to diagnosis, for society and for individuals alike. Speech data is convenient and simple to collect, relatively low in cost, better for privacy and non-intrusive, so it has shown greater potential for real-world applications. This thesis therefore first carries out speech-based unimodal depression detection and then brings in text to explore multimodal fusion for depression detection. The main research can be summarised in two aspects:

(1) Clinical interviews for depression can last twenty minutes or longer. Because of the exploding and vanishing gradient problems, traditional RNN methods cannot capture long-range dependencies in such time series well. To address this, the thesis proposes a speech depression detection method that combines a Transformer model with a convolutional neural network, which avoids the failure to capture long-range dependencies and helps the model perform the depression detection task better. In addition, to address the small size of the depression corpus, a data augmentation approach is used to expand the number of samples and thus improve the performance of the system.

(2) The robustness and accuracy of depression detection systems are often unsatisfactory because unimodal data carries limited information and is vulnerable to interference from external factors, especially when the speech is noisy. Text data complements audio well and provides rich, complementary feature information, so text is introduced to explore a multimodal depression detection method combining speech and text. In the first work, the Transformer model was used to capture long-range dependencies; however, the Transformer still has high time and space costs, with a complexity of O(n^2) in the input length n, so it still struggles with longer inputs. To address this, the thesis proposes a multimodal fusion depression detection model based on an improved Transformer.

Experiments were conducted on the DAIC-WOZ dataset, and the results verify the effectiveness of the proposed methods for the unimodal and multimodal depression detection tasks respectively.
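The O(n^2) cost mentioned in the abstract comes from self-attention computing one score for every pair of positions in the input. The following is a minimal sketch of that step in plain Python, not the thesis's model: it assumes the inputs serve directly as queries and keys (no learned projections), and the names `attention_weights` and `score_count` are hypothetical helpers introduced only for illustration.

```python
import math


def attention_weights(x):
    """Scaled dot-product self-attention weights for a list of n vectors.

    Assumption: inputs are used directly as queries and keys, with no
    learned projections, to keep the illustration short.
    """
    d = len(x[0])
    weights = []
    for q in x:
        # One score per (query, key) pair: n scores for each of n queries.
        scores = [sum(a * b for a, b in zip(q, k)) / math.sqrt(d) for k in x]
        m = max(scores)  # subtract the max for numerical stability
        exps = [math.exp(s - m) for s in scores]
        total = sum(exps)
        weights.append([e / total for e in exps])  # softmax over the keys
    return weights


def score_count(n):
    """Number of pairwise attention scores for a sequence of length n."""
    return n * n
```

Because the weight matrix has n rows of n entries, doubling the sequence length quadruples both the number of scores computed and the memory needed to store them, which is why long interview recordings are costly for a standard Transformer.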
Keywords/Search Tags:Depression detection, Multimodal fusion, Deep Learning, Transformer, Affective Computing