
Research On Multi-Modal Sentiment Analysis Technologies Based On Text,Speech And Images

Posted on: 2023-11-18
Degree: Master
Type: Thesis
Country: China
Candidate: S Xu
Full Text: PDF
GTID: 2568306836476414
Subject: Electronic and communication engineering
Abstract/Summary:
With the popularity of social networks and smartphones, people increasingly express what they hear and feel through text, voice, video, and other forms. As a branch of pattern recognition, sentiment analysis has a wide range of applications in fields such as content recommendation and public-opinion analysis. Compared with traditional single-modal sentiment analysis, which suffers from limited information and poor recognition performance, multimodal sentiment analysis can exploit the complementarity between modalities to improve recognition, and has therefore received widespread attention from scholars. In real life, people convey emotional information in many forms, of which words, speech, and facial expressions are the most basic. This thesis therefore studies sentiment analysis methods for multimodal data composed of text, speech, and images, focusing on uncertainty estimation in multimodal sentiment analysis tasks and on effective representation of multimodal data. The specific research content includes the following three aspects:

(1) This thesis surveys the current state of multimodal sentiment analysis, analyzing the problems of single-modal sentiment analysis and the advantages of the multimodal setting. It also introduces feature-extraction methods for multimodal data and summarizes common multimodal fusion methods.

(2) Existing multimodal sentiment analysis methods focus only on prediction accuracy and ignore prediction uncertainty. To address this, this thesis combines the Dirichlet distribution with evidence theory to model each modality's classification distribution and uncertainty, obtaining a single-modal emotion classification result together with an uncertainty estimate; Dempster-Shafer (D-S) evidence theory then fuses the single-modal predictions at the decision layer to obtain the multimodal sentiment analysis result and an overall uncertainty estimate. Experiments show that the proposed uncertainty-aware multimodal sentiment analysis method performs well, reaching 83.3% binary classification accuracy on the MOSI dataset. In addition, the uncertainty estimate obtained alongside each prediction enhances the reliability and interpretability of the results.

(3) Because the representation of multimodal data must capture both consistency and complementarity, this thesis decouples the multimodal representation problem into two parts: an inter-modal task, built on a feature-layer fusion model based on low-rank tensors, learns the consistency information across modalities; and an intra-modal task, built on the uncertainty-based multimodal fusion model proposed above, learns the complementarity information. The model is then trained through multi-task learning. Experiments show that the proposed multimodal sentiment analysis method based on hierarchical representation learning reaches 85.07% binary classification accuracy on the MOSI dataset, and 85.78% and 81.06% on the MOSEI and SIMS datasets, respectively, matching the current state-of-the-art algorithms.
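The evidential step described in (2) can be sketched as follows. This is a minimal NumPy illustration, not the thesis's implementation: it maps per-class evidence to a Dirichlet-based subjective-logic opinion (belief masses plus an uncertainty mass) and fuses two modalities' opinions with a reduced form of Dempster's rule commonly used in evidential fusion. The function names and toy evidence vectors are assumptions for illustration.

```python
import numpy as np

def dirichlet_opinion(evidence):
    """Map non-negative evidence over K classes to a subjective-logic
    opinion: per-class beliefs b_k = e_k / S and uncertainty u = K / S,
    where S is the Dirichlet strength sum(e_k + 1)."""
    K = len(evidence)
    alpha = evidence + 1.0          # Dirichlet parameters
    S = alpha.sum()                 # Dirichlet strength
    b = evidence / S                # per-class belief masses
    u = K / S                       # uncertainty mass (b.sum() + u == 1)
    return b, u

def ds_combine(b1, u1, b2, u2):
    """Reduced Dempster's rule for two opinions over the same K classes:
    agreeing beliefs reinforce, conflicting beliefs are renormalized away,
    and uncertainty shrinks when both sources carry evidence."""
    K = len(b1)
    # conflict: belief the two sources assign to different classes
    conflict = sum(b1[i] * b2[j] for i in range(K) for j in range(K) if i != j)
    norm = 1.0 - conflict
    b = (b1 * b2 + b1 * u2 + b2 * u1) / norm
    u = (u1 * u2) / norm
    return b, u

# toy example: the text modality is confident, the audio modality is ambiguous
b_t, u_t = dirichlet_opinion(np.array([9.0, 1.0]))   # strong positive evidence
b_a, u_a = dirichlet_opinion(np.array([1.0, 1.0]))   # weak, ambiguous evidence
b, u = ds_combine(b_t, u_t, b_a, u_a)
print(b, u)  # fused belief stays positive-leaning; u is below the audio-only u_a
```

Note that the fused masses still sum to one, so the combined uncertainty remains directly interpretable as "how much the model does not know", which is what makes the decision-layer fusion in (2) self-diagnosing.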
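The inter-modal branch in (3) relies on low-rank tensor fusion. A minimal sketch of that idea, under assumed toy dimensions and random weights standing in for learned parameters: rather than materializing the full outer product of the 1-appended modality vectors, each modality is projected with rank-r factors and the projections are multiplied element-wise, which is mathematically equivalent to a rank-r decomposition of the fusion tensor.

```python
import numpy as np

rng = np.random.default_rng(0)
d_t, d_a, d_v, d_h, rank = 8, 6, 4, 5, 3   # toy feature dims, output dim, rank

# one rank-r factor per modality (a real model would learn these)
W_t = rng.standard_normal((rank, d_t + 1, d_h))
W_a = rng.standard_normal((rank, d_a + 1, d_h))
W_v = rng.standard_normal((rank, d_v + 1, d_h))

def low_rank_fusion(z_t, z_a, z_v):
    """Low-rank tensor fusion: project each 1-appended modality vector
    with its rank-r factors, multiply element-wise across modalities,
    then sum over the rank dimension."""
    z_t = np.append(z_t, 1.0)      # append 1 so unimodal/bimodal terms survive
    z_a = np.append(z_a, 1.0)
    z_v = np.append(z_v, 1.0)
    p_t = np.einsum('i,rih->rh', z_t, W_t)   # (rank, d_h) per modality
    p_a = np.einsum('i,rih->rh', z_a, W_a)
    p_v = np.einsum('i,rih->rh', z_v, W_v)
    return (p_t * p_a * p_v).sum(axis=0)     # fused (d_h,) representation

h = low_rank_fusion(rng.standard_normal(d_t),
                    rng.standard_normal(d_a),
                    rng.standard_normal(d_v))
print(h.shape)  # (5,)
```

The cost is linear in the modality dimensions rather than multiplicative, which is why this family of models scales to three modalities where a full tensor outer product would not.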
Keywords/Search Tags: Multimodal, Sentiment analysis, Uncertainty estimation, Representation learning