
Multimodal Sentiment Analysis Of Visual And Textual Based On Deep Learning

Posted on: 2024-01-18
Degree: Master
Type: Thesis
Country: China
Candidate: P Z Hao
Full Text: PDF
GTID: 2558307127961229
Subject: Computer technology
Abstract/Summary:
With the popularity of social media, people's lives and modes of communication have changed dramatically, and users increasingly express their feelings and opinions through combinations of pictures and text, making image-text content among the fastest-growing kinds of data. How to effectively mine the emotional information in such data has become a key research topic in sentiment analysis. Compared with unimodal data such as a picture or text alone, multimodal data carries more information and allows more accurate prediction of review sentiment. Existing image-text sentiment analysis methods fall mainly into two categories, feature-fusion methods and decision-fusion methods, but these methods consider either the interaction between modalities or the differences between the two modalities in isolation. Jointly accounting for the correlation and interaction between image and text, as well as the different contributions each modality makes to sentiment analysis, is important for improving image-text multimodal sentiment prediction. The main research of this paper is as follows:

(1) To address the differences between image and text information in online reviews, this paper proposes an image-text sentiment analysis model based on combined feature and decision fusion, which captures both the interaction between modalities and the information unique to each single modality. In the feature-fusion stage, bilinear pooling with matrix factorization is used to fuse image and text features; in the decision-fusion stage, a weighted-average decision rule assigns different weights to the text-feature decision, the image-feature decision, and the fused-feature decision, and the combined decision serves as the final sentiment prediction for the review document (see the first sketch below).

(2) To address the fact that images contribute less than text to sentiment analysis of online reviews, this paper proposes a visual-attention-based image-text sentiment analysis model (BERT-VistaNet), which uses visual information as attention over the text, increasing the weight of important words or sentences to obtain a visual-attention-based document representation (see the second sketch below). Because review images cannot always cover the full content of the review text, a BERT model performs sentiment analysis on the text to obtain a text-based document representation. Finally, to avoid compounding prediction errors, the visual-attention-based and text-based document representations are fused at both the feature and decision levels to improve the accuracy of image-text sentiment analysis.

(3) To verify the effectiveness of the proposed models, this paper conducts experiments on the Yelp restaurant review dataset. The results show that the proposed methods improve significantly over the baseline models and that each module of the algorithm contributes to sentiment analysis, demonstrating that the models can effectively perform sentiment analysis on image-text multimodal reviews.
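To make the fusion strategy in (1) concrete, the following PyTorch sketch shows one common way to realize bilinear pooling with matrix factorization (a low-rank approximation of the full bilinear interaction), followed by a weighted-average decision rule. This is a minimal illustration under assumed names, dimensions, and weights; the abstract does not specify the thesis's actual implementation.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class FactorizedBilinearFusion(nn.Module):
    """Low-rank factorized bilinear pooling of text and image features.

    A full bilinear interaction x^T W y is approximated with rank-k
    factors, which is the matrix-factorization idea the abstract names.
    Dimensions below are illustrative assumptions.
    """
    def __init__(self, text_dim=768, img_dim=2048, out_dim=512, rank=5):
        super().__init__()
        self.text_proj = nn.Linear(text_dim, out_dim * rank)
        self.img_proj = nn.Linear(img_dim, out_dim * rank)
        self.out_dim, self.rank = out_dim, rank

    def forward(self, text_feat, img_feat):
        # Element-wise product of the two projections realizes the
        # factorized bilinear interaction.
        joint = self.text_proj(text_feat) * self.img_proj(img_feat)
        # Sum-pool over the rank dimension, then power- and L2-normalize
        # (standard practice for factorized bilinear pooling).
        joint = joint.view(-1, self.out_dim, self.rank).sum(dim=2)
        joint = torch.sign(joint) * torch.sqrt(joint.abs() + 1e-8)
        return F.normalize(joint, dim=-1)

def weighted_decision_fusion(p_text, p_img, p_joint, w=(0.3, 0.2, 0.5)):
    """Weighted-average decision rule over per-branch class probabilities.

    The fixed weights here are placeholders; in practice they would be
    tuned or learned per branch.
    """
    return w[0] * p_text + w[1] * p_img + w[2] * p_joint
```

Here `p_text`, `p_img`, and `p_joint` stand for the softmax outputs of the text-only, image-only, and fused-feature classifiers, respectively; the fused average is taken as the final document-level prediction.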
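Likewise, a minimal sketch of the visual-attention idea in (2): the image feature acts as a query that scores sentence representations (e.g., from BERT), and the weighted sum yields a visually attended document vector. The additive-attention form and all names and dimensions are assumptions for illustration, not the thesis's exact BERT-VistaNet architecture.

```python
import torch
import torch.nn as nn

class VisualAttention(nn.Module):
    """Scores text units by their relevance to the image feature and
    aggregates them into a visually attended document representation."""
    def __init__(self, text_dim=768, img_dim=2048, attn_dim=256):
        super().__init__()
        self.text_proj = nn.Linear(text_dim, attn_dim)
        self.img_proj = nn.Linear(img_dim, attn_dim)
        self.score = nn.Linear(attn_dim, 1)

    def forward(self, sent_feats, img_feat):
        # sent_feats: (batch, n_sents, text_dim); img_feat: (batch, img_dim)
        q = self.img_proj(img_feat).unsqueeze(1)    # (batch, 1, attn_dim)
        k = self.text_proj(sent_feats)              # (batch, n_sents, attn_dim)
        scores = self.score(torch.tanh(k + q))      # additive attention scores
        weights = torch.softmax(scores, dim=1)      # (batch, n_sents, 1)
        doc = (weights * sent_feats).sum(dim=1)     # attended document vector
        return doc, weights.squeeze(-1)
```

The returned document vector would then be fused, at the feature and decision levels as described above, with the text-based document representation produced by BERT.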
Keywords/Search Tags: Sentiment Analysis, Deep Learning, Attention Mechanism, Feature Fusion, Decision Fusion