| Image sentiment analysis aims to mine the emotional semantics contained in images and has a wide range of applications in business intelligence, entertainment assistance, opinion prediction, and other fields. Because image content is diverse and abstract, image sentiment analysis faces two challenges. First, image sentiment is the overall expression of the context of all visual elements in the image, so modeling the relationships between image features is the key to sentiment analysis, yet existing methods do not adequately capture the emotional associations between features. Second, the emotional expression of some sample images is too vague; the text description associated with an image supplements the image's semantic information, and combining the two can express more explicit emotional information. To address these problems, this paper exploits the strengths of graph convolutional networks in expressing semantic associations: it first studies a hierarchical graph convolution model for image sentiment analysis, and then studies a joint image-text sentiment analysis model based on semantically enhanced graph convolution. The specific work includes: (1) Hierarchical graph convolutional model for image sentiment classification: Current deep-network-based sentiment analysis methods mainly learn deep image features automatically through convolutional neural networks. However, image emotion is a comprehensive reflection of the global contextual features of the image. Because the receptive field of a convolution kernel is limited in size, such methods cannot effectively capture the dependencies between long-distance emotional features. At the same time, the emotional features at different levels of the network cannot be effectively fused and utilized, which reduces the accuracy of image sentiment analysis. To solve these problems, this paper proposes a hierarchical graph convolutional network model, which
constructs a spatial context graph convolution (SCGCN) module and a dynamic fusion graph convolution (DFGCN) module in the spatial and channel dimensions, respectively, to effectively learn the global contextual associations among emotional features in different layers and the dependencies between features at different levels, thereby improving the accuracy of sentiment classification. Experimental results on four emotion datasets show that the proposed method outperforms existing image emotion classification models in both emotion polarity classification and fine-grained emotion classification. (2) Joint image-text sentiment classification model: Compared with single-modal image data, multimodal data combining images and text can provide richer features to help the model analyze the sentiment in the data. Most previous work used feature concatenation strategies based on gating and attention mechanisms to fuse image and text data, and did not fully explore the contextual associations within and between modalities. To this end, this paper proposes a joint image-text sentiment classification model based on semantically enhanced graph convolution and label-based contrastive learning, which uses the emotional similarity between nodes for intra-modal feature enhancement and inter-modal feature interaction. In addition, this paper proposes a label-based contrastive learning loss function to help the model learn emotion-related similar features from multimodal data. Comparison and ablation experiments on the image-text emotion dataset demonstrate the effectiveness of the proposed algorithm in emotion classification. |