The emotion recognition in conversation (ERC) task aims to identify the emotion underlying each utterance in a dialogue. As a relatively new research direction in natural language processing (NLP), ERC has attracted attention from researchers because of its applications in many fields, such as opinion mining in social media, psychological analysis in medical care systems, and emotionally aware dialogue robots. ERC is therefore of significant research value. In the ERC task, the emotion of each utterance may be affected by many factors, and several key challenges currently make the task difficult: 1) when obtaining utterance vector representations, current models focus only on the context within each utterance and ignore the global context of the dialogue; 2) the contextual information in a dialogue is complex, and the effective information a machine can extract from it is limited, which makes context modeling difficult. In view of these challenges, the main work of this paper is as follows.

(1) When extracting utterance vectors, existing ERC research focuses only on the context within each utterance. This creates a semantic gap between the utterance vectors and the subsequent context modeling, which makes it harder for the model to understand the dialogue. The main reason is that context modeling is sensitive to the global context of the dialogue, a requirement the utterance vectors alone cannot meet. This paper proposes a global-context-sensitive bi-layer model for extracting utterance vectors. Specifically, the low layer is a BERT (Bidirectional Encoder Representations from Transformers) model that extracts context-independent utterance vectors, and the high layer is a Transformer model that uses the attention mechanism to integrate the semantic information conveyed by every utterance in the dialogue, so that all utterance vectors carry the global context of the dialogue.

(2) Existing ERC research introduces external commonsense knowledge to assist context modeling, supplying the model with knowledge that cannot be captured from the dialogue itself, and has achieved competitive performance. However, the information provided by commonsense knowledge is still limited, and it is difficult for a model to quickly and effectively extract knowledge useful for emotion prediction from a large body of external knowledge. When people analyze a dialogue from a third-party perspective, they reason about the dialogue content using common sense, and this inferential commonsense knowledge helps them understand the dialogue effectively. This paper proposes a graph attention network model based on inferential commonsense injection for the context modeling of the ERC task. Specifically, inferential commonsense for seven inference relations is generated by the Commonsense Transformer (COMET) and injected into context modeling to help the model predict the emotion of each utterance in the dialogue. In addition, the graph attention network has the structural characteristics of a social network, which makes it suitable for simulating the interaction in a dialogue, and its strong information aggregation ability makes it effective at capturing dialogue context. This paper therefore adopts the graph attention network as the basic model for context modeling.
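As a minimal sketch of the idea in (2), the code below injects precomputed COMET commonsense vectors into a single-head graph attention layer over the utterances of one dialogue. It is not the authors' implementation; the dimensions, the fully connected dialogue graph, and the assumption that the seven COMET relation vectors have been averaged into one vector per utterance are illustrative.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class CommonsenseGATLayer(nn.Module):
    """Single-head graph attention layer with commonsense injection (sketch)."""
    def __init__(self, utt_dim=768, cs_dim=768, hidden_dim=300):
        super().__init__()
        # Project [utterance vector || commonsense vector] into node features.
        self.inject = nn.Linear(utt_dim + cs_dim, hidden_dim)
        self.W = nn.Linear(hidden_dim, hidden_dim, bias=False)
        # Attention vector a for scoring edges, as in a standard GAT layer.
        self.a = nn.Linear(2 * hidden_dim, 1, bias=False)

    def forward(self, utt_vecs, cs_vecs, adj):
        # utt_vecs: (N, utt_dim) global-context utterance vectors
        # cs_vecs:  (N, cs_dim) commonsense vectors, assumed precomputed with COMET
        # adj:      (N, N) 0/1 adjacency matrix of the dialogue graph
        h = self.W(torch.relu(self.inject(torch.cat([utt_vecs, cs_vecs], dim=-1))))
        n = h.size(0)
        # Pairwise scores e_ij = LeakyReLU(a^T [h_i || h_j]).
        pairs = torch.cat([h.unsqueeze(1).expand(n, n, -1),
                           h.unsqueeze(0).expand(n, n, -1)], dim=-1)
        e = F.leaky_relu(self.a(pairs).squeeze(-1), negative_slope=0.2)
        e = e.masked_fill(adj == 0, float('-inf'))
        alpha = torch.softmax(e, dim=-1)   # attention over each node's neighbours
        return torch.relu(alpha @ h)       # aggregated, commonsense-aware features

# Example: a 4-utterance dialogue modelled as a fully connected graph.
layer = CommonsenseGATLayer()
out = layer(torch.randn(4, 768), torch.randn(4, 768), torch.ones(4, 4))  # (4, 300)
```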
Finally, this paper organically combines the above two models into a conversational emotion recognition model for the ERC task and achieves competitive results on two public conversation datasets, which demonstrates the effectiveness of the model. On the MELD dataset, the F1 score reaches 64.8%, which is competitive with existing ERC models, and on the EmoryNLP dataset, the F1 score reaches 39.4%, ahead of existing ERC models.
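For concreteness, the sketch below illustrates the bi-layer utterance encoder described in (1): a BERT low layer encodes each utterance independently, and a Transformer high layer lets the utterance vectors attend to one another so that they carry global dialogue context. The checkpoint name, layer sizes, and seven-class output are illustrative assumptions, not the authors' exact configuration; in the full model, these global-context vectors together with the COMET features would pass through the graph attention module sketched above before classification.

```python
import torch
import torch.nn as nn
from transformers import BertModel, BertTokenizer

class BiLayerUtteranceEncoder(nn.Module):
    """Bi-layer utterance encoder plus a linear emotion classifier (sketch)."""
    def __init__(self, num_emotions=7):   # e.g. the seven MELD emotion classes
        super().__init__()
        self.bert = BertModel.from_pretrained("bert-base-uncased")        # low layer
        enc_layer = nn.TransformerEncoderLayer(d_model=768, nhead=8,
                                               batch_first=True)
        self.dialog_encoder = nn.TransformerEncoder(enc_layer, num_layers=2)  # high layer
        self.classifier = nn.Linear(768, num_emotions)

    def forward(self, input_ids, attention_mask):
        # Context-independent utterance vectors: the [CLS] token of each utterance.
        utt = self.bert(input_ids=input_ids,
                        attention_mask=attention_mask).last_hidden_state[:, 0]
        # Global-context utterance vectors via self-attention over the dialogue.
        utt = self.dialog_encoder(utt.unsqueeze(0)).squeeze(0)
        # In the full model the GAT-based context module would sit here.
        return self.classifier(utt)        # per-utterance emotion logits

tokenizer = BertTokenizer.from_pretrained("bert-base-uncased")
dialogue = ["I got the job!", "That is wonderful news.", "I can't believe it."]
batch = tokenizer(dialogue, padding=True, return_tensors="pt")
model = BiLayerUtteranceEncoder()
logits = model(batch["input_ids"], batch["attention_mask"])   # (3, num_emotions)
```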