Aspect Category Sentiment Analysis With Self-Attention Fusion Networks

Posted on: 2022-04-06
Degree: Master
Type: Thesis
Country: China
Candidate: Z L Huang
Full Text: PDF
GTID: 2518306479493274
Subject: Software engineering
Abstract/Summary:
Aspect category sentiment analysis (ACSA) aims to identify the sentiment polarity of each predefined aspect category in a sentence. Compared with traditional sentiment analysis of reviews, ACSA lets a company understand comments in finer detail, including users' emotional tendencies toward specific objects. However, existing methods pay little attention to the deep fusion of the aspect category with the corresponding sentence, which is important for the ACSA task.

ACSA first requires identifying the aspect categories of a review, which is the aspect category detection (ACD) task. ACD can be defined as a multi-label text classification task, and multi-label text classification always suffers from label imbalance: for each sample, the number of target labels is much smaller than the number of non-target labels.

Most existing deep learning methods assume cleanly labeled datasets, yet manually labeled datasets inevitably contain a certain number of noise labels, that is, samples that carry wrong labels in supervised learning. The dataset mainly used in this paper is a multi-label classification dataset, and most existing research on noise label detection gives no special attention to the multi-label setting.

This paper focuses on the problems and challenges of the noise label detection task, the aspect category detection task, and the aspect category sentiment analysis task. The main contributions are as follows:

(1) A noise label detection algorithm named Multi-Label Text Underfitting to Overfitting Networks (MLTU2-ONET) is proposed. MLTU2-ONET uses the pre-trained BERT (Bidirectional Encoder Representations from Transformers) model as its base model. During training, it records the loss value of every label category of every data sample in every training round and standardizes these loss values. The larger the sum of a label's standardized loss values over training, the higher the probability that it is a noise label. The algorithm is tested on the open Chinese dataset "automotive industry users opinion theme dataset" from BDCI 2018, and its results there support its effectiveness. MLTU2-ONET is then applied to the "Automobile Forum comment text dataset", which is provided by automobile manufacturers, and the noise labels of this dataset for the ACD task are corrected.

(2) A multi-label text classification model named Multi-Label Circle Loss BERT (MLCLB) is proposed for the ACD task. MLCLB also uses the pre-trained BERT model as its base model. To address label imbalance, MLCLB adopts as its loss function a form of the unified loss function formulated for multi-label classification. The model is tested on the modified "Automobile Forum comment text dataset", and the experimental results support the effectiveness of MLCLB.

(3) This paper focuses on the deep fusion of the aspect category and the corresponding sentence to improve sentiment classification performance. A novel model named Self-Attention Fusion Networks (SAFN) is proposed. First, the multi-head self-attention mechanism is used to obtain attention feature representations of the sentence and of the aspect category separately. Then, the multi-head attention mechanism is used again to fuse these two representations deeply. Finally, a convolutional layer is applied to extract informative features. Experiments are conducted on a Chinese dataset, the "Automobile Forum comment text dataset", and on a public English dataset, Laptop-2015 from SemEval 2015 Task 12. The experimental results show that the model delivers a substantial improvement in effectiveness.
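The scoring idea in contribution (1) can be illustrated with a minimal Python sketch. It is not the thesis implementation: the abstract only says that per-label losses are recorded each round, standardized, and summed, so the choice of z-scoring within each round, the function names `noise_scores` and `top_suspects`, and the toy numbers are all assumptions made for illustration.

```python
from statistics import fmean, pstdev


def noise_scores(losses, eps=1e-8):
    """Sum per-round standardized losses for every (sample, label) pair.

    losses[r][i][j] is the recorded loss of label j on sample i in round r.
    Returns scores[i][j]; a larger score marks a more suspicious label.
    """
    n_samples, n_labels = len(losses[0]), len(losses[0][0])
    scores = [[0.0] * n_labels for _ in range(n_samples)]
    for round_losses in losses:
        flat = [v for row in round_losses for v in row]
        # Assumption: z-score within each round so that early and late
        # rounds contribute on a comparable scale.
        mean, std = fmean(flat), pstdev(flat)
        for i, row in enumerate(round_losses):
            for j, v in enumerate(row):
                scores[i][j] += (v - mean) / (std + eps)
    return scores


def top_suspects(scores, k):
    """Return the k (sample, label) pairs with the highest summed scores."""
    pairs = [(s, i, j) for i, row in enumerate(scores) for j, s in enumerate(row)]
    pairs.sort(key=lambda t: t[0], reverse=True)
    return [(i, j) for _, i, j in pairs[:k]]


# Toy data: 3 rounds, 4 samples, 2 labels. Label 1 of sample 2 keeps a high
# loss in every round, which is the pattern the score is meant to surface.
toy = [[[0.1, 0.1] for _ in range(4)] for _ in range(3)]
for round_losses in toy:
    round_losses[2][1] = 2.0

suspects = top_suspects(noise_scores(toy), 1)
```

In a real run the recorded losses would come from the per-label loss of the BERT classifier at each round, and the highest-scoring pairs would be passed to a reviewer or a correction step.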
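The abstract does not spell out the loss used in contribution (2). One widely used multi-label form of the unified loss from Circle Loss is log(1 + sum over non-target labels of exp(s_i)) + log(1 + sum over target labels of exp(-s_j)), where s is the raw score of a label. The sketch below implements that form on the assumption that it is close to what MLCLB uses; the function names are hypothetical.

```python
import math


def _log1p_sum_exp(values):
    """Numerically stable log(1 + sum(exp(v) for v in values))."""
    terms = [0.0] + list(values)  # exp(0) = 1 supplies the "1 +" term
    m = max(terms)
    return m + math.log(sum(math.exp(t - m) for t in terms))


def multilabel_unified_loss(scores, targets):
    """Loss for one sample.

    scores: raw (pre-sigmoid) score per label.
    targets: 0/1 flag per label, 1 for a target label.
    """
    neg = [s for s, t in zip(scores, targets) if t == 0]
    pos = [-s for s, t in zip(scores, targets) if t == 1]
    return _log1p_sum_exp(neg) + _log1p_sum_exp(pos)


# One target label among three; confident correct scores give a lower loss.
good = multilabel_unified_loss([5.0, -5.0, -5.0], [1, 0, 0])
bad = multilabel_unified_loss([-5.0, 5.0, 5.0], [1, 0, 0])
```

With this form, a label is predicted as present when its score is above 0. Target and non-target scores are each aggregated inside their own log-sum-exp term, which is the usual argument for why this form is less dominated by the many non-target labels than a plain sum of per-label binary cross-entropies.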
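The three SAFN steps in contribution (3) can be traced as a shape-level NumPy sketch. Everything beyond the three steps named in the abstract is an assumption: the dimensions, the random untrained weights, using the sentence as query and the aspect category as key/value in the fusion step, and the max-pooling plus linear output layer. Residual connections, normalization, and positional information are omitted.

```python
import numpy as np

rng = np.random.default_rng(0)


def softmax(x, axis=-1):
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)


def init_attention(d):
    """Random query, key, value, and output projections (untrained)."""
    return tuple(rng.normal(0.0, d ** -0.5, (d, d)) for _ in range(4))


def multi_head_attention(query, key_value, params, n_heads):
    """Scaled dot-product attention; query (Lq, d), key_value (Lk, d) -> (Lq, d)."""
    wq, wk, wv, wo = params
    d = query.shape[-1]
    dh = d // n_heads

    def split(x):  # (L, d) -> (heads, L, dh)
        return x.reshape(x.shape[0], n_heads, dh).transpose(1, 0, 2)

    q, k, v = split(query @ wq), split(key_value @ wk), split(key_value @ wv)
    weights = softmax(q @ k.transpose(0, 2, 1) / np.sqrt(dh))  # (heads, Lq, Lk)
    heads = weights @ v                                        # (heads, Lq, dh)
    merged = heads.transpose(1, 0, 2).reshape(query.shape[0], d)
    return merged @ wo


def conv1d_relu(x, kernels):
    """Valid 1-D convolution over positions; x (L, d), kernels (width, d, channels)."""
    width = kernels.shape[0]
    windows = np.stack([x[i:i + width] for i in range(x.shape[0] - width + 1)])
    return np.maximum(np.einsum("lwd,wdc->lc", windows, kernels), 0.0)


def safn_forward(sentence, aspect, p, n_heads=4):
    # Step 1: self-attention over the sentence and the aspect category separately.
    s = multi_head_attention(sentence, sentence, p["self_sentence"], n_heads)
    a = multi_head_attention(aspect, aspect, p["self_aspect"], n_heads)
    # Step 2: fuse with attention again (assumption: sentence attends to aspect).
    fused = multi_head_attention(s, a, p["fusion"], n_heads)
    # Step 3: convolution, then max-pool over positions and classify (assumed head).
    features = conv1d_relu(fused, p["conv"]).max(axis=0)
    return softmax(features @ p["out"])


d, channels, n_classes = 16, 8, 3
params = {
    "self_sentence": init_attention(d),
    "self_aspect": init_attention(d),
    "fusion": init_attention(d),
    "conv": rng.normal(0.0, 0.1, (3, d, channels)),
    "out": rng.normal(0.0, 0.1, (channels, n_classes)),
}
sentence = rng.normal(size=(10, d))  # stand-in for 10 token embeddings
aspect = rng.normal(size=(2, d))     # stand-in for 2 aspect-category tokens
probs = safn_forward(sentence, aspect, params)
```

Here `probs` is a distribution over three hypothetical polarity classes. Because the weights are random, the values carry no meaning; the sketch only shows how the two attention stages and the convolution connect.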
Keywords/Search Tags:Noise label detection, Aspect category detection, Aspect category sentiment analysis, Multi-Label Text Underfitting to Overfitting Networks, Multi-Label Circle Loss BERT, Self-Attention Fusion Networks