Research On Text Classification Based On Deformable Self-attention

Posted on: 2022-09-07
Degree: Master
Type: Thesis
Country: China
Candidate: J Y Yan
Full Text: PDF
GTID: 2518306569981009
Subject: Computer technology
Abstract/Summary:
Text classification is a popular and classic task in natural language processing: it involves tagging a piece of text with one or more predefined labels. Contextual information in the text is crucial, and different words usually require contexts of different sizes. Existing methods model contextual information in a variety of ways. Traditional methods combine n-gram features with machine learning models. With the rapid development of deep learning, many neural-network-based methods also model contextual information for text classification. They can be roughly divided into the following categories: models based on CNNs, models based on RNNs, and models based on self-attention. The self-attention models include ordinary global self-attention models and local self-attention models. However, models based on CNNs or local self-attention usually extract fixed-scale contextual features, which cannot satisfy words that need contexts of different sizes, while models based on RNNs or ordinary self-attention do not directly model contexts at multiple scales. This paper therefore further explores context modeling for text classification. The main research work of this paper includes:

1. A Deformable Self-Attention (DSA) network is proposed. It can adaptively learn contextual features of different sizes for different words in a text. First, the model uses the Deformable Local Attention Weight Generation (DLAWG) module to learn local attention weights that carry contextual information of different sizes and uses them to obtain contextual features. The model then extracts local contextual features over a variety of ranges. Next, these multi-range features are fed into the Multi-Range Feature Integration (MRFI) module, which boosts discriminative and consistent features across ranges, removes unimportant features, and weakens conflicting semantics, integrating the various features into a sentence representation. Finally, this representation is passed to the output layer for classification (a code sketch of this pipeline appears after this list). Compared with existing models on 25 datasets, the proposed model achieves the best or comparable results, and combining it with a pre-trained language model improves performance further.

2. Extensive analyses of the proposed DSA are conducted. Analyses of model variants and ablation studies demonstrate the effectiveness of each module. Visualization comparisons with other self-attention models show that DSA can learn contextual features of different sizes that other models cannot. Moreover, visualizations of the multi-range features and analyses of the contextual words further illustrate that the model extracts contextual features over different ranges, verifying the effectiveness of the proposed model.
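The following PyTorch sketch illustrates, under many assumptions, how the DLAWG and MRFI modules described above might fit together. The abstract only names the modules, so every internal detail here is hypothetical: the window sizes, the per-token softmax gate, the max-pooling, and the class names `DLAWG`, `MRFI`, and `DSAClassifier` are illustrative stand-ins, not the thesis implementation.

```python
# Hypothetical sketch of the DSA pipeline: DLAWG learns per-token local
# attention weights at several window sizes (ranges), and MRFI gates and
# fuses the resulting multi-range features into a sentence vector.
import torch
import torch.nn as nn
import torch.nn.functional as F


class DLAWG(nn.Module):
    """Deformable Local Attention Weight Generation (assumed form):
    each token predicts its own attention weights over a local window,
    so the effective context size can differ from token to token."""

    def __init__(self, d_model: int, window: int):
        super().__init__()
        self.window = window
        self.proj = nn.Linear(d_model, window)  # per-token window weights

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, seq_len, d_model)
        b, n, d = x.shape
        w = self.window
        attn = F.softmax(self.proj(x), dim=-1)              # (b, n, w)
        pad = w // 2
        xp = F.pad(x.transpose(1, 2), (pad, w - 1 - pad))   # (b, d, n+w-1)
        windows = xp.unfold(2, w, 1)                        # (b, d, n, w)
        # Weighted sum over each token's local window.
        return torch.einsum("bdnw,bnw->bnd", windows, attn)


class MRFI(nn.Module):
    """Multi-Range Feature Integration (assumed form): a softmax gate
    decides, per token, how much each range contributes, boosting
    consistent features and down-weighting conflicting ones; max-pooling
    then yields the sentence representation."""

    def __init__(self, d_model: int, num_ranges: int):
        super().__init__()
        self.gate = nn.Linear(d_model * num_ranges, num_ranges)

    def forward(self, feats: list) -> torch.Tensor:
        stacked = torch.stack(feats, dim=2)                        # (b, n, r, d)
        gates = F.softmax(self.gate(torch.cat(feats, dim=-1)), dim=-1)
        fused = (gates.unsqueeze(-1) * stacked).sum(dim=2)         # (b, n, d)
        return fused.max(dim=1).values                             # (b, d)


class DSAClassifier(nn.Module):
    """Embedding -> multi-range DLAWG -> MRFI -> linear output layer."""

    def __init__(self, vocab_size, d_model, num_classes, windows=(3, 5, 7)):
        super().__init__()
        self.emb = nn.Embedding(vocab_size, d_model)
        self.ranges = nn.ModuleList(DLAWG(d_model, w) for w in windows)
        self.mrfi = MRFI(d_model, len(windows))
        self.out = nn.Linear(d_model, num_classes)

    def forward(self, tokens: torch.Tensor) -> torch.Tensor:
        x = self.emb(tokens)                     # (b, n, d_model)
        feats = [m(x) for m in self.ranges]      # one feature map per range
        return self.out(self.mrfi(feats))        # (b, num_classes) logits
```

For example, `DSAClassifier(vocab_size=10000, d_model=64, num_classes=4)` maps a `(2, 20)` batch of token ids to `(2, 4)` logits. Note that the real DSA presumably learns where and how widely to attend rather than enumerating fixed windows; the fixed window set here only stands in for the multi-range extraction step.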
Keywords/Search Tags: Text Classification, Context Modeling, Self-Attention Models, Feature Integration