
Research On The Model And Method Of Text Classification For Extremely Long Text In Multiple Scenarios

Posted on: 2024-08-14
Degree: Master
Type: Thesis
Country: China
Candidate: J Xiong
Full Text: PDF
GTID: 2568307079460394
Subject: Software engineering
Abstract/Summary:
In practical applications, text classification tasks face two problems: the task may arise in multiple scenarios, and the dataset may contain extremely long texts. On the one hand, the scenario of a text classification task can be complex: it may be a full-resource, low-resource, or even few-shot scenario. Fine-tuning struggles to meet the requirements of low-resource and few-shot scenarios, whereas prompt learning can exploit the knowledge in a pre-trained language model and complete the text classification task with few training samples. Prompt learning transforms text classification into a label-word prediction task; however, the pre-trained language model's vocabulary may not contain the label words, and the model never predicts the label words during pre-training. On the other hand, extremely long texts may exist in the dataset of a text classification task; such texts can be tokenized into several thousand tokens, which is difficult for a pre-trained language model to process. Moreover, the positional encoding or embedding in a pre-trained Transformer model limits the maximum number of tokens the model can handle. Based on these two problems, the main research content of this thesis is as follows:

(1) To solve the problem that the Transformer model cannot handle extremely long text, this thesis designs a position-dependent attention module called PosAttention and, based on it, builds a novel text classification model, Posformer. In this model, a convolution layer is introduced into the attention layer to preserve the positional information of the tokens in the text. As a result, Posformer no longer requires positional encoding or positional embedding, so it can handle text of arbitrary length, making it better suited to text classification tasks that involve extremely long texts. On five baseline text classification datasets, the proposed Posformer approaches or even surpasses the pre-trained language model BERT-Base.

(2) To solve the problem that the pre-trained language model never predicts label words during pre-training, this thesis designs a new meta-learning sampling strategy and, based on it, proposes an online-learnable text classification method, Meta-Prompt. This method introduces meta-learning into the training process, transferring the knowledge of predicting the original word at '[MASK]', learned from the upstream masked-prediction task, to the label-word prediction task of the downstream cloze prompt learning task. Moreover, the sampling strategy of Meta-Prompt gives the text classification model an online learning capability: it learns from a sample to be predicted and then uses the updated model parameters to predict the class of that sample. Meta-Prompt also suits diverse real-world scenarios, as it works well in full-resource, low-resource, and few-shot settings.

(3) Using the Posformer model and the Meta-Prompt text classification method, this thesis implements a public opinion analysis application for extremely long text in multiple scenarios.
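For concreteness, the following is a minimal PyTorch sketch of the idea behind PosAttention as described above: a convolution over the token sequence is combined with standard self-attention, so position information comes from the convolution kernel rather than from a positional encoding, and the layer accepts sequences of any length. The class names, the depthwise-convolution choice, and the additive combination are assumptions for illustration; the thesis's exact formulation may differ.

```python
import torch
import torch.nn as nn

class PosAttention(nn.Module):
    """Self-attention plus a convolutional branch that injects positional
    information, so no positional encoding/embedding is needed.
    Hypothetical sketch: the combination rule is assumed, not taken
    verbatim from the thesis."""

    def __init__(self, d_model, n_heads, kernel_size=3):
        super().__init__()
        self.attn = nn.MultiheadAttention(d_model, n_heads, batch_first=True)
        # Depthwise 1-D convolution over the sequence axis: its kernel mixes
        # neighbouring tokens, so the output depends on token order.
        self.conv = nn.Conv1d(d_model, d_model, kernel_size,
                              padding=kernel_size // 2, groups=d_model)

    def forward(self, x):                       # x: (batch, seq_len, d_model)
        attn_out, _ = self.attn(x, x, x)        # order-invariant on its own
        conv_out = self.conv(x.transpose(1, 2)).transpose(1, 2)
        return attn_out + conv_out              # position-aware combination

# Usage: with no positional-embedding table, sequence length is unbounded.
layer = PosAttention(d_model=256, n_heads=4)
x = torch.randn(2, 5000, 256)                   # a 5000-token "extremely long" input
print(layer(x).shape)                           # torch.Size([2, 5000, 256])
```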
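Similarly, the cloze-prompt formulation that Meta-Prompt builds on can be sketched as follows: the input is wrapped in a template containing a [MASK] token, and the masked-language-model head scores each class's label word at the mask position. The template, the label words, and the bert-base-uncased checkpoint are illustrative assumptions; the meta-learning sampling strategy itself is omitted here, since the abstract does not specify its episode construction.

```python
import torch
from transformers import AutoTokenizer, AutoModelForMaskedLM

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModelForMaskedLM.from_pretrained("bert-base-uncased")

# Hypothetical verbalizer mapping classes to label words in the vocabulary.
label_words = {"positive": "great", "negative": "terrible"}

text = "The plot was gripping from start to finish."
prompt = f"{text} It was {tokenizer.mask_token}."        # cloze template

inputs = tokenizer(prompt, return_tensors="pt")
mask_pos = (inputs.input_ids == tokenizer.mask_token_id).nonzero()[0, 1]

with torch.no_grad():
    logits = model(**inputs).logits[0, mask_pos]         # vocab logits at [MASK]

# Score each class by the logit of its label word; the class whose label
# word the MLM head prefers at the mask position wins.
scores = {c: logits[tokenizer.convert_tokens_to_ids(w)].item()
          for c, w in label_words.items()}
print(max(scores, key=scores.get))                       # e.g. "positive"
```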
Keywords/Search Tags:Text Classification, Attention Mechanism, Meta Learning, Cloze Prompt Learning, Public Opinion Analysis