
Research On The Generalization Ability Of Pre-training Model Of Multi-task Learning Combination

Posted on: 2022-11-13    Degree: Master    Type: Thesis
Country: China    Candidate: Z H Lv    Full Text: PDF
GTID: 2518306773990499    Subject: Publishing
Abstract/Summary:
Since Google Brain introduced the Transformer model in 2017, a large number of pre-trained language models built on it, such as BERT and RoBERTa, have emerged. Although fine-tuning lets a pre-trained model achieve good results on individual datasets, its generalization ability remains limited: with the parameters held fixed, there is no guarantee that the model will fit other datasets well, so its out-of-the-box universality is poor.

To improve the generalization ability of the pre-trained model, we choose BERT as the base model and introduce a multi-task learning combination algorithm in the fine-tuning stage. At the data level, we use easy data augmentation (EDA) to expand the sample set. At the algorithm level, we first optimize the downstream network structure, replacing the fully connected (FC) head with an LSTM; second, we propose a new multi-task parameter-sharing structure in which the pre-trained model serves as the shared-parameter layer; finally, we optimize the downstream loss function by summing the weighted losses produced on each dataset, then back-propagating the gradient and updating the weight parameters.

Experiments show that the multi-task learning combination method achieves a significant improvement on public datasets compared with traditional methods.
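As a minimal sketch of the data-level step: EDA conventionally comprises four operations (synonym replacement, random insertion, random swap, random deletion). The sketch below implements the two that need no external lexicon; all function names and parameter values are our own illustrative choices, not taken from the thesis.

```python
import random


def random_swap(words, n_swaps, rng):
    """Swap two randomly chosen word positions, n_swaps times."""
    words = words[:]
    for _ in range(n_swaps):
        i, j = rng.sample(range(len(words)), 2)
        words[i], words[j] = words[j], words[i]
    return words


def random_deletion(words, p_del, rng):
    """Drop each word with probability p_del; never return an empty sentence."""
    kept = [w for w in words if rng.random() > p_del]
    return kept or [rng.choice(words)]


def eda_augment(sentence, num_aug=4, p_del=0.1, seed=0):
    """Produce num_aug augmented variants of one sentence (swap/deletion only)."""
    rng = random.Random(seed)
    words = sentence.split()
    out = []
    for k in range(num_aug):
        if k % 2 == 0:
            out.append(" ".join(random_swap(words, 1, rng)))
        else:
            out.append(" ".join(random_deletion(words, p_del, rng)))
    return out
```

In the full EDA recipe, synonym replacement and random insertion would additionally draw on a synonym resource such as WordNet; the two operations above preserve the original vocabulary, which makes them easy to verify.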
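The weighted multi-task loss can be illustrated with a toy numeric example: a single scalar parameter stands in for the shared pre-trained layers, each task has a hypothetical scalar "head", and gradient descent is run on the weighted sum of the per-task losses using a numeric gradient. Everything here (the functions, the task data, the weights) is an illustrative assumption, not the thesis's actual model.

```python
def shared_encoder(x, w):
    """Stands in for the shared pre-trained (parameter-sharing) layer."""
    return w * x


def weighted_multitask_loss(w, tasks, weights):
    """Weighted sum of per-task squared errors, as in the fine-tuning scheme."""
    total = 0.0
    for (head, x, y), lam in zip(tasks, weights):
        pred = head * shared_encoder(x, w)   # task-specific head on shared output
        total += lam * (pred - y) ** 2
    return total


def sgd_step(w, tasks, weights, lr=0.05, eps=1e-6):
    """One update of the shared parameter via a central-difference gradient."""
    g = (weighted_multitask_loss(w + eps, tasks, weights)
         - weighted_multitask_loss(w - eps, tasks, weights)) / (2 * eps)
    return w - lr * g


# Two toy tasks: (head, input, target). Both are minimized at w = 2.
tasks = [(1.0, 2.0, 4.0), (0.5, 3.0, 3.0)]
weights = [0.6, 0.4]     # hypothetical per-task loss weights

w = 0.0
before = weighted_multitask_loss(w, tasks, weights)
for _ in range(50):
    w = sgd_step(w, tasks, weights)
after = weighted_multitask_loss(w, tasks, weights)
```

Because both toy losses share the same optimum, the weighted sum drives the shared parameter toward it; in the real setting the tasks pull the shared BERT layers in different directions, and the weights trade those pulls off against each other.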
Keywords/Search Tags: Pre-train, Generality, Shared parameter, Multi-task, Bert