
Research On The Generalization Ability Of Pre-training Model Of Multi-task Learning Combination

Posted on: 2022-11-13    Degree: Master    Type: Thesis
Country: China    Candidate: Z H Lv    Full Text: PDF
GTID: 2518306773990499    Subject: Publishing
Abstract/Summary:
Since Google Brain introduced the Transformer model in 2017, a large number of pre-trained language models built on it, such as BERT and RoBERTa, have emerged. Although fine-tuning lets a pre-trained model achieve good results on individual datasets, its generalization ability remains limited: with the parameters held fixed, there is no guarantee that the model will fit other datasets well, so its out-of-the-box universality is poor.

To improve the generalization ability of the pre-trained model, we choose BERT as the base model and introduce a multi-task learning combination algorithm in the fine-tuning stage. At the data level, we use easy data augmentation (EDA) to expand the sample set. At the algorithm level, we first optimize the downstream network structure, replacing the fully connected (FC) head with an LSTM; second, we propose a new multi-task parameter-sharing structure in which the pre-trained model serves as the shared-parameter layer; finally, we optimize the downstream loss function by summing the weighted losses produced on each dataset, then back-propagating the gradient and updating the weight parameters.

Experiments show that the multi-task learning combination method achieves a significant improvement on public datasets compared with traditional methods.
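As a minimal sketch of the data-level step: EDA conventionally comprises four operations (synonym replacement, random insertion, random swap, random deletion). The sketch below implements the two that need no external lexicon; all function names and parameter values are our own illustrative choices, not taken from the thesis.

```python
import random


def random_swap(words, n_swaps, rng):
    """Swap two randomly chosen word positions, n_swaps times."""
    words = words[:]
    for _ in range(n_swaps):
        i, j = rng.sample(range(len(words)), 2)
        words[i], words[j] = words[j], words[i]
    return words


def random_deletion(words, p_del, rng):
    """Drop each word with probability p_del; never return an empty sentence."""
    kept = [w for w in words if rng.random() > p_del]
    return kept or [rng.choice(words)]


def eda_augment(sentence, num_aug=4, p_del=0.1, seed=0):
    """Produce num_aug augmented variants of one sentence (swap/deletion only)."""
    rng = random.Random(seed)
    words = sentence.split()
    out = []
    for k in range(num_aug):
        if k % 2 == 0:
            out.append(" ".join(random_swap(words, 1, rng)))
        else:
            out.append(" ".join(random_deletion(words, p_del, rng)))
    return out
```

In the full EDA recipe, synonym replacement and random insertion would additionally draw on a synonym resource such as WordNet; the two operations above preserve the original vocabulary, which makes them easy to verify.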
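The weighted multi-task loss can be illustrated with a toy numeric example: a single scalar parameter stands in for the shared pre-trained layers, each task has a hypothetical scalar "head", and gradient descent is run on the weighted sum of the per-task losses using a numeric gradient. Everything here (the functions, the task data, the weights) is an illustrative assumption, not the thesis's actual model.

```python
def shared_encoder(x, w):
    """Stands in for the shared pre-trained (parameter-sharing) layer."""
    return w * x


def weighted_multitask_loss(w, tasks, weights):
    """Weighted sum of per-task squared errors, as in the fine-tuning scheme."""
    total = 0.0
    for (head, x, y), lam in zip(tasks, weights):
        pred = head * shared_encoder(x, w)   # task-specific head on shared output
        total += lam * (pred - y) ** 2
    return total


def sgd_step(w, tasks, weights, lr=0.05, eps=1e-6):
    """One update of the shared parameter via a central-difference gradient."""
    g = (weighted_multitask_loss(w + eps, tasks, weights)
         - weighted_multitask_loss(w - eps, tasks, weights)) / (2 * eps)
    return w - lr * g


# Two toy tasks: (head, input, target). Both are minimized at w = 2.
tasks = [(1.0, 2.0, 4.0), (0.5, 3.0, 3.0)]
weights = [0.6, 0.4]     # hypothetical per-task loss weights

w = 0.0
before = weighted_multitask_loss(w, tasks, weights)
for _ in range(50):
    w = sgd_step(w, tasks, weights)
after = weighted_multitask_loss(w, tasks, weights)
```

Because both toy losses share the same optimum, the weighted sum drives the shared parameter toward it; in the real setting the tasks pull the shared BERT layers in different directions, and the weights trade those pulls off against each other.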
Keywords/Search Tags: Pre-train, Generality, Shared parameter, Multi-task, Bert