Since Google Brain introduced the Transformer model in 2017, a large number of pre-trained language models, such as BERT and RoBERTa, have emerged, all of which are based on the Transformer. Although fine-tuning allows a pre-trained model to achieve good results on different datasets, its generalization ability is limited: if the parameters are kept constant, there is no guarantee that the model will fit other datasets well, so its basic universality is poor. Aiming to improve the generalization ability of the pre-trained model, we choose BERT as our pre-training model and introduce a multi-task learning combination algorithm in the fine-tuning stage. At the data level, we use easy data augmentation (EDA) to expand the sample set. At the algorithm level, we first optimize the downstream network structure, replacing the fully connected (FC) layer with an LSTM. Second, we propose a new multi-task parameter-sharing structure in which the pre-trained model serves as the parameter-sharing layer. Finally, we optimize the loss function of the downstream tasks by summing the weighted losses generated by each dataset, then performing gradient back-propagation and updating the weight parameters. Experiments show that the multi-task learning combination method achieves a significant improvement on public datasets compared with traditional methods.
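The weighted multi-task loss described above can be sketched in a few lines; the loss values, weights, and function name below are hypothetical placeholders, since in practice each loss would come from a dataset-specific head that shares the BERT encoder as its parameter-sharing layer:

```python
def combined_multitask_loss(task_losses, task_weights):
    """Sum the per-dataset losses weighted by per-task coefficients.

    In the fine-tuning scheme above, back-propagating this single scalar
    would update both the task heads and the shared encoder parameters.
    """
    assert len(task_losses) == len(task_weights)
    return sum(w * loss for w, loss in zip(task_weights, task_losses))

# Hypothetical losses produced by three dataset-specific heads:
losses = [0.42, 0.78, 0.31]
weights = [0.5, 0.3, 0.2]  # assumed fixed weights; they could also be learned
total = combined_multitask_loss(losses, weights)  # 0.5*0.42 + 0.3*0.78 + 0.2*0.31
```

Summing the weighted losses into one scalar is what lets a single backward pass drive gradients through the shared pre-trained layers from all tasks at once.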