
Research On Text Summarization Based On Knowledge Distillation

Posted on: 2023-05-10
Degree: Master
Type: Thesis
Country: China
Candidate: T Dong
Full Text: PDF
GTID: 2568306827475494
Subject: Software Engineering

Abstract/Summary:
Text summarization technology converts a long document into a short summary quickly and effectively, retaining the main content and ideas of the document while ensuring that the summary is neither redundant nor repetitive. Text summarization is now widely used in industries such as news and information retrieval, allowing people to obtain the key information of long texts efficiently in daily life. However, many problems remain in the text summarization task. For example, a summary may generate the same words repeatedly, and out-of-vocabulary words that do not appear in the vocabulary cannot be generated in the summary. In addition, the growing scale of deep learning models makes text summarization models difficult to deploy.

To address these problems, this thesis proposes a text summarization model based on knowledge distillation and a pointer-generator network, in which a student model learns the summary generation ability of a pre-trained teacher model through knowledge distillation. The model uses the multi-head attention mechanism and the coverage mechanism to balance word weights and avoid repetition, and it can copy words from the input through a copy mechanism, solving the out-of-vocabulary problem.

To improve summary generation further, building on this work, the thesis proposes a similarity-guided text summarization model based on knowledge distillation and the Transformer. While retaining the knowledge distillation method, it balances word weights with the Transformer's self-attention mechanism and designs a novel copy mechanism that takes the previously generated word into account to handle out-of-vocabulary words better. In addition, a similarity-based loss function is proposed to guide the model toward summaries closer to the content of the input document.

Experimental results on the Gigaword English text summarization dataset and a Weibo Chinese text summarization dataset show that the two proposed models outperform the baseline models on the ROUGE evaluation metric, demonstrating their effectiveness in the text summarization task. The thesis also compares the training time of the student model and the teacher model and presents examples of generated summaries, further demonstrating the practical application value of the proposed models.
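The abstract does not give the exact distillation objective used in the thesis; as a minimal sketch, the standard temperature-scaled soft-target loss from Hinton et al.'s knowledge distillation, which a student model would minimize against the teacher's output distribution over the vocabulary, might look like the following (the function names and the temperature value are illustrative, not taken from the thesis):

```python
import math

def softmax(logits, temperature=1.0):
    """Temperature-scaled softmax over a list of raw logits."""
    scaled = [z / temperature for z in logits]
    m = max(scaled)                      # subtract max for numerical stability
    exps = [math.exp(z - m) for z in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def distillation_loss(student_logits, teacher_logits, temperature=2.0):
    """KL divergence from the teacher's softened distribution to the
    student's, scaled by T^2 as in the standard distillation objective."""
    p = softmax(teacher_logits, temperature)   # teacher soft targets
    q = softmax(student_logits, temperature)   # student predictions
    kl = sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)
    return temperature ** 2 * kl

# Identical logits give zero loss; diverging logits give a positive loss.
loss_same = distillation_loss([1.0, 2.0, 3.0], [1.0, 2.0, 3.0])
loss_diff = distillation_loss([3.0, 2.0, 1.0], [1.0, 2.0, 3.0])
```

A higher temperature softens both distributions, so the student also learns from the relative probabilities the teacher assigns to non-top words, which is the "dark knowledge" that plain one-hot training discards.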
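The evaluation above is reported in terms of ROUGE; for readers unfamiliar with the metric, a minimal unigram ROUGE-1 computation (an illustrative sketch, not the official ROUGE toolkit, which also handles stemming and longer n-grams) can be written as:

```python
from collections import Counter

def rouge_1(candidate, reference):
    """Unigram overlap between a candidate summary and a reference.
    Returns (recall, precision, F1) over whitespace-split tokens."""
    cand = Counter(candidate.lower().split())
    ref = Counter(reference.lower().split())
    overlap = sum((cand & ref).values())       # clipped unigram matches
    recall = overlap / max(sum(ref.values()), 1)
    precision = overlap / max(sum(cand.values()), 1)
    f1 = 2 * recall * precision / (recall + precision) if overlap else 0.0
    return recall, precision, f1

# 5 of the 6 reference unigrams are matched ("sat" vs "lay" differ).
r, p, f = rouge_1("the cat sat on the mat", "the cat lay on the mat")
```

ROUGE recall rewards covering the reference content, while precision penalizes padding the summary with extra words; summarization papers typically report the F1 balance of the two.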
Keywords/Search Tags: Knowledge Distillation, Text Summarization, Natural Language Processing, Attention Mechanism, Copy Mechanism