| With the continuous development of Internet technology, society has entered the era of big data, in which the amount of text data has grown exponentially. It is difficult for people to quickly identify the information that meets their needs in such a massive volume of text. Keywords are highly condensed expressions of a text's topic information and help readers grasp its core content quickly. Keywords can also be applied to natural language processing tasks such as text classification, document retrieval, automatic summarization, and recommendation systems. Keyword extraction technology has therefore become particularly important. Nevertheless, traditional keyword extraction models have two shortcomings: 1) most of them can only extract keywords that appear in the original text; 2) they rely mainly on shallow features of the text to select important words, so it is difficult for them to mine and make full use of the text's latent semantic information. In recent years, neural network-based keyword generation models have largely overcome these limitations of extraction models, but the keywords they produce still tend to deviate from the original content. To alleviate these problems, we combine keyword extraction models with keyword generation models so that the model can quickly focus on the core content of the original text, and we try a variety of fusion methods to improve the quality of the generated keywords. First, when people assign keywords to a text, they usually extract the important information from it and then generate keywords based on their understanding of that information. Based on this observation, we propose to extract words and sentences carrying important information from the text with the TextRank algorithm and then combine the extracted information with a deep learning model. We propose two fusion schemes. The first integrates the extracted important information into the attention mechanism. The second encodes the extracted information to obtain a context representation of the important information; this representation and the context representation of the original text jointly determine the output of the decoder. Experimental results on the academic paper dataset KP20K verify the effectiveness of both fusion models. In addition, experimental results on the news dataset DUC-2001 show that the first fusion method has better domain adaptability. Second, most existing generative models consider only the content of the text itself and seldom make full use of the important sentences and phrases in the text to guide keyword generation. In view of this, we propose a keyword generation model guided by multi-granularity important information. In this model, the extracted phrases and sentences are used as additional input for multi-granularity encoding, and context vectors that reflect the important information of the text are then obtained through an attention matching mechanism. Finally, these vectors are integrated into the sequence encoding layer together with the encoding vectors of the original text, strengthening the model's ability to summarize the important information. Experimental results on the KP20K dataset verify the feasibility and effectiveness of the model. |
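
The extraction step described above relies on TextRank. As a rough illustration only, a word-level variant might be sketched as follows. The abstract does not give the configuration used in this work, so the function name `textrank_keywords`, the co-occurrence window, the damping factor of 0.85, the unweighted edges, and the pre-tokenized input are all assumptions; sentence-level extraction would need a sentence-similarity graph instead.

```python
from collections import defaultdict


def textrank_keywords(tokens, window=2, damping=0.85, iterations=50, top_k=3):
    """Rank words with a TextRank-style score (hypothetical helper, not the thesis code)."""
    # Build an undirected co-occurrence graph: two distinct words are linked
    # if they appear within `window` tokens of each other.
    neighbors = defaultdict(set)
    for i, word in enumerate(tokens):
        for other in tokens[i + 1 : i + 1 + window]:
            if other != word:
                neighbors[word].add(other)
                neighbors[other].add(word)

    # Repeatedly apply the PageRank update on the unweighted graph.
    scores = {word: 1.0 for word in neighbors}
    for _ in range(iterations):
        scores = {
            word: (1 - damping)
            + damping * sum(scores[n] / len(neighbors[n]) for n in neighbors[word])
            for word in neighbors
        }

    # Highest score first; ties are broken alphabetically for determinism.
    ranked = sorted(scores, key=lambda w: (-scores[w], w))
    return ranked[:top_k]


# Illustrative input: stop words are assumed to have been removed already.
example_tokens = (
    "keyword generation model combines keyword extraction model "
    "attention mechanism guides keyword generation"
).split()
print(textrank_keywords(example_tokens))
```

In a pipeline like the one outlined in the abstract, the top-ranked words (and, analogously, sentences) would then be passed to the generation model, either through its attention mechanism or through a separate encoder.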