Research On Text Generation And Application Based On Pointer Generation Network

Posted on: 2022-11-05
Degree: Master
Type: Thesis
Country: China
Candidate: S Qin
GTID: 2518306746483084
Subject: Computer technology
Abstract/Summary:
With the advent of the Internet era, access to information is no longer difficult. People face an information explosion at every moment: breaking news, celebrity news, political turmoil, economic changes, and more. Amid this flood, quickly distinguishing what is needed from what is not has become an urgent problem. Text summarization is a good solution to this problem: it condenses the important information in a long text into a summary, so that readers can better judge the value of the information. In this paper, we improve the existing pointer generation network model to obtain better word embeddings for Chinese summarization and a better feature representation of the text. The research conducted in this paper is as follows.

First, the pointer generation network (PGN) suffers from inaccurate Chinese semantic representation, insufficient feature extraction, and missing Chinese lexical information. This paper therefore improves the word embedding layer by adding the ERNIE pre-trained model, which specifically models the missing Chinese lexical information and greatly enhances the model's understanding of Chinese semantics, making its semantic representation of Chinese more accurate.

Second, to address the inadequate extraction of text features when the generation model processes long documents, this paper adds a convolutional neural network (CNN) to extract features from long text, with further feature optimization in the CNN after pre-training. Combined with the advantages of the pointer generation network in handling out-of-vocabulary (OOV) words and repetition, this improves the quality of the generated summaries. The result is a new text summarization model, the ECPGN model.

Third, comparative experiments are conducted on the unified NLPCC2017 dataset, with loss values and ROUGE scores as the evaluation metrics. Using the same experimental parameters and equipment, five sets of comparison experiments are set up against TextRank, Seq2Seq, Seq2Seq with attention, PGN, and BERT+PGN. Judged jointly by loss value and ROUGE score, the proposed model generates better summaries than the other models, indicating that the improved model produces higher-quality summaries. Because the gains in the evaluation metrics are not pronounced, the model needs further improvement in future work.

Finally, a text summarization system is designed on the basis of this model. Starting from the needs of judicial workers, the problems to be solved are studied; a detailed system design is then carried out around these problems and needs, and a number of modules are designed. The modules are implemented and tested on real text data. The text summarization system passed its tests, completing the implementation of the design.
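The abstract credits the pointer generation network with handling OOV words but gives no equations. As a minimal sketch, assuming the standard pointer-generator formulation rather than the ECPGN implementation itself, the final word distribution mixes a generation distribution over the fixed vocabulary with a copy distribution given by the attention weights over the source tokens: P(w) = p_gen * P_vocab(w) + (1 - p_gen) * (sum of attention on source positions holding w). The function name, the toy vocabulary, and all numbers below are hypothetical and chosen only for illustration.

```python
from collections import defaultdict


def final_distribution(p_gen, vocab_dist, attention, source_tokens):
    """Mix generation and copy probabilities (standard pointer-generator form).

    p_gen         -- probability of generating from the fixed vocabulary (0..1)
    vocab_dist    -- dict: vocabulary word -> generation probability
    attention     -- list of attention weights, one per source position
    source_tokens -- list of source tokens, aligned with `attention`
    """
    if len(attention) != len(source_tokens):
        raise ValueError("attention and source_tokens must have equal length")

    mixed = defaultdict(float)
    # Generation part: scale the vocabulary distribution by p_gen.
    for word, prob in vocab_dist.items():
        mixed[word] += p_gen * prob
    # Copy part: each source token receives (1 - p_gen) times its attention.
    # Repeated source tokens accumulate, and tokens outside the vocabulary
    # still get non-zero probability, which is how OOV words can be produced.
    for token, weight in zip(source_tokens, attention):
        mixed[token] += (1.0 - p_gen) * weight
    return dict(mixed)


# Hypothetical toy example: "Qin" is not in the vocabulary (OOV).
vocab_dist = {"court": 0.6, "ruled": 0.3, "<unk>": 0.1}
source_tokens = ["Qin", "court", "Qin"]
attention = [0.5, 0.2, 0.3]

dist = final_distribution(0.75, vocab_dist, attention, source_tokens)
for word, prob in sorted(dist.items(), key=lambda kv: -kv[1]):
    print(f"{word}\t{prob:.3f}")
```

In this toy case the OOV token "Qin" receives probability only through the copy term, while "court" receives mass from both terms. Because both input distributions sum to one, the mixed distribution does as well.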
Keywords/Search Tags: Pointer Generation Network, Text Summarization Generation System, ERNIE, CNN, Deep Learning