Research On Government Document Abstract Algorithm Based On Deep Learning

Posted on: 2021-05-30   Degree: Master   Type: Thesis
Country: China   Candidate: W Z Wang   Full Text: PDF
GTID: 2416330623467817   Subject: Computer Science and Technology
Abstract/Summary:
As emerging information applications such as Smart City and e-government draw national attention, China's e-government leading group emphasizes that governments at all levels should use innovative methods to optimize public services and improve work efficiency. Document processing is one of the main administrative tasks in government affairs. Traditionally, staff must read each document in full, extract the important information, and then implement or relay it. This manual process is slow and consumes considerable human resources. Yet not every document needs to be read carefully: for many government documents, it is enough to grasp the main points that must be carried out or conveyed. Automatic Text Summarization (ATS), a task in Natural Language Processing (NLP), compresses the information in given documents into a short, comprehensive summary. Adopting ATS techniques in government document processing can ease the burden of manual processing and improve work efficiency by saving labor.

This thesis is based on a laboratory project on intelligent summarization of government documents. Its main work is to develop a suitable deep-learning-based ATS algorithm for government document processing and, by combining several techniques, to meet the project's acceptance criteria. The main contributions are as follows:

1. Modifications to graph-based ranking algorithms and to supervised extractive ATS algorithms. We first survey extractive ATS algorithms and reproduce both the graph-based ranking methods and the supervised extractive methods on a government document dataset. By analyzing the experimental results, we identify the drawbacks these algorithms show on the government document dataset and propose modifications to each of the two methods.

2. Modifications to abstractive ATS algorithms. We test several popular abstractive ATS models, including the ABS model, the RAS model, CopyNet, and the ML+RL model, and compare their performance on the government document dataset. Based on the analysis of the experimental results, CopyNet is chosen as the baseline model. We improve this model and obtain a performance gain on the government document dataset.

3. Combination of extractive and abstractive (generative) summarization algorithms. We explore combination frameworks and develop two that differ in focus and in how the two approaches are combined: a cascaded framework based on threshold regulation and a direct framework based on reinforcement learning. In the final experimental tests, both frameworks reach the predefined acceptance criteria for government document processing.
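To make the graph-based ranking family mentioned in the first contribution concrete, here is a minimal sketch of a TextRank-style extractive summarizer. It is not the thesis's algorithm or its modifications, which the abstract does not specify. The function names (`split_sentences`, `similarity`, `rank_sentences`, `summarize`), the damping factor, and the iteration count are illustrative assumptions. The tokenizer assumes space-delimited text, so Chinese government documents would first need a word segmenter.

```python
import math
import re


def split_sentences(text):
    """Split text into sentences at ., ! or ? followed by whitespace."""
    parts = re.split(r"(?<=[.!?])\s+", text.strip())
    return [p for p in parts if p]


def similarity(a, b):
    """Word overlap normalized by log sentence lengths (TextRank-style)."""
    wa = set(re.findall(r"\w+", a.lower()))
    wb = set(re.findall(r"\w+", b.lower()))
    # With fewer than two distinct words the log term is zero or undefined.
    if len(wa) < 2 or len(wb) < 2:
        return 0.0
    return len(wa & wb) / (math.log(len(wa)) + math.log(len(wb)))


def rank_sentences(sentences, damping=0.85, iterations=50):
    """Score sentences with weighted PageRank over the similarity graph."""
    n = len(sentences)
    weights = [
        [0.0 if i == j else similarity(sentences[i], sentences[j])
         for j in range(n)]
        for i in range(n)
    ]
    out_sum = [sum(row) for row in weights]
    scores = [1.0] * n
    for _ in range(iterations):
        new_scores = []
        for i in range(n):
            # Each neighbor j passes on its score in proportion to edge weight.
            incoming = sum(
                weights[j][i] / out_sum[j] * scores[j]
                for j in range(n)
                if out_sum[j] > 0
            )
            new_scores.append((1 - damping) + damping * incoming)
        scores = new_scores
    return scores


def summarize(text, k=2):
    """Return the k highest-scoring sentences in their original order."""
    sentences = split_sentences(text)
    scores = rank_sentences(sentences)
    top = sorted(range(len(sentences)), key=lambda i: scores[i], reverse=True)[:k]
    return [sentences[i] for i in sorted(top)]
```

Calling `summarize(document_text, k=3)` returns the three sentences most strongly connected to the rest of the document. A sentence that shares no words with any other receives only the base score `1 - damping`, so it is ranked last.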
Keywords/Search Tags:Government Document, Deep Learning, Text Summarization, Fusion method