| Because emergencies have a direct, destructive, and significant impact on human society and the environment, the public takes great interest in them. In recent decades such events have broken out frequently, and in the current age of epidemics, sudden public health incidents in particular have received much attention. In today’s mobile Internet age, the Internet helps the public obtain information on emergencies and carry out rescue operations. At the same time, however, information on the Internet is so mixed that it is difficult for the public to understand an event easily, comprehensively, and systematically, especially when the event’s impact lasts a long time. It is therefore of practical significance to extract the important information from a large, disorderly volume of event texts and to present the development of an event concisely in chronological order. Automatic summarization, which extracts the most important information from a given text, is an important research direction toward this goal. By studying automatic summarization, this paper designs the SCC-UNILM generation model and a method for constructing the time evolution sequence of an emergency event, and completes the design and implementation of a text information extraction system. The main contents of this paper are as follows: 1. This paper analyzes the problems of current summarization algorithms. In view of these problems and the aims of this paper, the UNILM model is selected as the base model; a copy mechanism is integrated into it, the traditional softmax is replaced by sparse softmax, and a coverage loss is added to the overall loss function. These improvements to the UNILM generative algorithm yield the proposed SCC-UNILM generative model. With ROUGE scores as the evaluation metric, experiments on the LCSTS dataset verify that the improved SCC-UNILM model outperforms the other benchmark models. 2. Using the improved SCC-UNILM algorithm, a time series generation algorithm, and a redundancy elimination algorithm, this paper designs a method for constructing the time evolution sequence of an emergency event. The method consists of three modules: a text preprocessing module, a text clustering and deduplication module, and an emergency event time evolution sequence generation module, the last of which is the core module. To retain as much useful information as possible, the text is summarized in units of natural paragraphs. Because a time series must be constructed, every paragraph needs a timestamp, so a time series generation algorithm is designed in this paper. However, retaining as much useful information as possible also introduces redundant, duplicated information. This paper therefore also proposes a sentence similarity calculation combined with semantic role labeling to remove the redundant information. Finally, the complete event time evolution sequence is organized according to the structure of event background, event process, and event impact. 3. This paper designs a text information extraction system. The system comprises a text preprocessing module, a text information extraction module, and an event time evolution sequence generation module. It mainly implements word segmentation and part-of-speech tagging, coreference resolution, semantic role labeling, keyword extraction, text summary generation, and event time evolution sequence generation. |
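
The abstract states that a coverage loss is added to the overall loss function but does not give its form. As a rough illustration only, the sketch below assumes the coverage formulation commonly used in pointer-generator summarizers: at each decoding step, the penalty is the overlap between the current attention distribution and the attention accumulated over earlier steps. The function names `coverage_loss` and `total_loss`, the `coverage_weight` parameter, and the averaging over steps are hypothetical choices made for this example, not details taken from the paper.

```python
from typing import List, Sequence


def coverage_loss(attention_steps: Sequence[Sequence[float]]) -> float:
    """Mean per-step coverage penalty for one decoded sequence.

    attention_steps[t][i] is the attention weight placed on source token i
    at decoding step t (each row is assumed to sum to 1).
    """
    if not attention_steps:
        return 0.0
    source_len = len(attention_steps[0])
    coverage: List[float] = [0.0] * source_len  # attention accumulated so far
    total = 0.0
    for attention in attention_steps:
        # Penalize re-attending to source tokens that are already covered.
        total += sum(min(a, c) for a, c in zip(attention, coverage))
        # Update coverage only after the penalty for this step is computed.
        coverage = [c + a for c, a in zip(coverage, attention)]
    return total / len(attention_steps)


def total_loss(nll: float,
               attention_steps: Sequence[Sequence[float]],
               coverage_weight: float = 1.0) -> float:
    """Combine a generation loss with the weighted coverage penalty.

    `nll` stands in for the model's usual negative log-likelihood;
    `coverage_weight` is an assumed hyperparameter.
    """
    return nll + coverage_weight * coverage_loss(attention_steps)


if __name__ == "__main__":
    repeated = [[1.0, 0.0], [1.0, 0.0]]  # attends to the same token twice
    spread = [[1.0, 0.0], [0.0, 1.0]]    # attends to a new token each step
    print(coverage_loss(repeated), coverage_loss(spread))
```

Under this formulation, a decoder that keeps attending to the same source token is penalized, while one that moves on to uncovered tokens is not, which is the usual motivation for adding such a term to reduce repetition in generated summaries.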