Design And Implementation Of A Storyline Generation System For Hot Event

Posted on: 2023-01-22
Degree: Master
Type: Thesis
Country: China
Candidate: S Y Li
GTID: 2568306815462344
Subject: Computer technology
Abstract/Summary:
The rapid development of the Internet has greatly promoted the growth of online media, and online news has become the main channel through which people follow current events and trending social topics. Xinhua, Sina News, People’s Daily and other Internet media outlets produce a large volume of news reports, which spread and are shared quickly online. Apart from general current-affairs and social news, most reports focus on hot events that attract wide attention. News about hot events appears continuously and comes from a wide range of sources. This information is voluminous, complex, redundant and sometimes useless, yet it contains the key details and context of hot events, and extracting the valuable parts costs users considerable time and effort.

In recent years, research on news storylines has attracted increasing attention. A storyline aims to analyze and mine how an event develops and evolves, and to present the result to users in an intuitive, coherent and logical structure, so that they can grasp both the overall development and the details of the event quickly and easily. Current research on and applications of storylines focus mainly on two problems. The first is the control of event granularity: an appropriate granularity allows the extracted events to represent the text collection well. The second is the storyline construction method: a readable construction helps users understand the development of events intuitively. Against this background, this thesis studies a storyline generation scheme for hot news events and develops a storyline generation system. The main work is as follows:

(1) A storyline generation scheme for hot events is proposed. The scheme consists of an event identification module, a hot spot identification module and a storyline generation module. The event identification module first extracts news keywords using a word-frequency measure based on the Pareto distribution together with a weight based on the distribution of positional features in news text. It then finds events with a double-layer clustering method that combines Louvain and Single-Pass clustering in a static and dynamic manner, with two improvements to Single-Pass: a centroid positioning scheme is designed to represent each event cluster, and a time decay function is integrated into the content similarity measure. The hot spot identification module analyzes the characteristics of hot news and designs a hot spot identification algorithm based on the word-frequency growth rate, the duration of event coverage and the number of reports, in order to pick out hot events from the identified events. The storyline generation module starts from the way events evolve and uses a story tree with a trunk-and-branch structure to show the overall course of an event, addressing the unclear logic and locally limited view of current storylines.

(2) Based on this scheme, a storyline generation system is designed and implemented. The system has five modules: a news collection and preprocessing module, a news classification module, an event identification module, a hot spot identification module and a storyline construction module. The system uses crawler technology to collect news web page data, and then applies data preprocessing, text classification, keyword extraction, event extraction, heat calculation and storyline construction to the crawled results, using the models and algorithms of the storyline generation process. Finally, the classification results, generated events, hot events and storylines are displayed through a visual front end, providing all-round, multi-dimensional analysis and presentation that lets users easily grasp the overall development trend of news topics.
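As a rough illustration of the improved Single-Pass step described in (1), the sketch below assigns each incoming news item to the most similar existing event cluster, or opens a new cluster when no similarity reaches a threshold. The abstract does not give the actual formulas, so everything specific here is an assumption made for illustration: the function names `cosine` and `single_pass`, the bag-of-words cosine similarity, the mean-vector centroid, the exponential form of the time decay, and the parameter values `threshold` and `decay`.

```python
import math
from collections import Counter


def cosine(a, b):
    """Cosine similarity between two sparse term-weight mappings."""
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def single_pass(docs, threshold=0.3, decay=0.1):
    """Cluster (day, tokens) pairs in arrival order.

    Assumed scoring: cosine(doc, centroid) * exp(-decay * gap_in_days),
    where the gap is measured from the cluster's most recent report.
    """
    clusters = []
    for day, tokens in docs:
        vec = Counter(tokens)
        best, best_sim = None, 0.0
        for cluster in clusters:
            gap = abs(day - cluster["last_day"])
            sim = cosine(vec, cluster["centroid"]) * math.exp(-decay * gap)
            if sim > best_sim:
                best, best_sim = cluster, sim
        if best is not None and best_sim >= threshold:
            best["members"].append((day, tokens))
            n = len(best["members"])
            # Update the centroid as the running mean of member vectors.
            terms = set(best["centroid"]) | set(vec)
            best["centroid"] = {
                t: (best["centroid"].get(t, 0.0) * (n - 1) + vec.get(t, 0)) / n
                for t in terms
            }
            best["last_day"] = max(best["last_day"], day)
        else:
            # No sufficiently similar cluster: start a new event.
            clusters.append(
                {"centroid": dict(vec), "members": [(day, tokens)], "last_day": day}
            )
    return clusters
```

With this scoring, two reports with the same wording that are published weeks apart fall into different clusters, because the decay term pushes their similarity below the threshold. That is the intended effect of folding time into the content similarity; how strongly to decay is a tuning choice the abstract does not specify.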
Keywords/Search Tags: Hot event, Keyword extraction, Event identification, Clustering, Storyline