The Research And Implementation Of Extracting Important People And Event Evolution Based On Internet News

Posted on: 2019-01-05
Degree: Master
Type: Thesis
Country: China
Candidate: C Z Wu
Full Text: PDF
GTID: 2348330542498761
Subject: Computer Science and Technology
Abstract/Summary:
With the rapid growth of the Internet, reading news online has become one of the major ways to get news. Unlike news in traditional media, Internet news is timelier, which attracts readers who want to understand what is happening now. However, because anyone can publish articles on the Internet, the quality of online content is not always satisfactory. It is hard for readers to quickly identify which people and events are important within a news topic and how those events developed. Under these circumstances, it is important to study how to extract people and events, detect the evolution among those events, and present the results to readers to help them understand the news they are interested in. This thesis focuses on text modelling, event detection, and event evolution extraction, using related technologies to extract important people, events, and event evolution effectively and efficiently. The main contributions are as follows:

1. Internet news is collected and processed with a tokenizer, which segments articles into words and recognizes entities. A people co-occurrence network is constructed from the co-occurrence relations among words. The effectiveness of the TOPSIS algorithm is improved using the Euclidean distance between words, and the improved algorithm is used to evaluate the importance of people based on complex-network centralities.

2. A novel text model is proposed based on word embeddings trained with the Word2Vec model, and its efficiency is improved through a parallel computing model. News events are then extracted with a clustering algorithm.

3. Events are modelled through feature words, and a novel model based on event time, the random walk model, and cosine similarity is proposed to calculate the evolution relations between events. The result of this algorithm is presented as an event evolution graph. Part of this model runs on a parallel computing model to improve efficiency. In addition, the event evolution graph is divided into different evolution stages with the Louvain algorithm, and news trends are extracted from those stages.
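
To make the people-ranking step in the first contribution more concrete, the following is a minimal sketch of ranking people by applying TOPSIS to centralities of a co-occurrence network. It is an illustration under assumptions, not the thesis's method: the abstract does not specify the improved TOPSIS variant or which centralities are used, so this sketch uses textbook TOPSIS with equal weights and two assumed criteria (degree centrality and weighted degree). The function names `cooccurrence_network`, `centralities`, and `topsis`, and the sample names, are hypothetical.

```python
import math
from collections import Counter
from itertools import combinations


def cooccurrence_network(docs):
    """Count how often each pair of people appears in the same article.

    docs: list of lists of person names (one list per article).
    """
    edges = Counter()
    for people in docs:
        for a, b in combinations(sorted(set(people)), 2):
            edges[(a, b)] += 1
    return edges


def centralities(edges):
    """Return {person: (degree centrality, weighted degree)}.

    Both criteria are assumptions chosen for illustration.
    """
    neighbours, strength = {}, Counter()
    for (a, b), weight in edges.items():
        neighbours.setdefault(a, set()).add(b)
        neighbours.setdefault(b, set()).add(a)
        strength[a] += weight
        strength[b] += weight
    n = len(neighbours)
    return {p: (len(nb) / (n - 1), float(strength[p]))
            for p, nb in neighbours.items()}


def topsis(scores):
    """Textbook TOPSIS with equal weights; every criterion is a benefit."""
    if not scores:
        return {}
    names = list(scores)
    k = len(scores[names[0]])
    # Vector-normalise each criterion column.
    norms = [math.sqrt(sum(scores[p][j] ** 2 for p in names)) or 1.0
             for j in range(k)]
    v = {p: [scores[p][j] / norms[j] for j in range(k)] for p in names}
    best = [max(v[p][j] for p in names) for j in range(k)]
    worst = [min(v[p][j] for p in names) for j in range(k)]
    result = {}
    for p in names:
        d_best = math.dist(v[p], best)    # Euclidean distance to the ideal
        d_worst = math.dist(v[p], worst)  # Euclidean distance to the anti-ideal
        total = d_best + d_worst
        result[p] = d_worst / total if total else 0.0
    return result


# Hypothetical input: people recognized in three articles.
docs = [["Alice", "Bob"], ["Alice", "Carol"], ["Alice", "Bob", "Dave"]]
importance = topsis(centralities(cooccurrence_network(docs)))
ranking = sorted(importance, key=importance.get, reverse=True)
```

A person who dominates on every criterion gets a closeness score of 1, and one who is worst on every criterion gets 0; the rest fall in between according to their Euclidean distances to the two reference points.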
Keywords/Search Tags:text modelling, event evolution, co-occurrence network, random walk model, parallel computing
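
As a rough illustration of the event evolution idea in the third contribution, the sketch below links an earlier event to a later one when the cosine similarity of their feature words reaches a threshold. It is only a simplified sketch under assumptions: the abstract does not say how event time, the random walk model, and cosine similarity are combined, so the random walk component is not reproduced here, and the names `cosine` and `evolution_edges`, the threshold value, and the sample events are all hypothetical.

```python
import math
from collections import Counter


def cosine(a, b):
    """Cosine similarity between two bag-of-words Counters."""
    dot = sum(a[w] * b[w] for w in a.keys() & b.keys())
    norm_a = math.sqrt(sum(c * c for c in a.values()))
    norm_b = math.sqrt(sum(c * c for c in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def evolution_edges(events, threshold=0.3):
    """Build directed edges (earlier_id, later_id, similarity).

    events: list of (event_id, timestamp, feature_words).
    The threshold is an assumed value, not taken from the thesis.
    """
    ordered = sorted(events, key=lambda e: e[1])
    bags = {eid: Counter(words) for eid, _, words in ordered}
    edges = []
    for i, (src, t_src, _) in enumerate(ordered):
        for dst, t_dst, _ in ordered[i + 1:]:
            if t_dst <= t_src:
                continue  # same timestamp: no temporal order to exploit
            sim = cosine(bags[src], bags[dst])
            if sim >= threshold:
                edges.append((src, dst, sim))
    return edges


# Hypothetical events with integer timestamps and feature words.
events = [
    ("e1", 1, ["quake", "rescue", "city"]),
    ("e2", 2, ["rescue", "aid", "city"]),
    ("e3", 3, ["election", "vote"]),
]
edges = evolution_edges(events)
```

The resulting edge list can be read as a small event evolution graph; a community detection step such as the Louvain algorithm mentioned in the abstract would then operate on a graph of this kind.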