Event Classification And Tracking Topic Trends Based On Sequential Short Text

Posted on: 2019-03-18
Degree: Master
Type: Thesis
Country: China
Candidate: L Y He
Full Text: PDF
GTID: 2428330548484512
Subject: Software engineering
Abstract/Summary:
The rapid development of the mobile Internet has led to the rise of social networks. Real-world events propagate swiftly through social networks such as WeChat, Twitter, and Sina Microblog. Because users post spontaneously and platforms impose word limits, most texts on social platforms are short. However, many sentences occur in sequences, and their words are not independent but interact with each other. How to make full use of these massive sequential short texts to analyse event types and extract event topic information in depth has become a meaningful task. Some traditional machine learning models suffer a large performance degradation on sequential short texts, because short texts carry limited word co-occurrence information and these models cannot handle the relationships between sequential short texts. This thesis therefore focuses on event classification and the tracking of topic trends in sequential short texts on social networks. The major contributions are as follows:

1. This thesis designs a novel model for sequential short text classification based on a pooling attention network. The model is built on recurrent neural networks (RNNs) and handles sequential short texts of variable length, which helps overcome the data sparsity of short texts. Next, a gating mechanism, the Gated Recurrent Unit (GRU), is applied to the RNN to avoid the training stagnation caused by vanishing gradients. Finally, a novel mechanism called the Pooling Attention Network (PAN) is applied to the output of the GRU. PAN generalises the pooling layer of convolutional neural networks (CNNs) and combines it with an attention mechanism, which has proven effective in machine translation. It attends selectively to parts of the source sentence during training, so that the feature encoding is exploited appropriately when the sequence length is limited.

2. For tracking topic trends in events, this thesis proposes an effective model called the TTD (Topic Trend Detection) model. The model addresses data sparsity in sequential short texts mainly through an aggregation mechanism and by restricting the topic distribution. It then incorporates a weight pattern into the topic model to select a small set of words whose co-occurrence frequency exceeds a given minimum support threshold as the topic itemsets of the current document. Compared with term-based topic representations, topic itemsets capture correlated words that carry more concrete and identifiable meaning. From the topic itemsets, related topic words are extracted using pre-trained word representations. Finally, the topic itemsets and related topic words of an event from different time periods are integrated to obtain the evolution of topic trends.

Comprehensive experiments on a public dataset and a real-world dataset validate the effectiveness and practicality of the PAN and TTD models. Further experiments analyse how to optimise the performance of the neural network using existing resources.
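The idea of selecting topic itemsets by a minimum support threshold can be illustrated with a small sketch. This is not the TTD model: it omits the topic model, the weight pattern, and the aggregation step, and simply counts in how many short texts each word set co-occurs. The function name `topic_itemsets`, the `max_size` cap, the use of "at least" for the threshold comparison, and the toy data are all assumptions made for illustration.

```python
from collections import Counter
from itertools import combinations


def topic_itemsets(docs, min_support, max_size=2):
    """Return word sets contained in at least `min_support` documents.

    docs: list of token lists, one per short text.
    min_support: minimum number of documents that must contain the whole set.
    max_size: largest itemset size to enumerate (hypothetical cap that keeps
        this brute-force enumeration small).
    """
    counts = Counter()
    for tokens in docs:
        vocab = sorted(set(tokens))  # count each word once per document
        for size in range(1, max_size + 1):
            for itemset in combinations(vocab, size):
                counts[itemset] += 1
    return {items: n for items, n in counts.items() if n >= min_support}


# Toy, made-up short texts standing in for one time window of an event.
window = [
    ["flood", "rescue", "city"],
    ["flood", "rescue", "team"],
    ["flood", "warning"],
]
itemsets = topic_itemsets(window, min_support=2)
```

Running the same function on the texts of successive time windows and comparing the resulting itemsets gives a rough picture of how the dominant word groups of an event shift over time.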
Keywords/Search Tags: Sequential Short Text, Event Classification, Topic Extraction, Pooling Attention