Research On Topic Extraction In Online Public Opinion Based On Multi-label Classification

Posted on:2019-06-09

Degree:Master

Type:Thesis

Country:China

Candidate:X Yang

GTID:2427330626451950

Subject:Business Administration

Abstract/Summary:

With the development of the Internet and the spread of smart devices, the influence of online public opinion is growing, and enterprises and government agencies are paying increasing attention to its application and management. The first task in applying and managing online public opinion is to extract key information from public opinion data, which is known as topic extraction. Current topic extraction methods are mainly based on probabilistic topic models, which extract text topics from the probability distributions between topics and terms and between terms and texts. However, probabilistic topic models do not fully consider the semantic relevance between terms and topics in the text. This thesis uses machine learning to extract topics from online public opinion and formulates topic extraction as a multi-label classification problem over text topics (text categories).

For measuring the similarity of text data, this thesis proposes a method for calculating text semantic similarity based on Baidu Baike annotation information. First, the text is preprocessed by word segmentation and other steps. Then an improved TF-IDF method is applied to calculate the weights of the words in the Baidu Baike entries corresponding to the terms; each entry is transformed into a weight vector of words, and the similarity between entries is calculated by cosine similarity. Finally, text similarity is calculated from a similarity matrix built from the similarity values between terms. Experimental results on the Words-240 data set show that the text semantic similarity based on Baidu Baike annotation information is highly correlated with the results of manual tagging.

For multi-label classification of text data, this thesis designs a multi-label classification method for the Kernel Extreme Learning Machine based on label relationships. The method learns the positive and negative relationships among labels from their co-occurrence and non-co-occurrence distributions. The label relationships are then used to optimize the classification predictions of the Kernel Extreme Learning Machine. To verify the validity of the method, experiments are carried out on several real-world data sets: Zhihu, Yeast, Image, Scene, Emotions, and Cal500. The experimental results show that the label-relationship-based Kernel Extreme Learning Machine multi-label classification algorithm outperforms the comparison methods in accuracy, precision, recall, and F1 score.
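
The similarity pipeline described in the abstract (entry weight vectors, cosine similarity between entries, then a term-by-term similarity matrix) can be illustrated with a minimal Python sketch. Several parts are assumptions rather than the thesis's actual method: the entry texts in ENTRIES are hypothetical stand-ins for segmented Baidu Baike entries, plain smoothed TF-IDF replaces the unspecified "improved" TF-IDF, and the rule for reducing the similarity matrix to one score (the mean of the best matches in both directions) is one plausible choice, since the abstract does not state the rule used.

```python
import math
from collections import Counter

# Hypothetical stand-ins for Baidu Baike entry texts, already segmented into words.
ENTRIES = {
    "apple": ["fruit", "tree", "sweet", "red"],
    "pear": ["fruit", "tree", "sweet", "green"],
    "car": ["vehicle", "engine", "road", "wheel"],
    "truck": ["vehicle", "engine", "road", "cargo"],
}


def tfidf_vectors(entries):
    """Build one sparse TF-IDF weight vector per entry.

    Assumption: plain smoothed TF-IDF; the thesis uses an improved variant
    whose details are not given in the abstract.
    """
    n = len(entries)
    # Document frequency: in how many entries each word appears.
    df = Counter(w for words in entries.values() for w in set(words))
    vectors = {}
    for term, words in entries.items():
        tf = Counter(words)
        vectors[term] = {
            w: (count / len(words)) * (math.log((1 + n) / (1 + df[w])) + 1.0)
            for w, count in tf.items()
        }
    return vectors


def cosine(u, v):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(weight * v.get(w, 0.0) for w, weight in u.items())
    norm_u = math.sqrt(sum(x * x for x in u.values()))
    norm_v = math.sqrt(sum(x * x for x in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def text_similarity(terms_a, terms_b, vectors):
    """Reduce the term-by-term similarity matrix to a single score.

    Assumed rule: average the best match of every term in both directions.
    Terms without an entry get an empty vector and therefore similarity 0.
    """
    matrix = [
        [cosine(vectors.get(a, {}), vectors.get(b, {})) for b in terms_b]
        for a in terms_a
    ]
    if not matrix or not matrix[0]:
        return 0.0
    row_best = [max(row) for row in matrix]
    col_best = [max(col) for col in zip(*matrix)]
    return (sum(row_best) + sum(col_best)) / (len(row_best) + len(col_best))


VECTORS = tfidf_vectors(ENTRIES)
```

With these toy entries, text_similarity(["apple", "car"], ["pear", "truck"], VECTORS) scores higher than text_similarity(["apple"], ["truck"], VECTORS), because the first pair of texts contains terms whose entries share words while the second does not.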

Keywords/Search Tags:

Online public opinion, Topic extraction, Semantic analysis, Multi-label classification, Kernel Extreme Learning Machine

Related items

1	Research And Application Of Label Learning Based On Mixture Kernel Extreme Learning Machine
2	A Multi-label Learning Algorithm Combining Regression Kernel Extreme Learning Machine With Association Rules
3	Multi-domain Data Classification Based On Multi-instance Multi-label Learning
4	Multi-label Learning With Non-equilibrium Labels Completions And Its Application
5	Missing Multi-label Learning Of Imbalanced With Label Reconstruction
6	Quantifying The Evolutionary Trends Of Online Public Opinion Via Incorporating Machine Learning Strategies
7	Research Of SVM Kernel Functions In Text Classification
8	Research On Network Public Opinion Classification Algorithm Based On Machine Learning
9	Research On Online Public Opinion Based On Topic Modeling And Sentiment Analysis
10	Multi-label Public Sentiment Analysis Of Hot Topics In Social Networks