Analysis Of CPPCC Proposal And Related Public Opinion Based On Machine Learning

Posted on: 2020-06-13
Degree: Master
Type: Thesis
Country: China
Candidate: Y J Liu
GTID: 2416330575494856
Subject: Electronic and communication engineering
Abstract/Summary:
The National Committee of the Chinese People’s Political Consultative Conference (CPPCC) is one of the most important mechanisms in China’s political system. Every year, members of the CPPCC National Committee submit proposals. In 2018, the Beijing People’s Political Consultative Conference website published 798 proposals, and CPPCC members across the country submitted many more. By using technical methods to discover hot topics in these proposals and to analyze the related public opinion statistically, we can explore trends in that public opinion. This work can provide a technical information reference for CPPCC members. At present, we are not aware of research on hot topic discovery for proposals or on public opinion statistics for those hot topics. We design a system for analyzing CPPCC proposals and related public opinion to provide information technology support for CPPCC members. The main work of this thesis includes the following aspects:

(1) We partition the CPPCC proposals into topics and extract keywords. We implement a web crawler to fetch proposal data from the CPPCC proposal website. We vectorize each proposal according to its structural characteristics and use the K-means clustering algorithm to group the proposals into categories, where each category represents a topic. We design two keyword extraction algorithms to extract keywords from each topic; the resulting keywords are referred to as "long words" and "short words" respectively. We design a comparison experiment to analyze the validity of the two sets of keywords. The results show that "long words" describe the hot topics of proposals more effectively than "short words".

(2) We design and train a sentiment classification model to predict labels for all unlabeled data. We develop a crawler to fetch Weibo post data for each "long word". We design a sentiment classification model based on a bidirectional LSTM to predict labels for all unlabeled data. This model achieves an accuracy of 90.45% on the test set, which is much higher than the accuracy of a sentiment classification model based on traditional machine learning algorithms on the same test set.

(3) We compute statistics on and visualize the public opinion related to the CPPCC proposals. On the basis of the above work, we analyze the collected Weibo post data and compute statistics on the public opinion related to each topic, including the trend of attention over time, the attention level, the trend of sentiment over time, and the sentiment orientation.
Keywords/Search Tags:Topic discovery, Keyword extraction, Web crawler, Sentiment classification, Public opinion analysis
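
The four public opinion measures named in step (3) of the abstract can be illustrated with a minimal Python sketch. This is not the thesis's implementation. The function name `opinion_stats`, the sample topics, the dates, and the 0/1 label encoding are all hypothetical. The definitions are also assumptions: attention is taken to be the number of posts, and sentiment orientation the share of posts labeled positive.

```python
from collections import defaultdict
from datetime import date

# Hypothetical input: (topic keyword, post date, predicted sentiment label),
# with label 1 = positive and 0 = negative, as a classifier might emit.
posts = [
    ("air pollution control", date(2018, 3, 1), 1),
    ("air pollution control", date(2018, 3, 1), 0),
    ("air pollution control", date(2018, 3, 2), 0),
    ("elderly care services", date(2018, 3, 1), 1),
]


def opinion_stats(posts):
    """Aggregate per-topic daily post counts and positive-post ratios."""
    # topic -> day -> [total posts, positive posts]
    daily = defaultdict(lambda: defaultdict(lambda: [0, 0]))
    for topic, day, label in posts:
        cell = daily[topic][day]
        cell[0] += 1
        cell[1] += label

    stats = {}
    for topic, days in daily.items():
        ordered = sorted(days.items())  # chronological order
        total = sum(t for _, (t, _) in ordered)
        positive = sum(p for _, (_, p) in ordered)
        stats[topic] = {
            # Assumed definition: attention = number of posts.
            "attention_level": total,
            "attention_trend": [(d, t) for d, (t, _) in ordered],
            # Assumed definition: sentiment = share of positive posts.
            "sentiment_trend": [(d, p / t) for d, (t, p) in ordered],
            "sentiment_orientation": positive / total,
        }
    return stats


stats = opinion_stats(posts)
```

Each trend is a chronological list of (date, value) pairs, so it can be passed directly to a plotting library for the visualization step.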