
Design And Implementation Of Hadoop Based Public Opinion Analysis System

Posted on: 2017-11-03
Degree: Master
Type: Thesis
Country: China
Candidate: S Y Xie
Full Text: PDF
GTID: 2416330590968410
Subject: Software engineering
Abstract/Summary:
With economic and social development and the spread of the Internet, more and more users obtain information and express their views and opinions online. Network information is presented in increasingly diverse forms, for example e-mail, portals, BBS forums, blogs, online communities, instant messaging, and social networking services (SNS). Public emotions, attitudes, and opinions together form network public opinion. Network public opinion on hot issues can have a widespread social impact. Without reasonable guidance, negative public opinion poses a serious threat to social order and security. Improving the ability to monitor the network and to resolve negative public opinion has therefore become a key issue for police departments. Traditional public opinion monitoring software already supports this work. However, because of the vast amount of information on the Internet, traditional monitoring systems can no longer monitor public opinion accurately and quickly enough to be real-time. Timeliness is an important indicator for public opinion monitoring. To achieve it, we design and implement a police big data analysis system based on Hadoop. Its distributed processing of massive data can help police departments monitor public opinion in real time and maintain social stability. Our work starts from the work requirements of police departments and then analyzes in detail the operating mode of police information work. We use social network analysis techniques to mine Internet data. The main research work of this paper is as follows:

1. Distributed network data crawler. We describe in detail how the crawler is built, the functions of its modules, and how they are implemented. The crawler system uses multiple gateway exits, which effectively counters websites' blocking of crawlers and improves efficiency. This solves the system's data source problem.

2. Hadoop distributed file system for public opinion data. As the storage structure of the police public opinion system, the file system stores the collected data. The collected data undergoes information extraction and redundancy removal, is indexed with Lucene and Solr, and is stored in HBase.

3. MapReduce programming model. We use Mahout to mine the massive data, including text clustering analysis, heat analysis, evaluation, hot spot detection, and display of public opinion trends.
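
To make the redundancy-removal and MapReduce steps described above more concrete, the following is a minimal single-process sketch in Python. It is not the thesis's implementation, which runs on Hadoop with Mahout. The function names (dedupe, map_phase, shuffle, reduce_phase, hot_terms), the exact-duplicate hashing rule, and the use of "number of posts mentioning a term" as a stand-in for heat are all assumptions made for illustration.

```python
import hashlib
from collections import defaultdict


def dedupe(posts):
    """Drop exact duplicates after lowercasing and collapsing whitespace.

    Assumption: redundancy removal is approximated by a content hash.
    """
    seen = set()
    unique = []
    for text in posts:
        normalized = " ".join(text.lower().split())
        digest = hashlib.sha1(normalized.encode("utf-8")).hexdigest()
        if digest not in seen:
            seen.add(digest)
            unique.append(text)
    return unique


def map_phase(post):
    """Mapper: emit (term, 1) once for each distinct term in a post."""
    for term in set(post.lower().split()):
        yield term, 1


def shuffle(pairs):
    """Group mapper output by key, as the framework would between phases."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(term, counts):
    """Reducer: sum the per-post counts for one term."""
    return term, sum(counts)


def hot_terms(posts, top_n=3):
    """Return the top_n terms ranked by how many unique posts mention them."""
    unique = dedupe(posts)
    pairs = (pair for post in unique for pair in map_phase(post))
    totals = [reduce_phase(term, counts) for term, counts in shuffle(pairs).items()]
    # Sort by descending count, then alphabetically for a stable ranking.
    return sorted(totals, key=lambda kv: (-kv[1], kv[0]))[:top_n]
```

Calling hot_terms on a list of post strings returns (term, count) pairs. In a real Hadoop job the mapper and reducer would run on separate nodes and the shuffle would be performed by the framework rather than by a local dictionary.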
Keywords/Search Tags: Public Opinion Analysis System, Hadoop, MapReduce, Data Mining