Design and Implementation of an Internet Public Opinion Monitoring and Processing Platform Based on Hadoop

Posted on: 2020-12-25
Degree: Master
Type: Thesis
Country: China
Candidate: C Cui
GTID: 2428330590979215
Subject: Engineering
Abstract/Summary:
With the rapid development of technology, smartphones and networks have become ubiquitous, and communication is no longer limited by distance or time. Massive amounts of information spread to social platforms on the Internet faster and more widely than before, and online public opinion has a growing influence on society. Traditional public opinion monitoring systems can collect information about an enterprise from the Internet and display public opinion about it in a simple form, but their collection and analysis of massive information need improvement. To address this, this thesis develops an Internet public opinion monitoring and processing system based on Hadoop. The system quickly collects public opinion information about an enterprise from the Internet, analyzes it, shows the latest trends and direction of development, and provides data support for handling negative public opinion before and after it spreads, reducing losses to the enterprise.

The main research work in this thesis is as follows. First, based on an analysis of existing Internet monitoring software and related technologies in China and abroad, and according to the actual needs of enterprise public opinion monitoring, the overall framework of the platform is designed. The framework is divided into four functional modules: information collection, information analysis, information display, and system management. Second, the detailed design of the system is carried out around these four modules, using a browser/server (B/S) architecture, Hadoop, a distributed Nutch crawler, the k-means clustering algorithm, and other technologies. Finally, the development environment is built and the platform is deployed, and the system's collection, retrieval, sentiment analysis, and other functions are tested one by one.

In the design of the platform, the information collection module uses a distributed Nutch crawler with an added topic-relevance judgment plug-in to collect massive information on specific topics. The information analysis module uses a k-means clustering algorithm based on high-frequency words to cluster hot topics. This addresses the unstable clustering results and running time caused by random selection of the initial cluster centers, and improves the accuracy and speed of hot topic clustering. Testing shows that the system can collect and analyze online public opinion about the target enterprise in a timely manner, help the enterprise grasp public opinion dynamics in advance, and provide data support for its public opinion handling.
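
The abstract does not say how high-frequency words are used to choose the initial cluster centers, so the following is only a minimal sketch of one possible reading, not the thesis's implementation. It assumes documents are already tokenized, represents each one as a term-count vector, and seeds each cluster with the document that uses one of the most frequent words most heavily. The function names (`vectorize`, `seed_from_frequent_words`, `kmeans`, `cluster_topics`) and the sample data are hypothetical.

```python
from collections import Counter


def vectorize(docs, vocab):
    """Turn each tokenized document into a term-count vector over vocab."""
    return [[doc.count(word) for word in vocab] for doc in docs]


def seed_from_frequent_words(docs, vectors, vocab, k):
    """Assumed seeding rule: walk the words from most to least frequent and,
    for each, take the unused document that contains it most often."""
    freq = Counter(word for doc in docs for word in doc)
    # Sort by descending count, then alphabetically, so ties are deterministic.
    ranked = [w for w, _ in sorted(freq.items(), key=lambda kv: (-kv[1], kv[0]))]
    seeds, used = [], set()
    for word in ranked:
        j = vocab.index(word)
        candidates = [i for i in range(len(docs))
                      if i not in used and vectors[i][j] > 0]
        if candidates:
            best = max(candidates, key=lambda i: vectors[i][j])
            used.add(best)
            seeds.append(list(vectors[best]))
        if len(seeds) == k:
            break
    return seeds


def sq_dist(a, b):
    """Squared Euclidean distance between two vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(vectors, centers, max_iter=20):
    """Standard k-means iterations starting from the given centers."""
    labels = None
    for _ in range(max_iter):
        new_labels = [min(range(len(centers)),
                          key=lambda c: sq_dist(v, centers[c]))
                      for v in vectors]
        if new_labels == labels:
            break  # assignments stopped changing
        labels = new_labels
        for c in range(len(centers)):
            members = [v for v, lab in zip(vectors, labels) if lab == c]
            if members:  # keep the old center if a cluster is empty
                centers[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels


def cluster_topics(docs, k):
    """Cluster tokenized documents into k topics without random seeding."""
    vocab = sorted({word for doc in docs for word in doc})
    vectors = vectorize(docs, vocab)
    centers = seed_from_frequent_words(docs, vectors, vocab, k)
    return kmeans(vectors, centers)


if __name__ == "__main__":
    sample = [
        ["battery", "fire", "recall"],
        ["battery", "recall", "battery"],
        ["price", "discount", "sale"],
        ["price", "sale", "price"],
    ]
    print(cluster_topics(sample, 2))  # prints [0, 0, 1, 1]
```

Because the seeds depend only on word counts, repeated runs on the same input give the same clusters, which is the stability property the abstract attributes to its approach. A production system on Hadoop would distribute the assignment and update steps rather than run them in a single process as here.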
Keywords: Internet public opinion, Hadoop, Nutch, clustering algorithm