Research And Implementation Of College Public Opinion Analysis System Based On Distributed Platform

Posted on: 2018-12-07
Degree: Master
Type: Thesis
Country: China
Candidate: C Kong
Full Text: PDF
GTID: 2348330512489814
Subject: Software engineering
Abstract/Summary:
With the rapid development of information technology and network infrastructure, network information technology has been applied in every field of society, and the Internet has become a huge public information platform. Publishing comments and opinions through social media such as Weibo, WeChat, websites, and BBS has become increasingly common among university students and teachers. However, some of these comments and opinions are negative, and the spread of negative information can bring great instability to society. A university network public opinion monitoring system can effectively monitor the comments and opinions of teachers and students and prevent the spread of false information, which is of real significance for maintaining the stability of society as a whole.

At present, most public opinion monitoring systems use a relational database as the data storage platform or rely on stand-alone resources for data processing. Such systems are not practical when faced with massive data storage demands and cannot achieve the desired results. Although some public opinion analysis systems do adopt a distributed architecture, they cannot effectively manage the monitoring of cluster status.

In this thesis, an emotion dictionary for the college network environment is constructed. Emotion intensity is then introduced into the TF-IDF algorithm to enhance its ability to recognize negative text. Finally, a distributed college public opinion monitoring system is implemented by combining big data processing technology with calculation rules for emotional tendency. The system takes into account not only mass data storage and computing performance but also the effective management of distributed clusters, so it can meet the needs of such applications. The system has proved to be highly available, highly reliable, and flexible in data storage, and swift and accurate in data analysis.

The system consists mainly of an import and export module, a storage module, a data preprocessing module, a data analysis module, and a web application demonstration module. The import and export module is developed with Apache Sqoop and transfers result data between HDFS and MySQL. The storage module uses HDFS, MongoDB, and MySQL as its base data stores; it reads from MongoDB in parallel through MapReduce and writes data to the HDFS file system in parallel. The data preprocessing module uses Jieba, stop-word filtering, the improved TF-IDF, and the MapReduce parallel computing framework to perform word segmentation and word vector generation for Chinese text documents. By combining the custom sentiment dictionary, the calculation rules, and a traditional clustering algorithm, the data analysis module performs topic detection, sensitive topic discovery, and sentiment orientation analysis.

Finally, functional and performance tests are conducted on the system. The results show that the system can store, process, and analyze information retrieved from university websites. The system can provide technical support for public opinion supervision organizations.
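The abstract does not give the formula used to introduce emotion intensity into TF-IDF. As a minimal sketch of one plausible reading, the Python snippet below multiplies each term's TF-IDF weight by `1 + intensity`, so that terms listed in an emotion dictionary receive a higher weight. Everything named here is an assumption for illustration: the function `weighted_tfidf`, the dictionary `EMOTION_INTENSITY` and its values, the stop-word list, the smoothed IDF variant, and the sample documents. The documents are assumed to be already segmented into tokens (the thesis uses Jieba for Chinese text; English tokens are used here only to keep the sketch self-contained).

```python
import math
from collections import Counter

# Hypothetical emotion dictionary: term -> intensity in [0, 1].
# The thesis builds its own college-specific dictionary; these entries are invented.
EMOTION_INTENSITY = {"terrible": 0.9, "angry": 0.8, "unfair": 0.6}

# Hypothetical stop-word list.
STOP_WORDS = {"the", "is", "a", "and", "of"}


def weighted_tfidf(docs, intensity=EMOTION_INTENSITY, stop_words=STOP_WORDS):
    """Return one {term: weight} dict per tokenized document.

    weight = tf * idf * (1 + intensity). This weighting rule is an
    assumption; the abstract does not state the exact formula.
    """
    # Drop stop words before counting.
    cleaned = [[t for t in doc if t not in stop_words] for doc in docs]
    n_docs = len(cleaned)
    # Document frequency: number of documents that contain each term.
    df = Counter(term for doc in cleaned for term in set(doc))

    result = []
    for doc in cleaned:
        counts = Counter(doc)
        total = len(doc) or 1  # avoid division by zero for empty documents
        weights = {}
        for term, count in counts.items():
            tf = count / total
            idf = math.log((1 + n_docs) / (1 + df[term])) + 1  # smoothed IDF
            weights[term] = tf * idf * (1 + intensity.get(term, 0.0))
        result.append(weights)
    return result


# Toy, pre-segmented documents (hypothetical data).
docs = [
    ["the", "canteen", "food", "is", "terrible"],
    ["the", "library", "is", "quiet"],
    ["canteen", "price", "unfair", "and", "food", "terrible"],
]
vectors = weighted_tfidf(docs)
```

In the first document, "terrible" and "canteen" have the same term frequency and document frequency, so the only difference in their weights is the intensity factor. In a distributed setting, the per-document loop would correspond to the map step and the document-frequency count to a reduce step, but that mapping is likewise an assumption rather than the thesis's stated design.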
Keywords/Search Tags: distributed platform, word segmentation, pre-process, public opinion supervision