
Research On Network Event Text Clustering Of Food Safety Based On Storm

Posted on: 2020-07-03
Degree: Master
Type: Thesis
Country: China
Candidate: F Z Xu
Full Text: PDF
GTID: 2381330602461440
Subject: Computer technology
Abstract/Summary:
In recent years, serious food-related incidents have occurred frequently, and food safety has become a widespread problem that attracts public attention. At the same time, with the popularity of the Internet and the development of big data, food safety incidents spread rapidly online and provoke heated discussion as soon as they appear. Under these circumstances, traditional systems for monitoring online public opinion on food safety can no longer meet current needs, and improving their accuracy and efficiency is the main research direction. This thesis covers three topics: an improved text clustering algorithm based on single-pass, topic extraction on the Storm big data processing framework, and multi-dimensional display of the extracted food safety network event topics. The main work is as follows:

1. The traditional single-pass clustering algorithm requires its threshold to be set manually. A hill climbing algorithm is used to optimize the threshold and obtain the optimal value, which avoids the error introduced by manual setting and improves the effectiveness of the clustering algorithm.

2. The traditional single-pass clustering algorithm does not preset the number of clusters, which makes it inefficient. To address this, a word-list classification step is introduced. Classifying both the clusters and each text reduces the number of similarity comparisons during clustering, which greatly reduces computation time and improves the accuracy of the algorithm.

3. To suit the characteristics of streaming big data, the improved single-pass algorithm is deployed on the Storm framework and parallelized. The data inconsistency caused by parallelization is mitigated by repeatedly fetching the cluster increment and adding a random delay, which improves the accuracy of the algorithm.

Finally, the results of event topic extraction are displayed along multiple dimensions, which supports early warning and gives the work practical significance.
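
The first task above, single-pass clustering with a threshold tuned by hill climbing, can be illustrated with a minimal Python sketch. The abstract does not specify the similarity measure, the objective being optimized, or the step schedule, so everything below is an assumption made for illustration: cosine similarity over whitespace-split term counts, a Rand index against a small hand-labelled sample as the score, and a step that halves when neither neighbour improves. The names `cosine`, `single_pass`, `rand_index`, and `hill_climb_threshold` are hypothetical, and the word-list classification and Storm deployment are not covered.

```python
import math
from collections import Counter


def cosine(a, b):
    """Cosine similarity between two sparse term-count vectors."""
    dot = sum(count * b[term] for term, count in a.items() if term in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def single_pass(docs, threshold):
    """Assign each text, in arrival order, to its most similar cluster,
    or open a new cluster when no similarity reaches the threshold."""
    centroids = []  # one summed term-count vector per cluster
    labels = []
    for text in docs:
        vec = Counter(text.lower().split())  # assumed tokenization
        best, best_sim = -1, 0.0
        for i, centroid in enumerate(centroids):
            sim = cosine(vec, centroid)
            if sim > best_sim:
                best, best_sim = i, sim
        if best >= 0 and best_sim >= threshold:
            centroids[best].update(vec)  # fold the text into the cluster
            labels.append(best)
        else:
            centroids.append(vec)
            labels.append(len(centroids) - 1)
    return labels


def rand_index(pred, truth):
    """Fraction of text pairs on which two labelings agree (assumed score)."""
    pairs = [(i, j) for i in range(len(pred)) for j in range(i + 1, len(pred))]
    agree = sum((pred[i] == pred[j]) == (truth[i] == truth[j]) for i, j in pairs)
    return agree / len(pairs)


def hill_climb_threshold(score, start=0.5, step=0.1, min_step=0.01):
    """Move to the better-scoring neighbouring threshold while it improves;
    otherwise halve the step, stopping once the step is below min_step."""
    current, current_score = start, score(start)
    while step >= min_step:
        neighbours = [max(0.0, current - step), min(1.0, current + step)]
        best = max(neighbours, key=score)
        best_score = score(best)
        if best_score > current_score:
            current, current_score = best, best_score
        else:
            step /= 2
    return current
```

To use the sketch, pass `hill_climb_threshold` a function that maps a threshold to a score, for example `lambda t: rand_index(single_pass(sample_docs, t), sample_labels)` for a hypothetical labelled sample. Hill climbing only finds a local optimum, so the result depends on the starting threshold.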
Keywords/Search Tags: food safety, Storm, text clustering, streaming data