
Food Safety Network Public Opinion Analysis Research And Monitoring System Based On Focused Crawler

Posted on: 2017-08-03
Degree: Master
Type: Thesis
Country: China
Candidate: Q Q Wu
GTID: 2311330491460884
Subject: Computer technology
Abstract/Summary:
In recent years, food safety problems have become very common and pose a serious threat to public health. This thesis studies and implements methods for promptly detecting potential hot events in online news, especially food safety news with a negative impact, so that government departments can keep abreast of event trends and public opinion.

The main contents of the thesis can be divided into three parts: the improvement and implementation of a focused crawler, public opinion analysis and topic extraction, and a food safety public opinion analysis system.

In the first part, building on the basic principles and key technologies of web crawlers, the thesis proposes an improved focused crawler. It replaces the traditional method of web page content extraction with a combined method based on HTML code analysis and text density, which is reported to greatly improve the accuracy of text extraction. It also improves the vector space model (VSM) used for text similarity calculation, proposing a new VSM similarity calculation method with multiple reference factors. In addition, it optimizes the initial seed module and the dynamic threshold module of the focused crawler, and improves URL sorting, file storage and multi-threading. With these optimizations, the thesis implements a focused crawler for a particular topic, and contrast experiments show that its efficiency and accuracy are improved.

In the public opinion analysis and topic extraction part, the thesis selects the single-pass method after comparing the advantages and disadvantages of several common clustering algorithms. It proposes a multi-layer single-pass clustering algorithm that incorporates a time reference element, and it also addresses how the cluster center vector is determined. Contrast experiments show that the improved method is more efficient in clustering and topic extraction.

Finally, the thesis implements a food safety public opinion analysis system that can detect recent food safety hot events.
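The single-pass clustering step named in the abstract can be illustrated with a minimal sketch. This is the basic textbook form of single-pass clustering over term-frequency vectors with cosine similarity, not the thesis's multi-layer, time-aware variant or its multi-factor VSM similarity. The function names `cosine` and `single_pass`, the threshold value of 0.3, and the sample headlines are all assumptions made for illustration.

```python
import math
from collections import Counter


def cosine(a, b):
    """Cosine similarity between two sparse term-frequency vectors."""
    dot = sum(weight * b[term] for term, weight in a.items() if term in b)
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def single_pass(docs, threshold=0.3):
    """Assign each document, in arrival order, to its most similar cluster,
    or open a new cluster when no similarity reaches the threshold.

    The threshold of 0.3 is an illustrative assumption, not a value
    taken from the thesis.
    """
    centers = []  # summed term counts per cluster (a simple center vector)
    members = []  # document indices per cluster
    for idx, text in enumerate(docs):
        vec = Counter(text.lower().split())
        best, best_sim = None, 0.0
        for c, center in enumerate(centers):
            sim = cosine(vec, center)
            if sim > best_sim:
                best, best_sim = c, sim
        if best is not None and best_sim >= threshold:
            centers[best].update(vec)  # fold the document into the center
            members[best].append(idx)
        else:
            centers.append(Counter(vec))
            members.append([idx])
    return members


# Invented sample headlines, used only to exercise the sketch.
docs = [
    "milk powder contamination recall",
    "contamination found in milk powder",
    "restaurant hygiene inspection fails",
    "hygiene inspection closes restaurant",
]
clusters = single_pass(docs)
print(clusters)
```

Because each document is seen once and compared only against the current cluster centers, the result depends on arrival order and on the threshold, which is presumably why the thesis pays attention to the time element and to how the center vector is determined.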
Keywords/Search Tags:focused crawler, text extraction, similarity calculation, topic extraction, single-pass clustering, public opinion analysis