
Design And Implementation Of Catalogue-Analysis System Of User Reviews Based On Text Mining

Posted on: 2016-07-20
Degree: Master
Type: Thesis
Country: China
Candidate: Y S Li
GTID: 2298330467991887
Subject: Computer Science and Technology
Abstract/Summary:
User comments have increased notably since the emergence of Web 2.0, ushering in the so-called "Comments Age". People routinely consult others' comments when reading news or books, watching movies, listening to music, shopping, or doing almost anything else, and they in turn generate new comments afterwards. Consequently, a new field of Internet research has arisen to make use of the massive volume of user comment data; it relies heavily on text mining technology, since user comments are unstructured data. One popular application that has emerged from this data is to extract, classify and display the topics of comments.

The principal contributions of the work presented in this thesis are:
(1) An analysis of how user reviews and comments are currently used, a summary of the data features of book reviews, and a proposed design and implementation of the Catalogue-Analysis System.
(2) A summary of the differences and similarities between book reviews and goods reviews, and the design and implementation of a Comment Spam Filtering Module with filtering rules based on keywords.
(3) A new-word extraction method that extracts candidate words from the corpus, calculates their support and confidence, filters them against a threshold, and finally builds a dedicated corpus dictionary for book reviews.
(4) The design and implementation of a topic-determination method based on frequency statistics and the discrete entropy of term collocations.
(5) An application of the system to real users' book reviews to examine its effectiveness.
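The abstract does not define how support, confidence, or collocation entropy are computed, so the following is only a minimal sketch under stated assumptions, not the thesis's method. It assumes reviews are already segmented into tokens, borrows support and confidence from association-rule mining (support is the count of an adjacent pair, confidence is that count divided by the count of the first token), and takes "discrete entropy" to mean the Shannon entropy of the terms that directly follow a given term. The function names, thresholds, and sample data are hypothetical.

```python
from collections import Counter
from math import log2


def candidate_new_words(docs, min_support=2, min_confidence=0.6):
    """Return adjacent token pairs that pass both thresholds.

    Assumed definitions (not taken from the thesis):
      support(a, b)    = number of times a is immediately followed by b
      confidence(a, b) = support(a, b) / count(a)
    """
    unigrams = Counter()
    bigrams = Counter()
    for tokens in docs:
        unigrams.update(tokens)
        bigrams.update(zip(tokens, tokens[1:]))

    result = {}
    for (a, b), support in bigrams.items():
        confidence = support / unigrams[a]
        if support >= min_support and confidence >= min_confidence:
            result[(a, b)] = (support, confidence)
    return result


def collocation_entropy(docs, term):
    """Shannon entropy (in bits) of the terms that directly follow `term`."""
    followers = Counter()
    for tokens in docs:
        for a, b in zip(tokens, tokens[1:]):
            if a == term:
                followers[b] += 1
    total = sum(followers.values())
    if total == 0:
        return 0.0
    return -sum((n / total) * log2(n / total) for n in followers.values())


# Hypothetical, pre-tokenized sample reviews.
docs = [
    ["plot", "twist", "was", "great"],
    ["plot", "twist", "felt", "forced"],
    ["plot", "was", "slow"],
]

new_words = candidate_new_words(docs)
twist_entropy = collocation_entropy(docs, "twist")
```

On this sample, only the pair ("plot", "twist") passes both thresholds. Read this way, a low follower entropy suggests a term tends to appear in fixed collocations, while a high one suggests it combines freely; how the thesis actually uses entropy to determine topics is not stated in the abstract.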
Keywords/Search Tags: book reviews, text mining, comments, spam filtering, new words discovery, topic extraction