| With the rapid development of the Internet and the spread of e-commerce, a large volume of online information appears every day, and more and more users want the Internet to provide personalized services. For example, businesses want timely user comments and suggestions about their products so that they can improve product performance and after-sales service; consumers want guidance for their purchasing decisions from online comments and reviews; and governments want to learn public attitudes so that they can adjust their decisions accordingly. These pressing application demands have drawn researchers' attention to opinion mining and quickly made it a hot topic in information processing. This thesis focuses on opinion mining of Web automobile reviews: working at the word, sentence, and text levels, it studies sentiment orientation discrimination and evaluation object extraction, and explores new ideas and techniques for opinion mining. The major contributions of this thesis are as follows. The thesis presents two methods that use Probabilistic Latent Semantic Analysis (PLSA) to determine a word's sentiment orientation: (1) PLSA is used to obtain a similarity matrix between each target word and the basic words, and each target word's polarity is then determined by a vote; (2) PLSA is used to semantically cluster and expand the target words so that synonyms of each target word can be found, and a synonym-based polarity identification method then determines the target word's sentiment orientation. Neither method depends on external resources, and both alleviate the data sparseness problem to some extent. For evaluation object recognition, candidate evaluation objects are extracted first. 
Word templates are combined with part-of-speech templates, and the candidates are preprocessed before being scored, in order to improve the recall and precision of candidate evaluation object extraction. Second, starting from a small set of seed templates and seed evaluation objects, evaluation objects are extracted with a bootstrapping learning method; the extracted evaluation objects are then clustered with the K-means method to identify product names and product attributes. Using the theories and methods presented in this thesis together with other existing techniques, an opinion mining system for Web automobile reviews is constructed. A web spider regularly updates the background knowledge base, opinions are mined from Web automobile reviews at the text, sentence, and collocation levels, and the system finally gives an overall evaluation and some detailed evaluations of the specific automobile brand queried by the user. |
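
The voting step of the first PLSA-based method can be illustrated with a minimal sketch. It assumes the similarity matrix between target words and basic words has already been computed (for example, from PLSA topic distributions); the function name `vote_polarity`, the `top_k` parameter, and the similarity-weighted form of the vote are illustrative assumptions, not details given in the thesis.

```python
import numpy as np

def vote_polarity(similarity, basic_polarity, top_k=3):
    """Label each target word by a similarity-weighted vote of basic words.

    similarity:     matrix of shape (n_targets, n_basic); assumed to be
                    precomputed, e.g. from PLSA topic distributions.
    basic_polarity: +1 / -1 label for each basic word.
    top_k:          number of most similar basic words that vote
                    (hypothetical parameter, chosen for illustration).
    Returns a list with 1 (positive), -1 (negative) or 0 (tie) per target.
    """
    similarity = np.asarray(similarity, dtype=float)
    basic_polarity = np.asarray(basic_polarity, dtype=float)
    labels = []
    for row in similarity:
        # Indices of the top_k basic words most similar to this target word.
        nearest = np.argsort(row)[::-1][:top_k]
        # Each voter contributes its polarity weighted by its similarity.
        score = float(np.sum(row[nearest] * basic_polarity[nearest]))
        labels.append(1 if score > 0 else -1 if score < 0 else 0)
    return labels

# Hypothetical basic words: two positive, two negative.
basic_polarity = [1, 1, -1, -1]
# Hypothetical similarities for two target words.
similarity = [
    [0.8, 0.7, 0.1, 0.2],  # close to the positive basic words
    [0.1, 0.2, 0.9, 0.6],  # close to the negative basic words
]
print(vote_polarity(similarity, basic_polarity))
```

An unweighted majority vote would be an equally plausible reading of "vote"; the weighted form is used here only because it breaks fewer ties.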