Application On Text Classifying With SVM

Posted on: 2007-09-11
Degree: Master
Type: Thesis
Country: China
Candidate: Z G Ye
GTID: 2178360185966984
Subject: Computer application technology
Abstract/Summary:
The volume of information on the network grows rapidly with the development of the Internet. To make information services more efficient and precise, information on the Internet must be organized and categorized sensibly. This thesis focuses on processing text information on the network and studies text categorization at two levels: theory and application.

First, the thesis analyzes the overall model of text categorization, including information preprocessing, feature representation, and feature extraction. The author pays particular attention to the techniques of feature representation and feature extraction and to text categorization algorithms.

Second, the thesis studies Statistical Learning Theory (SLT) and Support Vector Machine (SVM) theory in depth, and discusses training, classification, multi-category classification algorithms, and kernel functions. The author surveys the state of research on and application of Support Vector Machines and points out some important open issues.

Finally, the thesis analyzes a document categorization model based on SVM. The model builds its text feature model by calculating the mutual information between words and categories. An intelligent Chinese word segmentation system based on syntax understanding is then used to obtain the TF-IDF description of the test document in the vector space model (VSM). Word similarity is used to weight the document vector features. Once the training documents have been converted to vectors, the SVM learns from them and the resulting support vectors are used for categorization. Test documents are then converted to feature vectors and categorized. Based on this model, the author discusses the choice of kernel function and the determination of the penalty parameter C through examples of multi-category classification with SVM, and confirms the conclusions experimentally.
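The feature steps named in the abstract (mutual information between words and categories, then a TF-IDF vector in the VSM) can be illustrated with a minimal Python sketch. This is not the thesis's implementation. The toy corpus, the function names, the use of pointwise mutual information maximized over categories, and the smoothed IDF formula are all assumptions made for illustration. Chinese word segmentation, word-similarity weighting, and the SVM training step itself are left out.

```python
import math
from collections import Counter

# Hypothetical, already-segmented toy corpus: (tokens, category label).
TRAIN = [
    (["stock", "market", "report"], "finance"),
    (["bank", "stock", "loan"], "finance"),
    (["match", "team", "report"], "sports"),
    (["team", "coach", "match"], "sports"),
]


def mutual_information(docs, word, label):
    """Pointwise MI of a word and a category: log(P(w, c) / (P(w) * P(c))).

    Probabilities are estimated from document counts (assumption).
    """
    n = len(docs)
    n_w = sum(1 for toks, _ in docs if word in toks)
    n_c = sum(1 for _, lab in docs if lab == label)
    n_wc = sum(1 for toks, lab in docs if word in toks and lab == label)
    if n_wc == 0:
        return float("-inf")  # word never occurs in this category
    return math.log((n_wc * n) / (n_w * n_c))


def select_features(docs, k):
    """Keep the k words whose best per-category MI is highest."""
    labels = sorted({lab for _, lab in docs})
    vocab = sorted({w for toks, _ in docs for w in toks})
    scored = [
        (max(mutual_information(docs, w, c) for c in labels), w) for w in vocab
    ]
    # Highest score first; ties are broken alphabetically.
    scored.sort(key=lambda sw: (-sw[0], sw[1]))
    return [w for _, w in scored[:k]]


def tfidf_vector(tokens, features, docs):
    """L2-normalized TF-IDF vector over the selected features.

    Uses a smoothed IDF, log((N + 1) / (df + 1)) + 1 (assumption).
    """
    n = len(docs)
    tf = Counter(tokens)
    vec = []
    for w in features:
        df = sum(1 for toks, _ in docs if w in toks)
        idf = math.log((n + 1) / (df + 1)) + 1.0
        vec.append(tf[w] * idf)
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec] if norm else vec


features = select_features(TRAIN, 7)
test_vec = tfidf_vector(["stock", "stock", "goal"], features, TRAIN)
```

In this toy corpus "report" occurs equally often in both categories, so its mutual information with each is zero and it is the one word dropped. Vectors built this way would then be passed to an SVM trainer, where the kernel function and the penalty parameter C are the settings the abstract says were studied experimentally.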
Keywords/Search Tags:text categorization, SLT, SVM, multi-category classification