
Application of SVM-Based Text Mining in a Logistics Company

Posted on: 2013-08-17
Degree: Master
Type: Thesis
Country: China
Candidate: R Yong
GTID: 2249330395984602
Subject: Management information system
Abstract/Summary:
With the rapid development of computer technology and systems engineering, and driven by the global tide of informatization, data mining has become one of the most active areas of research. Finding valuable knowledge models in data is of great significance to the field. The support vector machine (SVM) is a relatively new machine learning method that handles nonlinear and high-dimensional problems better than other approaches. SVMs have been applied in many fields, such as handwritten digit recognition, object recognition, and text categorization. However, the approach still has unsolved problems: text classification involves large numbers of samples, noise, and uneven sample counts across categories, which make SVM training and classification slow.
To address the difficulty logistics companies have in controlling how intensively their vehicles are used and in carrying out maintenance on time, this thesis applies text mining to driving records to monitor how intensively each transport task uses the vehicle, and uses information technology to manage vehicle maintenance in detail. Mining the driving records uses Chinese word processing technology and the TF×IDF algorithm to calculate the weight of each feature item; the reports are then represented as vectors in the vector space model. To reduce the dimensionality of the feature items, the thesis experimentally compares four dimension-reduction algorithms to find which are best suited to the characteristics of driving reports, and selects information gain (IG) and mutual information (MI). Because a standard support vector machine is a binary classifier, the thesis studies a tree-structured multi-class SVM classifier to handle the multi-class problem.
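
The TF×IDF weighting step mentioned above can be illustrated with a minimal sketch. It is not the thesis's implementation: it assumes the driving records have already been segmented into tokens, uses the common formula weight = (term count / document length) × log(N / document frequency), and the function name `tf_idf` and the sample tokens are hypothetical.

```python
import math
from collections import Counter

def tf_idf(docs):
    """docs: list of token lists (assumed already word-segmented).
    Returns one {term: weight} dict per document."""
    n_docs = len(docs)
    # Document frequency: number of documents that contain each term.
    df = Counter()
    for tokens in docs:
        df.update(set(tokens))
    weights = []
    for tokens in docs:
        tf = Counter(tokens)
        total = len(tokens)
        weights.append({
            term: (count / total) * math.log(n_docs / df[term])
            for term, count in tf.items()
        })
    return weights

# Hypothetical, pre-segmented driving records (illustrative data only).
docs = [
    ["brake", "noise", "highway"],
    ["highway", "load", "heavy"],
    ["brake", "worn", "brake"],
]
vectors = tf_idf(docs)
```

Each resulting dictionary is a sparse vector in the vector space model: terms that are frequent in one record but rare across the collection receive the largest weights.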
To make the classifier more efficient, a clustering algorithm is used to construct a balanced binary tree, which speeds up classification. In the system design, the modules are loosely coupled so that the output of each stage of the text mining process is easy to monitor, the system is more flexible, and each module is easy to maintain and modify.
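
The abstract does not say which clustering algorithm builds the balanced binary tree, so the following is only a sketch of one possible approach, under stated assumptions: each class is summarized by a centroid vector, the two classes farthest apart act as seeds, and the remaining classes are split into two near-equal halves by relative distance to the seeds. The names `build_tree` and `depth`, the class labels, and the centroid values are all hypothetical. In a full classifier each internal node would hold a binary SVM separating its left group from its right group; that training step is omitted here.

```python
import math

def dist(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def build_tree(labels, centroids):
    """Recursively split class labels (strings) into two near-equal groups.
    Returns a label for a leaf, or a (left, right) tuple for an internal node."""
    if len(labels) == 1:
        return labels[0]
    # Seeds: the pair of classes whose centroids are farthest apart.
    seed_a, seed_b = max(
        ((a, b) for i, a in enumerate(labels) for b in labels[i + 1:]),
        key=lambda pair: dist(centroids[pair[0]], centroids[pair[1]]),
    )
    # Rank classes from "closest to seed_a" to "closest to seed_b".
    ranked = sorted(
        labels,
        key=lambda c: dist(centroids[c], centroids[seed_a])
        - dist(centroids[c], centroids[seed_b]),
    )
    half = (len(ranked) + 1) // 2  # equal halves keep the tree balanced
    return (build_tree(ranked[:half], centroids),
            build_tree(ranked[half:], centroids))

def depth(node):
    """Number of binary decisions on the longest root-to-leaf path."""
    if not isinstance(node, tuple):
        return 0
    return 1 + max(depth(node[0]), depth(node[1]))

# Hypothetical usage-intensity classes with made-up 2-D centroids.
centroids = {
    "light": (0.0, 0.0),
    "moderate": (1.0, 0.0),
    "heavy": (5.0, 0.0),
    "severe": (6.0, 0.0),
}
tree = build_tree(list(centroids), centroids)
```

With k classes, a balanced tree needs at most ⌈log2 k⌉ binary decisions per sample, whereas a fully skewed tree can need up to k − 1, which is the source of the classification speed-up.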
Keywords/Search Tags: Support Vector Machine, Text Classification, Balanced Binary Tree, Low-Coupling