A Sample Selection For Reduced LSSVM Based On Complex Networks And Its Application

Posted on: 2014-01-08
Degree: Master
Type: Thesis
Country: China
Candidate: Q Li
GTID: 2230330398450215
Subject: Detection Technology and Automation
Abstract/Summary:
As technology develops, more and more data are collected and stored in databases, and machine learning is used to extract the useful information hidden in them. When the training set becomes very large, however, the time and space complexity of training becomes very high. The reduced SVM addresses this by randomly selecting a subset of the training set, which yields fewer support vectors and a less complex model. Although computation becomes markedly faster, the approach has drawbacks in many practical problems: because the samples are selected at random, some useful information in the training set is inevitably lost, and the generalization performance and accuracy of the model decrease.

This paper proposes a sample selection method based on complex networks for the reduced least squares support vector machine (LSSVM) for regression. The method first computes the distances between the data samples and from them obtains the adjacency matrix and sample set that represent a complex network. Community detection is then carried out by maximizing modularity to obtain the sample communities corresponding to the practical problem. Since outliers appear in real problems, communities that represent outliers are removed according to a set of rules. Next, the combination degree of each sample node is calculated, and the samples with larger combination degree are selected so as to retain as much useful information as possible. Finally, a reduced LSSVM model is built from the selected samples.

To illustrate its effectiveness, the proposed method is compared with several other sample selection methods. Simulations on modeling a Blast Furnace Gas system indicate that the proposed sample selection overcomes the randomness of ordinary clustering methods, removes outliers and redundant samples, and improves the typicality of the training set. The resulting LSSVM model has high accuracy, low computational complexity, and better generalization.
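
The pipeline described in the abstract can be illustrated with a minimal NumPy sketch. It is not the thesis's implementation, and every concrete choice in it is an assumption, since the abstract does not give the details: samples are linked when their Euclidean distance is below a hypothetical threshold `eps`; community detection is approximated by a single leading-eigenvector bisection of the modularity matrix (a simple modularity-maximization heuristic); communities smaller than a hypothetical `min_size` are treated as outliers; plain node degree stands in for the "combination degree"; and the LSSVM is fitted only on the selected samples by solving the standard LSSVM linear system with a Gaussian kernel. All function and parameter names are illustrative.

```python
import numpy as np

def rbf_kernel(X, Z, sigma=1.0):
    # Gaussian kernel matrix between the rows of X and the rows of Z.
    d2 = ((X[:, None, :] - Z[None, :, :]) ** 2).sum(axis=2)
    return np.exp(-d2 / (2.0 * sigma ** 2))

def build_adjacency(X, eps):
    # Assumed linking rule: connect samples closer than eps (no self-loops).
    d = np.sqrt(((X[:, None, :] - X[None, :, :]) ** 2).sum(axis=2))
    A = (d < eps).astype(float)
    np.fill_diagonal(A, 0.0)
    return A

def modularity_bisection(A):
    # Leading-eigenvector heuristic: split nodes by the sign of the top
    # eigenvector of the modularity matrix B = A - k k^T / (2m).
    k = A.sum(axis=1)
    two_m = k.sum()
    if two_m == 0:
        return np.zeros(len(A), dtype=int)
    B = A - np.outer(k, k) / two_m
    _, vecs = np.linalg.eigh(B)  # eigenvalues come back in ascending order
    return (vecs[:, -1] > 0).astype(int)

def select_samples(X, eps, min_size, keep_ratio):
    A = build_adjacency(X, eps)
    labels = modularity_bisection(A)
    degree = A.sum(axis=1)  # stand-in for the "combination degree"
    chosen = []
    for c in np.unique(labels):
        members = np.flatnonzero(labels == c)
        if len(members) < min_size:
            continue  # assumed outlier rule: drop very small communities
        n_keep = max(1, int(round(keep_ratio * len(members))))
        order = members[np.argsort(-degree[members], kind="stable")]
        chosen.extend(order[:n_keep])
    return np.sort(np.array(chosen, dtype=int))

def lssvm_fit(X, y, gamma=10.0, sigma=1.0):
    # Solve [[0, 1^T], [1, K + I/gamma]] [b; alpha] = [0; y].
    n = len(X)
    M = np.zeros((n + 1, n + 1))
    M[0, 1:] = 1.0
    M[1:, 0] = 1.0
    M[1:, 1:] = rbf_kernel(X, X, sigma) + np.eye(n) / gamma
    sol = np.linalg.solve(M, np.concatenate(([0.0], y)))
    return sol[0], sol[1:]  # bias b, coefficients alpha

def lssvm_predict(X_train, b, alpha, X_new, sigma=1.0):
    return rbf_kernel(X_new, X_train, sigma) @ alpha + b
```

A typical call would be `sel = select_samples(X, eps, min_size, keep_ratio)` followed by `b, alpha = lssvm_fit(X[sel], y[sel])`. A single bisection yields at most two communities; a fuller treatment would split recursively or use a dedicated community-detection routine, and the thresholds would need tuning for real data.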
Keywords/Search Tags: Sample Selection, Least Squares Support Vector Machine, Complex Networks, Community Detection, Clustering