With the rapid development of information technology, data volumes are growing geometrically, and filtering the needed information out of massive data conveniently and efficiently has become an urgent problem. To handle large-scale data better, many Granular Support Vector Machine (GSVM) algorithms have been proposed; they overcome the low learning efficiency of the traditional support vector machine (SVM) while still achieving satisfactory generalization performance. However, these methods have shortcomings. They usually apply only a simple granular analysis to large-scale data, so the results are easily disturbed by noise and outliers and data carrying important information is ignored, which limits performance. In addition, the problem of parameter optimization has not yet been introduced into the GSVM field. To address these issues, this paper takes GSVM as the basic model and combines it with the Robust Projected Fuzzy K-Means (RPFKM) clustering method. First, the concept of spatial similarity is put forward at the level of dynamic granularity division, which refines the selection of important data. Then, an improved grey wolf optimization algorithm is introduced to solve the parameter optimization problem of GSVM. Extensive classification experiments on multiple data sets verify the effectiveness and generality of the proposed method. The main research results of this paper cover the following two aspects:

(1) For dynamic granularity division, a dynamic granular support vector machine method based on spatial similarity is proposed: SDGSVM (spatial dynamic granular support vector machine). First, the data set is divided into multiple clusters by the RPFKM clustering method, and each cluster is treated as a hyperparticle. Then, the spatial similarity of each resulting hyperparticle is calculated, and the hyperparticles containing more information near
the hyperplane are dynamically decomposed into multiple hyperparticles through continued iteration. At the same time, low-density hyperparticles far from the hyperplane are deleted, so that the number of hyperparticles stays at a roughly constant scale throughout. This method fully considers the influence of complex data distributions on generalization ability and improves the maximum-margin classification surface. The experimental results show that the algorithm effectively improves the learning efficiency of SVM while maintaining the accuracy of model training.

(2) To optimize the parameters of the granular support vector machine, an improved grey wolf optimization algorithm (IGWO) is proposed based on the Grey Wolf Optimizer (GWO). Combining it with the granular support vector machine model yields IGWO-GSVM (improved grey wolf optimization granular support vector machine). By combining the Tent map with Lévy flight and improving the convergence factor, IGWO converges better and more easily balances global search against local optimization, so the optimal GSVM model parameters are found faster and learning efficiency improves. Experimental results show that the algorithm quickly finds the best parameters and improves classification accuracy.
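The dynamic granulation step in aspect (1) can be illustrated with a minimal Python sketch. It is not the SDGSVM procedure itself: the abstract does not define spatial similarity, so the distance from a hyperparticle's center to the hyperplane and a simple points-per-radius density are used as stand-ins, and the initial hyperparticles are assumed to come from a prior clustering step instead of RPFKM. All function names and the thresholds `near`, `far`, and `min_density` are hypothetical.

```python
import math


def center(points):
    """Coordinate-wise mean of a hyperparticle's points."""
    dim = len(points[0])
    return [sum(p[d] for p in points) / len(points) for d in range(dim)]


def distance_to_hyperplane(point, w, b):
    """Euclidean distance from a point to the hyperplane w.x + b = 0."""
    norm = math.sqrt(sum(wi * wi for wi in w))
    return abs(sum(wi * xi for wi, xi in zip(w, point)) + b) / norm


def density(points):
    """Points per unit mean radius (a stand-in measure, not the paper's)."""
    c = center(points)
    mean_radius = sum(math.dist(p, c) for p in points) / len(points)
    return len(points) / (mean_radius + 1e-12)


def split(points):
    """Halve a hyperparticle at the median of its widest coordinate."""
    dim = len(points[0])
    widest = max(
        range(dim),
        key=lambda d: max(p[d] for p in points) - min(p[d] for p in points),
    )
    ordered = sorted(points, key=lambda p: p[widest])
    mid = len(ordered) // 2
    return ordered[:mid], ordered[mid:]


def regranulate(granules, w, b, near, far, min_density):
    """One dynamic step: split near the hyperplane, drop sparse far ones."""
    updated = []
    for points in granules:
        gap = distance_to_hyperplane(center(points), w, b)
        if gap <= near and len(points) >= 2:
            updated.extend(split(points))  # refine where the boundary is decided
        elif gap >= far and density(points) < min_density:
            continue  # far from the boundary and sparse: discard
        else:
            updated.append(points)
    return updated
```

Repeating `regranulate` after each retraining of the classifier would refine the hyperparticles close to the current hyperplane while the deletions offset the splits, which is how the total count can stay at a roughly constant scale.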
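The three IGWO ingredients named in aspect (2) can likewise be sketched in Python. The abstract gives no formulas, so every concrete choice here is an assumption: a skew tent map with `mu = 0.7` for initialization, Mantegna's algorithm with `beta = 1.5` for the Lévy step, a quadratic decay for the convergence factor, and a small Lévy perturbation around the alpha wolf followed by greedy selection. The function names are hypothetical, and a generic objective stands in for the GSVM validation error that the parameters would be tuned against.

```python
import math
import random


def tent_population(n_wolves, dim, lower, upper, x0=0.37, mu=0.7):
    """Initial wolves from a skew tent map sequence scaled to [lower, upper]."""
    x, wolves = x0, []
    for _ in range(n_wolves):
        wolf = []
        for _ in range(dim):
            # Skew tent map on [0, 1]; mu = 0.7 is an illustrative choice.
            x = x / mu if x < mu else (1.0 - x) / (1.0 - mu)
            wolf.append(lower + x * (upper - lower))
        wolves.append(wolf)
    return wolves


def levy_step(rng, beta=1.5):
    """One heavy-tailed step drawn with Mantegna's algorithm."""
    num = math.gamma(1 + beta) * math.sin(math.pi * beta / 2)
    den = math.gamma((1 + beta) / 2) * beta * 2 ** ((beta - 1) / 2)
    sigma_u = (num / den) ** (1 / beta)
    return rng.gauss(0.0, sigma_u) / abs(rng.gauss(0.0, 1.0)) ** (1 / beta)


def convergence_factor(t, max_iter):
    """Assumed nonlinear decay from 2 to 0 (standard GWO decays linearly)."""
    return 2.0 * (1.0 - (t / max_iter) ** 2)


def igwo_minimize(objective, dim, lower, upper, n_wolves=20, max_iter=100, seed=0):
    """Minimize `objective` over a box with a GWO variant using the pieces above."""
    rng = random.Random(seed)
    wolves = tent_population(n_wolves, dim, lower, upper)
    for t in range(max_iter):
        a = convergence_factor(t, max_iter)
        # Alpha, beta and delta: the three best wolves of this iteration.
        leaders = [w[:] for w in sorted(wolves, key=objective)[:3]]
        for i, wolf in enumerate(wolves):
            candidate = []
            for d in range(dim):
                pulls = []
                for leader in leaders:
                    A = a * (2.0 * rng.random() - 1.0)
                    C = 2.0 * rng.random()
                    pulls.append(leader[d] - A * abs(C * leader[d] - wolf[d]))
                x = sum(pulls) / 3.0
                # Levy perturbation around the alpha wolf (assumed placement and scale).
                x += 0.01 * levy_step(rng) * (x - leaders[0][d])
                candidate.append(min(upper, max(lower, x)))
            # Greedy selection keeps each wolf from getting worse.
            if objective(candidate) < objective(wolf):
                wolves[i] = candidate
    return min(wolves, key=objective)
```

For GSVM parameter tuning, `objective` would map a candidate parameter vector (for example a penalty and a kernel width) to a cross-validated error. The quadratic decay keeps the factor above the linear schedule for most of the run, which favors global search early and local refinement late.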