| In the Internet era, the value of massive data keeps growing, but so does the difficulty of removing irrelevant and redundant features from it. As an effective data processing technique, feature selection plays an increasingly important role. The choice of a feature selection algorithm generally depends on the following factors: classifier performance, whether irrelevant features are removed, whether redundant features are removed, and the size of the data set. A single algorithm, however, usually serves only one of these factors. For example, the wrapper model can significantly improve the performance of the learning algorithm, while the Relief family of algorithms can remove irrelevant features. To take as many of these factors into account as possible, the advantages of different selection algorithms can be combined to improve overall performance. To address this problem, this paper proposes a fused feature selection algorithm and improves the specific algorithms involved. A new two-stage feature selection method, MIC-MHS, is proposed by combining the maximum information coefficient (MIC) with an improved harmony search algorithm. In the first stage, the method uses the maximum information coefficient to remove irrelevant features and assigns each feature a selection probability based on the information obtained in this stage. Because the initial reduced subset may still contain redundant features, the feature subset must be searched further. In the second stage, the harmony search population is initialized according to the feature selection probabilities, which provides the harmony search algorithm with prior knowledge. The feature selection probability is adjusted dynamically with the number of iterations. The
new harmony is generated according to the feature selection probability, and feature correlation and feature subset dimension are used as the objective function of the harmony search. Experiments on UCI datasets show that the algorithm obtains small feature subsets while maintaining high classification accuracy. To make MIC-MHS more widely usable, this paper also implements a feature selection system built on the Spring Boot framework, with MyBatis as the data persistence layer and Bootstrap and the Thymeleaf template engine for the front-end pages. Through this system, users can set parameters, run various feature selection algorithms, and compare the results. The innovations of this paper are as follows: (1) a new method for initializing the harmony search population is proposed. In the first stage, the maximum information coefficient measures the correlation between features and categories; it is used not only to remove irrelevant features but also to provide prior knowledge for the second-stage harmony search; (2) in the second-stage harmony search, when new individuals are generated, the feature selection probability is fine-tuned during the fine-tuning step in order to search for potential feature subsets; (3) in the design of the fitness function of the harmony search algorithm, not only feature correlation and feature subset dimension but also the redundancy between features is considered. |
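
The two-stage idea described above can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the paper's implementation: absolute Pearson correlation stands in for the maximum information coefficient, and the threshold, the normalisation of the selection probabilities, the fitness weights `alpha`, `beta`, `gamma`, and all function names are hypothetical choices made for the example. The harmony search loop itself (generating new harmonies, fine-tuning, and the dynamic probability update) is omitted.

```python
import random


def relevance(x, y):
    """Absolute Pearson correlation, used here only as a stand-in for MIC."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0  # a constant column carries no information
    return abs(cov) / (var_x * var_y) ** 0.5


def stage_one(columns, labels, threshold=0.1):
    """Stage 1: drop low-relevance features, derive selection probabilities.

    The threshold and the max-normalisation are assumptions of this sketch.
    """
    scores = [relevance(col, labels) for col in columns]
    kept = [j for j, s in enumerate(scores) if s >= threshold]
    if not kept:
        return [], {}
    top = max(scores[j] for j in kept)
    if top == 0:
        return kept, {j: 0.5 for j in kept}  # no signal: fall back to a coin flip
    return kept, {j: scores[j] / top for j in kept}


def init_harmony_memory(kept, probs, size, rng):
    """Stage 2 initialisation: sample 0/1 masks over the kept features."""
    return [[1 if rng.random() < probs[j] else 0 for j in kept]
            for _ in range(size)]


def fitness(mask, kept, columns, labels, alpha=1.0, beta=0.5, gamma=0.1):
    """Reward relevance, penalise redundancy and subset size (assumed weights)."""
    chosen = [j for j, bit in zip(kept, mask) if bit]
    if not chosen:
        return float("-inf")  # an empty subset is never acceptable
    rel = sum(relevance(columns[j], labels) for j in chosen) / len(chosen)
    pairs = [(a, b) for i, a in enumerate(chosen) for b in chosen[i + 1:]]
    red = 0.0
    if pairs:
        red = sum(relevance(columns[a], columns[b]) for a, b in pairs) / len(pairs)
    size_penalty = len(chosen) / len(kept)
    return alpha * rel - beta * red - gamma * size_penalty
```

In this sketch, `columns` is a list of feature columns and `labels` a numeric class vector. A full search would start from `init_harmony_memory(kept, probs, size, random.Random(seed))` and repeatedly replace the worst harmony whenever a newly generated mask scores higher under `fitness`.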