| With the advent of the big data era, new applications continue to emerge, and data increasingly exhibit high-dimensional, nonlinear, and complex properties. Effective feature extraction and feature selection have therefore become necessary, and this work has become a research hotspot in machine learning, data mining, and pattern recognition. Traditional data processing methods are often ineffective on such high-dimensional data and cannot mine the useful information hidden in it, so extracting that information has become one of the difficult problems of the big data era. Feature selection is the basic work in big data mining, and the design of the classifier is the most difficult part of the whole model training process. Feature selection uses a search strategy to search the feature space for the feature subset most conducive to model training; classification then uses that subset to train a robust classifier, which is used to make accurate predictions on unseen data. Machine learning offers many classification models. This paper selects two dominant ones, the naive Bayes model and the support vector machine, for analysis and research. Naive Bayes assumes that the attributes are mutually independent given the class. Because of this independence between attributes, the parameters of each attribute can be estimated separately, which makes the model suitable for multi-attribute classification problems. For the support vector machine, the penalty parameter C and the RBF kernel parameter σ are the key parameters that affect classification performance, so optimizing these two parameters can effectively improve the classifier. Both tasks, feature selection and classifier parameter optimization, are addressed here mainly with an intelligent optimization algorithm. In this paper, the 
particle swarm optimization algorithm is first improved, and the improved algorithm is then used to study the optimization of the naive Bayes and support vector machine classifiers. To optimize the naive Bayes classifier, the improved particle swarm algorithm selects an optimal attribute subset from the entire attribute space, and the naive Bayes classifier is constructed from the selected subset. To optimize the support vector machine, the penalty parameter C and the kernel parameter σ are encoded into each individual's binary code, and the improved particle swarm algorithm searches for the optimal parameter combination during the experiments, so that a better-optimized support vector machine classifier is trained. The specific contributions of this paper are: 1. To address the inherent shortcomings of traditional particle swarm optimization, such as premature convergence and the dependence of parameter settings on expert experience, the traditional method is improved by combining two strategies, multi-swarm optimization and adaptive control of the acceleration coefficients, and a Multi-Colony Particle Swarm Optimization algorithm with Dynamically Controlled Inertia Weights (MPSO_DCIW) is proposed. 2. The naive Bayes classifier is often poorly suited to practical applications because of the naive Bayes independence assumption. A swarm intelligence algorithm is used to select the optimal feature subset of the data, and the selected subset is then used to construct a naive Bayes classifier, which effectively mitigates the limitations of that assumption. This paper proposes a naive Bayes classifier optimization scheme based on the improved particle swarm optimization and compares its performance with a variety of commonly used machine learning algorithms (SVM, KNN, etc.). 3. Since the penalty parameter C of the 
support vector machine and the RBF kernel parameter σ directly affect classification performance, this paper proposes a parameter optimization scheme for the support vector machine, based on the improved particle swarm optimization, to address this sensitivity. |
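
The binary encoding of the penalty parameter C and the kernel parameter σ mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not give the bit width, the search ranges, or the scaling, so `BITS_PER_PARAM`, `C_RANGE`, `SIGMA_RANGE`, and the log-scaled mapping below are assumptions chosen only for illustration, and the function names are hypothetical. The RBF kernel is written in the common form exp(-||x - y||² / (2σ²)), which may differ from the parameterization used in the paper.

```python
import math

# Assumed search ranges and bit width (illustrative, not taken from the paper).
C_RANGE = (0.01, 1000.0)
SIGMA_RANGE = (0.01, 100.0)
BITS_PER_PARAM = 10


def bits_to_unit(bits):
    """Map a list of 0/1 bits to a fraction in [0, 1]."""
    value = 0
    for b in bits:
        value = (value << 1) | b
    return value / (2 ** len(bits) - 1)


def log_scale(frac, lo, hi):
    """Interpolate between lo and hi on a logarithmic scale."""
    return math.exp(math.log(lo) + frac * (math.log(hi) - math.log(lo)))


def decode_particle(bits):
    """Split one binary particle into the pair (C, sigma)."""
    if len(bits) != 2 * BITS_PER_PARAM:
        raise ValueError("expected %d bits" % (2 * BITS_PER_PARAM))
    c_frac = bits_to_unit(bits[:BITS_PER_PARAM])
    sigma_frac = bits_to_unit(bits[BITS_PER_PARAM:])
    return log_scale(c_frac, *C_RANGE), log_scale(sigma_frac, *SIGMA_RANGE)


def rbf_kernel(x, y, sigma):
    """RBF kernel value for two equal-length vectors and a given sigma."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-sq_dist / (2.0 * sigma ** 2))
```

In a particle swarm search, each particle's position would be one such bit list, and its fitness would be the validation accuracy of a support vector machine trained with the decoded (C, σ). The logarithmic scale is a design choice made here because both parameters typically span several orders of magnitude.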