| The support vector machine (SVM) is a common machine learning algorithm that is widely used for regression problems in data processing. This dissertation analyzes several problems of the support vector regression (SVR) machine and proposes a new data preprocessing method. In the era of big data, massive data sets improve model accuracy, but they also bring excessive computation and memory requirements, which limits the application of SVM. At the same time, as data volume grows and accuracy requirements rise, SVM models also suffer from poor generalization ability and a degree of fitting that is difficult to control. It is therefore very important to preprocess the data first. In practice, the Density-Based Spatial Clustering of Applications with Noise (DBSCAN) algorithm is commonly used to preprocess data for SVM. Based on DBSCAN and convolution, this dissertation proposes a new preprocessing algorithm that constructs a new sample set with fewer samples and a lower feature dimension. The new sample set not only retains the information of the original data but also improves the distribution of the samples. Processing data with the new algorithm can therefore reduce the memory required by the SVR algorithm and enhance the generalization performance of the regression function. This dissertation describes the rationality and feasibility of the new preprocessing algorithm in detail and compares it with other commonly used SVM preprocessing algorithms. |