
Research On Some Problems Of Data Processing Under Small Sample

Posted on: 2009-05-29
Degree: Master
Type: Thesis
Country: China
Candidate: W Liang
GTID: 2120360245481379
Subject: Applied Mathematics
Abstract/Summary:
Traditional statistical analysis is based on large samples, and estimation methods such as time series forecasting and artificial neural networks have theoretical guarantees only in the large-sample setting. In most practical situations, however, the number of samples is limited, sometimes very small, so many methods struggle to give satisfactory results. In addition, many time series forecasting methods do not account for nonlinear factors, and artificial neural networks easily become trapped in local optima. These shortcomings limit the practical use of these methods, and small-sample forecasting has therefore remained a difficult problem in statistics. Grey prediction theory is better suited to small-sample estimation than other methods, but when the accuracy grade of a grey model is unqualified, namely P ≤ 0.7 and C ≥ 0.65 (P is the small-error probability and C the posterior variance ratio), the grey forecasting model cannot be used.

The support vector machine (SVM) was introduced to solve nonlinear function estimation problems. It is based on the structural risk minimization principle rather than on minimizing the empirical error. In this approach the training problem is reformulated as a (convex) quadratic programming (QP) problem. The solution to this QP problem is global and unique, and the method handles small-sample, nonlinear, high-dimensional problems well. The least squares support vector machine (LS-SVM) is a modified version of SVM for regression: it replaces the inequality constraints of SVM with equality constraints and takes the sum of squared errors as the empirical loss. The problem thereby reduces to solving a system of linear equations. The method is well suited to small-sample, complex, nonlinear problems and does not depend on the dimension of the samples.

The main contributions of this paper are as follows:
(1) Small-sample forecasting has long been a difficult problem in statistics. In this paper we compare these methods by forecasting peak load and the incidence of infectious diseases, and identify the better forecasting method for each.
(2) We apply the LS-SVM method to forecasting the incidence of infectious diseases under small samples for the first time. Comparison with the grey forecasting method shows that LS-SVM is effective and advanced for forecasting the incidence of infectious diseases.
(3) We propose a method that uses the particle swarm optimization algorithm to optimize the parameters of the grey forecasting model and the optimal input subset. Simulation and computation show that it improves forecasting accuracy.
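
The way LS-SVM regression reduces to a linear system, as the abstract describes, can be illustrated with a minimal sketch. It is not the thesis's implementation: the RBF kernel, the function names (`rbf_kernel`, `lssvm_fit`, `lssvm_predict`), the hyperparameter values `gamma` and `sigma`, and the synthetic eight-point data set are all assumptions made for illustration. The sketch uses the standard LS-SVM formulation, in which the bias b and the coefficients alpha solve the bordered system [[0, 1^T], [1, K + I/gamma]] [b; alpha] = [0; y].

```python
import numpy as np


def rbf_kernel(a, b, sigma):
    """Gaussian (RBF) kernel matrix between the rows of a and the rows of b."""
    sq_dist = (
        np.sum(a**2, axis=1)[:, None]
        + np.sum(b**2, axis=1)[None, :]
        - 2.0 * a @ b.T
    )
    # Clip tiny negative values caused by floating-point rounding.
    return np.exp(-np.maximum(sq_dist, 0.0) / (2.0 * sigma**2))


def lssvm_fit(x, y, gamma=100.0, sigma=0.3):
    """Solve the LS-SVM linear system; returns the bias b and coefficients alpha.

    gamma (regularization) and sigma (kernel width) are illustrative
    defaults, not values taken from the thesis.
    """
    n = x.shape[0]
    system = np.zeros((n + 1, n + 1))
    system[0, 1:] = 1.0  # equality constraint: sum(alpha) = 0
    system[1:, 0] = 1.0  # bias column
    system[1:, 1:] = rbf_kernel(x, x, sigma) + np.eye(n) / gamma
    rhs = np.concatenate(([0.0], y))
    solution = np.linalg.solve(system, rhs)
    return solution[0], solution[1:]


def lssvm_predict(x_train, b, alpha, x_new, sigma=0.3):
    """Evaluate f(x) = sum_i alpha_i * K(x, x_i) + b at the rows of x_new."""
    return rbf_kernel(x_new, x_train, sigma) @ alpha + b


# Synthetic small sample (8 points), used only to exercise the code.
x_train = np.linspace(0.0, 1.0, 8).reshape(-1, 1)
y_train = np.sin(2.0 * np.pi * x_train).ravel()

b, alpha = lssvm_fit(x_train, y_train)
x_new = np.array([[0.25], [0.5]])
print(lssvm_predict(x_train, b, alpha, x_new))
```

Because the constraints are equalities, training needs only one call to a linear solver instead of a QP solver. The price is that every training point generally receives a nonzero coefficient, so the sparsity of a standard SVM solution is lost.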
Keywords/Search Tags: Small sample, Data processing, Bootstrap, Least squares support vector machine, Particle swarm optimization