
Kernel Methods for Fast Forecasting

Posted on: 2009-03-13
Degree: Doctor
Type: Dissertation
Country: China
Candidate: W W He
Full Text: PDF
GTID: 1100360245483081
Subject: Probability theory and mathematical statistics
Abstract/Summary:
With the development of the economy, science, and technology, the tasks of statistics are becoming more complex and the data sets involved are growing in size. To cope with these challenges, this dissertation develops several fast kernel machines and applies them successfully to time series forecasting. Compared with existing methods, the resulting methods achieve good performance at low computational cost. For each machine, we give a clear theoretical analysis and carry out a series of experiments to evaluate its performance. Improving the performance and efficiency of learning is the central topic of this dissertation.

In practice, information is often distributed unevenly over a data set. With this in mind, Chapter 3 develops an optimized SVR, MO-SVR, which performs parameter modification, parameter optimization, and feature selection in a unified framework. All parameters involved are optimized with a genetic algorithm (GA), and feature selection based on the optimized multiple kernel is then carried out to remove redundant information. Experimental results confirm the feasibility of the approach and show that it is a promising alternative for time series forecasting.

Reducing the size of the training set is a direct way to improve the efficiency of learning, so the next two chapters discuss two ways of doing so. First, Chapter 4 considers local learning. We present a general form for local kernel machines and derive a theoretical bound; based on leave-one-out errors or bounds, a pattern search method is used for model selection. Extensive experiments on a real-world electricity load forecasting task demonstrate that our methods obtain improved generalization performance at reduced computational cost. Chapter 5 considers another way to reduce the size of the learning problem, namely imposing sparsity on kernel regression machines. Unlike existing methods, our method DS combines the approximation and learning steps and simplifies the learning problem directly in the primal space. Its main advantage is the ability to form very good approximations of kernel regression machines while keeping clear control over the computational complexity and the training time. Two algorithms, CF and CG, are developed to realize this idea. Experiments on two real time series and the benchmark Sunspot data set confirm the feasibility of the method.

As another popular approach, kernel-based online algorithms, which formulate learning as stochastic gradient descent in a reproducing kernel Hilbert space (RKHS), are presented in the last chapter. Existing algorithms are extended, and the resulting algorithms admit more patterns of learning-rate decay. Building on this, we develop a new method, LSMD, which adapts learning rates automatically by applying the stochastic meta-descent (SMD) algorithm within a limited range so as to keep the algorithm stable. The ILK algorithm, which adopts an implicit update, is also discussed; the results show that combining SMD with ILK is another promising approach to online learning, possessing both inherent stability and adaptivity. We demonstrate theoretically the superiority of implicit over explicit updates for online learning and derive convergence results for the algorithms involved in this chapter.
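To make the local learning idea of Chapter 4 concrete, the following is a minimal sketch, assuming kernel ridge regression as the local kernel machine and Euclidean nearest neighbours (the abstract fixes neither); all identifiers and defaults are illustrative, not the dissertation's actual code. Instead of one global machine on all n training points, a small machine is fit on the k neighbours of each query, so each prediction solves a k x k system rather than an n x n one:

    import numpy as np

    def rbf_kernel(A, B, gamma=1.0):
        # Gaussian kernel matrix between the rows of A and the rows of B
        d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(axis=-1)
        return np.exp(-gamma * d2)

    def local_kernel_predict(X_train, y_train, x_query, k=50, gamma=1.0, lam=1e-3):
        # pick the k training points closest to the query
        idx = np.argsort(((X_train - x_query) ** 2).sum(axis=1))[:k]
        Xk, yk = X_train[idx], y_train[idx]
        # solve a small (k x k) regularized kernel system instead of an (n x n) one
        K = rbf_kernel(Xk, Xk, gamma)
        alpha = np.linalg.solve(K + lam * np.eye(k), yk)
        return (rbf_kernel(x_query[None, :], Xk, gamma) @ alpha)[0]

    # toy usage: one-step-ahead forecasting with lagged values as features
    rng = np.random.default_rng(0)
    s = np.sin(0.1 * np.arange(500)) + 0.1 * rng.standard_normal(500)
    X = np.stack([s[i:i + 5] for i in range(495)])
    y = s[5:]
    print(local_kernel_predict(X[:-1], y[:-1], X[-1]))

Model selection over (k, gamma, lam) via leave-one-out errors or bounds, as in the chapter, would sit on top of this per-query machine.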
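The abstract only names Chapter 5's DS method and its CF and CG algorithms, so the sketch below is a generic stand-in for the primal idea it describes, assuming randomly chosen basis centres and reading CG as conjugate gradients: the kernel expansion is restricted to m << n centres and the resulting regularized least-squares problem is solved directly in the primal space. Every identifier and design choice here is an assumption for illustration:

    import numpy as np

    def rbf_kernel(A, B, gamma=1.0):
        d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(axis=-1)
        return np.exp(-gamma * d2)

    def conjugate_gradient(A, b, tol=1e-8, max_iter=500):
        # textbook CG for a symmetric positive-definite system A x = b
        x = np.zeros_like(b)
        r = b - A @ x
        p = r.copy()
        rs = r @ r
        for _ in range(max_iter):
            Ap = A @ p
            step = rs / (p @ Ap)
            x += step * p
            r -= step * Ap
            rs_new = r @ r
            if np.sqrt(rs_new) < tol:
                break
            p = r + (rs_new / rs) * p
            rs = rs_new
        return x

    def sparse_kernel_regression(X, y, m=30, gamma=1.0, lam=1e-3, seed=0):
        # fit f(x) = sum_j beta_j k(c_j, x) using only m centres c_j
        rng = np.random.default_rng(seed)
        C = X[rng.choice(len(X), size=m, replace=False)]
        Phi = rbf_kernel(X, C, gamma)                    # n x m design matrix
        A = Phi.T @ Phi + lam * rbf_kernel(C, C, gamma)  # m x m primal system
        return C, conjugate_gradient(A, Phi.T @ y)

    # usage on the same lagged sine data as in the previous sketch
    rng = np.random.default_rng(0)
    s = np.sin(0.1 * np.arange(500)) + 0.1 * rng.standard_normal(500)
    X = np.stack([s[i:i + 5] for i in range(495)])
    y = s[5:]
    C, beta = sparse_kernel_regression(X[:-1], y[:-1])
    print((rbf_kernel(X[-1:], C) @ beta)[0])  # sparse one-step forecast

Training then scales with m rather than n, which is the sense in which complexity and training time stay under clear control.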
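For the last chapter, a useful point of reference is the classical NORMA-style update for stochastic gradient descent in an RKHS with squared loss, sketched below with a simple 1/sqrt(t) decay; the dissertation's LSMD instead adapts the learning rate automatically via SMD, and ILK replaces the explicit gradient step with an implicit one, neither of which is reproduced here. Names and constants are illustrative:

    import numpy as np

    def rbf(a, b, gamma=1.0):
        return np.exp(-gamma * ((a - b) ** 2).sum())

    def online_kernel_sgd(stream, gamma=1.0, lam=0.01, eta0=0.5):
        # f_t(x) = sum_i alpha_i k(x_i, x); each step shrinks the old
        # coefficients (ridge penalty) and appends one new coefficient
        # (the stochastic gradient of the squared loss)
        centers, alphas = [], []
        for t, (x, y) in enumerate(stream, start=1):
            eta = eta0 / np.sqrt(t)  # fixed decay; SMD would adapt this online
            pred = sum(a * rbf(c, x, gamma) for c, a in zip(centers, alphas))
            alphas = [(1.0 - eta * lam) * a for a in alphas]
            centers.append(x)
            alphas.append(eta * (y - pred))
            yield pred

    # toy usage: predict a noisy sine one step ahead, then learn from the truth
    rng = np.random.default_rng(1)
    s = np.sin(0.1 * np.arange(300)) + 0.1 * rng.standard_normal(300)
    pairs = ((s[i:i + 5], s[i + 5]) for i in range(295))
    preds = np.array(list(online_kernel_sgd(pairs)))
    print(np.mean((preds[100:] - s[105:300]) ** 2))  # error after warm-up

An implicit update would instead let the new coefficient depend on the post-update prediction (for squared loss this shrinks it by a factor 1/(1 + eta * k(x, x))), which is consistent with the stability advantage of implicit updates noted above.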
Experiments on a real time series and two benchmarks corroborate the theoretical results and show that our algorithms outperform more primitive methods.

Building on previous work, we thus present several successful kernel methods for regression machines. The resulting algorithms are not limited to time series forecasting: they can be used in other settings, and the ideas can also be extended to non-kernel learning methods. Learning from data drives the development of statistics, and in turn the development of statistics provides more theoretical support for learning. We believe that, in some sense, the two are one, and we hope this work contributes to both.
Keywords/Search Tags: statistical learning theory, kernel machines, MO-SVR, DS, local learning, adaptive online learning