
Learning On Riemannian Manifold: Online Classification And Multi-Kernel Algorithms

Posted on: 2008-03-20
Degree: Doctor
Type: Dissertation
Country: China
Candidate: G B Ye
GTID: 1100360215484224
Subject: Applied Mathematics

Abstract/Summary:
With the development of modern science and technology, we are confronted with a mass of data every day. How to understand, process, and make use of these data has become an important research topic in modern science and technology. Machine learning is a subject that has developed rapidly since computers became widely used. Its main objective is to use empirical data to design effective algorithms and to give computers the ability to learn, that is, to let computers discover laws from empirical data. Learning theory is a branch of machine learning that seeks theoretical foundations for machine learning algorithms and uses them to design more effective algorithms. In this thesis we focus on two kinds of algorithms: online classification algorithms and multi-kernel algorithms. In particular, our analysis applies to the manifold learning setting (i.e. the study of function relations or structures of data lying on manifolds embedded in high-dimensional Euclidean spaces), and the Euclidean space setting is only a special case.

Firstly, we consider fully online learning algorithms for classification generated from Tikhonov regularization schemes associated with general convex loss functions and reproducing kernel Hilbert spaces. For such a fully online algorithm, the regularization parameter changes at each learning step, which makes it more feasible in practice. But this creates an essential difference from the partially online algorithm, which uses a fixed regularization parameter. We first present a novel approach to the drift error incurred by the change of the regularization parameter. Then we estimate the error of the learning process for strong approximation in the reproducing kernel Hilbert space. Finally, learning rates are derived from decays of the regularization error. The convexity of the loss function plays an important role in our analysis. Concrete learning rates are given for the hinge loss and the support vector machine q-norm loss. Our results give a solid theoretical basis for online algorithms; this is the first complete theoretical analysis of fully online classification algorithms beyond the least squares regression case. This work not only raises some theoretical problems in learning theory, but also provides theoretical guidance for designing more effective algorithms.
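To make the fully online scheme concrete, the following Python sketch performs one regularized subgradient step per example for the hinge loss, with a regularization parameter that changes at every step. The Gaussian kernel, the decays t^(-1/2) for lambda_t and t^(-3/4) for eta_t, and the class and function names are illustrative assumptions made only for this sketch, not the choices analyzed in the thesis.

    import numpy as np


    def gaussian_kernel(x, z, sigma=1.0):
        # Gaussian (RBF) kernel on points of R^n; sigma is an illustrative choice.
        x, z = np.asarray(x, dtype=float), np.asarray(z, dtype=float)
        return float(np.exp(-np.sum((x - z) ** 2) / (2.0 * sigma ** 2)))


    class FullyOnlineHingeClassifier:
        # Sketch of a fully online kernel classifier with the hinge loss.
        # f_t is stored as a kernel expansion f_t(x) = sum_i alpha_i K(x_i, x);
        # each step is one subgradient update of the Tikhonov-regularized
        # objective in which the regularization parameter lambda_t varies with t.

        def __init__(self, kernel=gaussian_kernel):
            self.kernel = kernel
            self.points = []   # inputs that entered the expansion
            self.coeffs = []   # their coefficients alpha_i

        def predict(self, x):
            return sum(a * self.kernel(p, x)
                       for a, p in zip(self.coeffs, self.points))

        def update(self, x, y, t):
            # t = 1, 2, ... is the step index; y is a label in {-1, +1}.
            # Illustrative polynomial decays; the admissible choices of
            # eta_t and lambda_t are what the thesis actually analyzes.
            lam = t ** (-0.5)
            eta = t ** (-0.75)
            margin = y * self.predict(x)
            # The penalty term lambda_t * ||f||^2 / 2 shrinks all coefficients.
            self.coeffs = [(1.0 - eta * lam) * a for a in self.coeffs]
            # Subgradient of the hinge loss max(0, 1 - y f(x)).
            if margin < 1.0:
                self.points.append(np.asarray(x, dtype=float))
                self.coeffs.append(eta * y)

A label is predicted as the sign of f_t(x); the drift error studied in the thesis is exactly the effect of lambda_t changing from one step to the next.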
Manifold learning has been a hot topic in learning theory since 2000. Its key assumption is that the function relations or structures of the data lie on manifolds embedded in high-dimensional Euclidean spaces. In general, the dimension of the manifold is much smaller than that of the ambient Euclidean space. There are hundreds of papers on manifold learning, but most of them are empirical: they try to show, through experiments on data, how the manifold structure affects the algorithms. The real scientific goal, however, should be to show that the manifold structure can greatly improve the learning rates and reduce the complexity of the algorithms. Until now, no more than ten papers have given a rigorous mathematical analysis targeting this goal. How to give a solid mathematical analysis of the influence of the manifold structure on algorithms remains an open, difficult, and valuable research topic.

In this thesis we study approximation and learning by Gaussians for functions defined on a d-dimensional connected compact C^∞ Riemannian submanifold X of R^n which is isometrically embedded. We show that convolution with the Gaussian kernel with variance σ provides the uniform approximation order O(σ^s) when the approximated function is Lipschitz of order s ∈ (0,1], and the order O(σ^2) in the L^p(X) norm when the approximated function lies in the Sobolev space H_p^2(X). The uniform normal neighborhoods of a compact Riemannian manifold play a central role in deriving these approximation orders. This approximation result is used to investigate the regression and classification learning algorithms generated by the multi-kernel regularization scheme associated with Gaussian kernels with flexible variances. When the manifold dimension d is smaller than the dimension n of the ambient Euclidean space, our learning rates are much better than those in the literature. By comparing approximation orders, we also show the essential difference between approximation schemes with flexible variances and those with a single variance, which confirms the efficiency of the multi-kernel algorithms in practice. Our mathematical analysis is one of the very few theoretical results in manifold learning and shows the essential effect of the dimension of the manifold on multi-kernel algorithms. It has great theoretical and practical prospects.
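In symbols, the approximation orders stated above can be summarized as follows, where K_σ denotes the Gaussian kernel with variance σ, K_σ * f its convolution with f on X (normalized as in the thesis), and the C(X) norm is the uniform norm; this is only a restatement of the claims above, not an additional result:

    \| K_\sigma * f - f \|_{C(X)} = O(\sigma^{s}), \qquad f \ \text{Lipschitz of order } s \in (0,1],

    \| K_\sigma * f - f \|_{L^{p}(X)} = O(\sigma^{2}), \qquad f \in H^{2}_{p}(X).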
Keywords/Search Tags:Learning theory, Error analysis, Reproducing kernel Hilbert spaces, Classification algorithm, Regression algorithm, Online learning, Gaussian kernels, Approximation, Riemannian manifolds