
Distributed Regression Learning With Dependent Samples

Posted on: 2021-02-05  Degree: Master  Type: Thesis
Country: China  Candidate: X Q Zheng  Full Text: PDF
GTID: 2370330605460667  Subject: Applied Mathematics
Abstract/Summary:
Big data learning is growing rapidly, and data are acquired and updated at an ever-increasing pace, so processing data efficiently has become a central research topic in learning theory. Distributed learning, with its parallel computation and privacy protection, offers an efficient and convenient approach to big data learning. Its basic idea is to divide a large data set {z_i}_{i=1}^N into m disjoint parts, of equal or unequal size, store each part on one of m data-processing units, analyze the parts in parallel, and then aggregate the results.

Building on these advantages, this thesis studies, from the standpoint of algorithmic theory, a distributed algorithm for dependent samples and a regularized least-squares algorithm for streaming data. For the distributed algorithm with dependent samples, we use the integral-operator method and a new error decomposition to derive error bounds and asymptotic learning rates when the sample sequence satisfies an α-mixing condition and the α-mixing coefficients decay polynomially. For the regularized least-squares algorithm on block-wise data, we use the leave-one-out technique and the integral-operator method to show that, when the sample set is split into equal blocks or into blocks whose sizes grow polynomially, the optimal learning rate can be achieved by adaptively adjusting the regularization parameters.

The main contents of this thesis are as follows. Chapter 1 reviews the development and theoretical framework of statistical learning theory. Chapter 2 first surveys the theory and current research on regularization algorithms, including the regularized least-squares algorithm and coefficient-based regularization, and then surveys the theory and current research on distributed learning. Chapter 3 introduces the first algorithm studied in this thesis, a distributed algorithm under α-mixing conditions; using the integral-operator method, an error decomposition, and the basic properties of α-mixing, we obtain the algorithm's error bounds and learning rates. Chapter 4 introduces the second algorithm, kernel ridge regression on block-wise data; we prove that the optimal learning rate can be obtained by adjusting the regularization parameters adaptively when the blocks are equal-sized or polynomially increasing. Chapter 5 summarizes the main results and outlines plans for future work.
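The divide-and-conquer scheme described above can be sketched in a few lines. This is a minimal illustration, not the thesis's actual algorithm or analysis: it assumes a Gaussian kernel, a fixed regularization parameter `lam`, and simple averaging of the local kernel ridge regression estimators; the function names (`local_krr`, `distributed_krr`) are invented for the example.

```python
import numpy as np

def gaussian_kernel(X, Y, sigma=1.0):
    # Gram matrix K[i, j] = exp(-||x_i - y_j||^2 / (2 sigma^2))
    d2 = np.sum(X**2, axis=1)[:, None] + np.sum(Y**2, axis=1)[None, :] - 2 * X @ Y.T
    return np.exp(-d2 / (2 * sigma**2))

def local_krr(X, y, lam, sigma=1.0):
    # Solve (K + lam * n * I) alpha = y on one processing unit's data block
    n = X.shape[0]
    K = gaussian_kernel(X, X, sigma)
    alpha = np.linalg.solve(K + lam * n * np.eye(n), y)
    return lambda Xnew: gaussian_kernel(Xnew, X, sigma) @ alpha

def distributed_krr(X, y, m, lam, sigma=1.0):
    # Split the N samples into m disjoint blocks, fit KRR locally on each,
    # then aggregate by averaging the m local predictors.
    blocks = np.array_split(np.arange(X.shape[0]), m)
    predictors = [local_krr(X[b], y[b], lam, sigma) for b in blocks]
    return lambda Xnew: np.mean([f(Xnew) for f in predictors], axis=0)
```

Averaging is the simplest aggregation rule; each unit only ever sees its own block, which is also where the privacy-protection advantage mentioned above comes from.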
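The block-wise streaming setting of Chapter 4 can likewise be sketched. The schedule below is an assumption made for illustration only: each arriving block is fitted with a regularization parameter tied to the cumulative sample size N_t (here lambda_t = N_t^(-theta)), so regularization weakens as data accumulate, in the spirit of the "adaptive under-regularization" named in the keywords; the exact schedule in the thesis may differ.

```python
import numpy as np

def rbf(X, Y, sigma=0.5):
    # Gaussian kernel Gram matrix between the rows of X and Y
    d2 = np.sum(X**2, axis=1)[:, None] + np.sum(Y**2, axis=1)[None, :] - 2 * X @ Y.T
    return np.exp(-d2 / (2 * sigma**2))

def streaming_krr(blocks, theta=1.0, sigma=0.5):
    # Fit one kernel ridge regressor per arriving block, choosing the
    # regularization parameter from the cumulative sample size seen so far
    # (lambda_t = total_n ** -theta, an illustrative adaptive schedule),
    # and predict with the average of the local estimators.
    fitted, total_n = [], 0
    for Xb, yb in blocks:
        total_n += Xb.shape[0]
        lam = total_n ** (-theta)  # shrinks as more data arrive
        n = Xb.shape[0]
        K = rbf(Xb, Xb, sigma)
        alpha = np.linalg.solve(K + lam * n * np.eye(n), yb)
        fitted.append((Xb, alpha))
    def predict(Xnew):
        return np.mean([rbf(Xnew, Xb, sigma) @ a for Xb, a in fitted], axis=0)
    return predict
```

Note that nothing here requires the blocks to be equal-sized; the polynomially increasing block sizes considered in the thesis fit the same loop, since only the running total N_t enters the schedule.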
Keywords/Search Tags:kernel ridge regression, distributed learning, dependent sampling, block-wise streaming data, adaptive under-regularization